
Samtools

Samtools is the standard toolkit for working with sequence alignment data. Liatir uses the flagstat subcommand to produce a quick summary of alignment statistics from BAM, SAM, or CRAM files.

Details

| Property | Value |
| --- | --- |
| Type | Native tool |
| Binary | samtools |
| Subcommand | flagstat |

Installation

Samtools must be installed and available in your system PATH.

Homebrew:

```bash
brew install samtools
```

APT (Debian, Ubuntu):

```bash
sudo apt install samtools
```

Conda (Bioconda channel):

```bash
conda install -c bioconda samtools
```

Liatir checks for samtools on page load. If it is not found, the dependency card shows these same instructions.
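To run the same kind of check yourself from a terminal, a minimal sketch follows. The helper name `tool_available` is hypothetical, and this is not necessarily how Liatir performs its own check; it only tests whether a binary can be resolved on PATH.

```bash
# Return 0 if the named executable can be found on PATH, non-zero otherwise.
# tool_available is an illustrative helper, not part of Liatir or samtools.
tool_available() {
    command -v "$1" >/dev/null 2>&1
}

if tool_available samtools; then
    echo "samtools found at: $(command -v samtools)"
else
    echo "samtools not found on PATH; install it with one of the commands above"
fi
```

If the binary is found but runs fail, `samtools --version` is a quick way to confirm which build is being picked up.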

Accepted inputs

| Extension | Description |
| --- | --- |
| .bam | Binary alignment map (the standard format) |
| .sam | Text alignment map |
| .cram | Compressed, reference-based alignment format |

BAM is the most common format. SAM files are uncompressed text, so they are large and uncommon in practice; most pipelines write BAM directly.
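As an illustration of the extension rule, the sketch below maps a file name to one of the three accepted formats. The function name `alignment_format` is hypothetical, and the match is assumed to be on lower-case extensions only; Liatir's own validation may differ.

```bash
# Print the alignment format implied by a file name's extension.
# Returns non-zero for anything that is not .bam, .sam, or .cram.
# Assumption: extensions are lower-case; this is not Liatir's actual validator.
alignment_format() {
    case "$1" in
        *.bam)  echo "BAM" ;;
        *.sam)  echo "SAM" ;;
        *.cram) echo "CRAM" ;;
        *)      return 1 ;;
    esac
}

alignment_format "sample.bam"
alignment_format "reads.fastq" || echo "reads.fastq is not an accepted input"
```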

BAM indexing

samtools flagstat reads the entire file sequentially, so it does not need a .bai index. Other samtools operations, such as region queries with samtools view (for example, samtools view in.bam chr1:1000-2000), do require an index.
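The difference can be sketched on the command line. The helper `has_bam_index` below is hypothetical; it only looks for the index file names that samtools conventionally writes next to a BAM (`in.bam.bai`, `in.bai`, or `in.bam.csi`). The samtools commands are shown as comments because they need a real file, and `sample.bam` is a placeholder name.

```bash
# Return 0 if an index file sits next to the given BAM, non-zero otherwise.
# Illustrative helper; checks in.bam.bai, in.bai and in.bam.csi only.
has_bam_index() {
    local bam="$1"
    [ -f "${bam}.bai" ] || [ -f "${bam%.bam}.bai" ] || [ -f "${bam}.csi" ]
}

# Typical usage (sample.bam is a placeholder):
#   samtools flagstat sample.bam                            # works with or without an index
#   has_bam_index sample.bam || samtools index sample.bam   # build the index if missing
#   samtools view sample.bam chr1:1000-2000                 # region query: index required
```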

Running Samtools flagstat

  1. Navigate to Tools → Samtools.
  2. Select a BAM, SAM, or CRAM file from your Data library.
  3. Click Run.

The run spawns samtools flagstat <path> and captures stdout. Results are parsed and stored in run history.
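For readers who want to reproduce the capture-and-parse step outside the app, here is a rough sketch. It assumes the default text output of samtools flagstat, where each line starts with `<QC-passed> + <QC-failed>` followed by a label. The function `parse_flagstat` is hypothetical, and the sample counts are made up for illustration; Liatir's internal parser is not shown in this page and may work differently.

```bash
# Convert flagstat text on stdin into "label<TAB>qc_passed<TAB>qc_failed" rows.
# Illustrative only; not Liatir's parser.
parse_flagstat() {
    awk '
        $2 == "+" && $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            label = $0
            sub(/^[0-9]+ \+ [0-9]+ /, "", label)   # drop the two leading counts
            sub(/ \([QN0-9].*$/, "", label)        # drop "(QC-passed ...)" and "(95.00% : ...)" tails
            printf "%s\t%s\t%s\n", label, $1, $3
        }'
}

# Made-up sample lines in the flagstat text layout.
printf '%s\n' \
    '1000 + 20 in total (QC-passed reads + QC-failed reads)' \
    '950 + 5 mapped (95.00% : 25.00%)' \
    | parse_flagstat

# With a real file (placeholder path):
#   samtools flagstat sample.bam | parse_flagstat
```

Recent samtools releases can also emit machine-readable output via an `-O` option (for example `-O json`); check `samtools flagstat --help` on your installed version before relying on it.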

Output metrics

samtools flagstat counts records in each category defined by the SAM bitwise FLAG field. Each count is reported as two numbers separated by a plus sign: records that pass QC filters first, then records that fail them (FLAG bit 0x200 set).

| Metric | FLAG bits | Description |
| --- | --- | --- |
| Total reads | | All records in the file (QC-passed + QC-failed) |
| Mapped | 0x4 unset | Reads with at least one reported alignment |
| Mapped % | | Mapped / total × 100 |
| Duplicates | 0x400 | Reads marked as optical or PCR duplicate |
| Properly paired | 0x2 | Both mates aligned, within expected distance and orientation |
| Secondary | 0x100 | Alternative alignments for multi-mapping reads |
| Supplementary | 0x800 | Chimeric alignments (e.g., from structural variant evidence) |
| Paired in sequencing | 0x1 | Reads from paired-end library |
| Read 1 | 0x40 | First mate in a pair |
| Read 2 | 0x80 | Second mate in a pair |
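To make the FLAG bits concrete, the sketch below tests a single FLAG value against the masks listed above. Both function names are hypothetical. It is a simplification: samtools applies extra conditions when counting (for instance, pairing categories are only counted for paired records), so treat this as a way to read one FLAG value rather than a reimplementation of flagstat.

```bash
# Return 0 when every bit of MASK ($2) is set in FLAG ($1).
flag_has() {
    [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

# Print the categories from the table that one FLAG value falls into.
# Simplified sketch; not the exact counting logic used by samtools.
describe_flag() {
    local flag="$1"
    if ! flag_has "$flag" 0x4; then echo "mapped"; fi
    if flag_has "$flag" 0x1;   then echo "paired in sequencing"; fi
    if flag_has "$flag" 0x2;   then echo "properly paired"; fi
    if flag_has "$flag" 0x40;  then echo "read 1"; fi
    if flag_has "$flag" 0x80;  then echo "read 2"; fi
    if flag_has "$flag" 0x100; then echo "secondary"; fi
    if flag_has "$flag" 0x400; then echo "duplicate"; fi
    if flag_has "$flag" 0x800; then echo "supplementary"; fi
}

# FLAG 99 = 0x1 + 0x2 + 0x20 + 0x40
describe_flag 99
```

For FLAG 99 this prints mapped, paired in sequencing, properly paired, and read 1, one per line. Bit 0x20 (mate on the reverse strand) is set in 99 but is not one of the categories flagstat reports.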

Interpreting results

| Metric | Good range | Concern |
| --- | --- | --- |
| Mapped % | ≥ 95% (WGS human) | < 80% may indicate wrong reference, contamination, or damaged library |
| Properly paired % | Close to mapping rate | Large gap may indicate structural rearrangements or library quality issues |
| Duplicate % | < 20% (WGS) | > 40% suggests over-amplification; may bias variant calling and coverage |
| Secondary | Low (< 5%) | High secondary counts mean many reads map to multiple locations |
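The two hard thresholds in the table (mapped below 80%, duplicates above 40%) can be applied to raw counts with a few lines of shell. This is a sketch only: `pct` and `flag_concerns` are hypothetical helpers, the percentages use total records as the denominator as in the Mapped % definition above, and the cut-offs are the ones quoted in this table rather than universal limits.

```bash
# pct PART TOTAL -> percentage with two decimals (0.00 when TOTAL is 0).
pct() {
    awk -v p="$1" -v t="$2" 'BEGIN { printf "%.2f\n", (t > 0 ? 100 * p / t : 0) }'
}

# flag_concerns TOTAL MAPPED DUPLICATES -> one line per threshold crossed.
# Thresholds are taken from the table above; adjust them for your assay.
flag_concerns() {
    awk -v t="$1" -v m="$2" -v d="$3" 'BEGIN {
        if (t == 0) { print "no records"; exit }
        mp = 100 * m / t
        dp = 100 * d / t
        if (mp < 80) printf "mapped %.2f%% is below 80%%\n", mp
        if (dp > 40) printf "duplicates %.2f%% is above 40%%\n", dp
    }'
}

pct 950 1000                  # mapped percentage for made-up counts
flag_concerns 1000 700 450    # made-up counts that cross both thresholds
```

A run that prints nothing from `flag_concerns` has not crossed either threshold, which is not the same as being in the "good range" column; values between the two columns still deserve a look.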

Duplicate rate context

Duplicate percentages are expected to be higher for:

  • Amplicon sequencing (by design)
  • Very low-input libraries
  • High-depth targeted panels

For whole-genome or whole-exome, 10–25% is typical. Above 40% is worth investigating.
