
BCFtools

BCFtools is the standard toolkit for manipulating VCF and BCF variant files. Liatir uses the stats subcommand to compute summary statistics from a variant callset.

Details

| Property | Value |
| --- | --- |
| Type | Native tool |
| Binary | bcftools |
| Subcommand | stats |

Installation

BCFtools must be installed and available in your system PATH.

```bash
# macOS (Homebrew)
brew install bcftools
# Debian/Ubuntu
sudo apt install bcftools
# Conda (Bioconda channel)
conda install -c bioconda bcftools
```

Accepted inputs

| Extension | Description |
| --- | --- |
| .vcf | Uncompressed VCF |
| .vcf.gz | Bgzip-compressed VCF (requires .tbi index for random access; not needed for stats) |
| .bcf | Binary call format |
| .bcf.gz | Compressed BCF |

VCF vs BCF

BCF is the binary equivalent of VCF, roughly analogous to BAM vs SAM. BCF files are faster to parse and smaller on disk but require BCFtools or htslib to inspect. For bcftools stats, either format works identically.

Running BCFtools stats

  1. Navigate to Tools → BCFtools.
  2. Select a VCF or BCF file from your Data library.
  3. Click Run.

The run executes bcftools stats <path> and parses the structured output sections.
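To inspect the same structured output by hand, the sketch below pulls the summary-number lines out of a saved report. It assumes the report uses the tab-separated layout bcftools stats writes for its SN section (SN, id, key, value). The function name summary_numbers and the file names are illustrative and not part of Liatir.

```shell
# Print "key<TAB>value" pairs from the SN (summary numbers) section
# of a saved bcftools stats report. $1 is the path to the report.
summary_numbers() {
    # SN lines are tab-separated: SN, id, "key:", value.
    awk -F '\t' '$1 == "SN" {
        sub(/:$/, "", $3)      # drop the trailing colon from the key
        print $3 "\t" $4
    }' "$1"
}

# Typical use (placeholder file names):
#   bcftools stats calls.vcf.gz > calls.stats.txt
#   summary_numbers calls.stats.txt
```

Saving the report to a file first keeps the original output available for the other sections (such as TSTV) without re-running the tool.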

Output metrics

Summary counts

| Metric | Description |
| --- | --- |
| Total records | All variant records in the file |
| SNPs | Single-nucleotide polymorphisms (REF and ALT are both single bases) |
| Indels | Insertions and deletions |
| MNPs | Multi-nucleotide polymorphisms |
| Other | Complex variants not in the above categories |
| Multiallelic sites | Sites with more than one ALT allele |

Transitions and transversions

Transitions (Ts) are substitutions between chemically similar bases: A↔G (purines) and C↔T (pyrimidines). Transversions (Tv) are substitutions between dissimilar bases: A↔C, A↔T, G↔C, G↔T.
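The definition above can be written as a small lookup. This is a minimal sketch for single-base, uppercase REF/ALT pairs only. The function name tstv_class is hypothetical, and this is not how bcftools itself classifies records.

```shell
# Classify a single-base substitution as a transition ("ts")
# or a transversion ("tv"). $1 is REF, $2 is ALT (uppercase).
tstv_class() {
    case "$1$2" in
        AG|GA|CT|TC)             echo ts ;;     # purine<->purine or pyrimidine<->pyrimidine
        AC|CA|AT|TA|GC|CG|GT|TG) echo tv ;;     # purine<->pyrimidine
        *)                       echo other ;;  # not a single-base substitution
    esac
}

# Example: tstv_class A G   prints "ts"
#          tstv_class A C   prints "tv"
```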

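There are twice as many possible transversions as transitions, so random errors would produce a Ts/Tv near 0.5. Real mutations favour transitions, which pushes the ratio in genuine callsets well above that value and makes Ts/Tv a standard quality indicator: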

| Ts/Tv ratio | Context |
| --- | --- |
| ≥ 2.8 | Expected for whole-exome sequencing |
| ~2.0–2.1 | Expected for whole-genome sequencing |
| ≥ 1.8 | Generally acceptable for WGS |
| < 1.8 | May indicate false positives or low-quality calls |

Low Ts/Tv

A Ts/Tv below 1.8 often signals a high false-positive rate in the callset. This typically appears when variant caller quality thresholds are too permissive, or when the sample has very low coverage. Consider tightening QUAL or FILTER thresholds before downstream analysis.
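A quick way to apply this check outside the app is to read the TSTV line of a saved report. The sketch below assumes the usual bcftools stats column order for that line (TSTV, id, ts, tv, ...) and recomputes the ratio from the raw counts. The function name check_tstv is hypothetical, and the default cutoff of 1.8 is taken from the guidance above.

```shell
# Print the overall Ts/Tv ratio from a saved bcftools stats report ($1)
# and flag it as LOW when it falls below a cutoff ($2, default 1.8).
check_tstv() {
    awk -F '\t' -v cutoff="${2:-1.8}" '
        $1 == "TSTV" {
            # Assumed columns: 3 = transitions, 4 = transversions.
            if ($4 + 0 == 0) { print "NA\tNA"; next }   # avoid division by zero
            ratio = $3 / $4
            if (ratio < cutoff) status = "LOW"; else status = "OK"
            printf "%.2f\t%s\n", ratio, status
        }' "$1"
}

# Typical use (placeholder file name):
#   check_tstv calls.stats.txt        # WGS-style cutoff of 1.8
#   check_tstv calls.stats.txt 2.8    # stricter cutoff, e.g. for exomes
```

Passing the cutoff as an argument keeps the same function usable for both whole-genome and whole-exome expectations.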

Per-sample statistics (multi-sample VCF)

When the input VCF contains multiple samples, bcftools stats reports per-sample counts:

| Metric | Description |
| --- | --- |
| Homozygous ref | Sites where the sample is homozygous reference |
| Homozygous alt | Sites where the sample is homozygous alternate |
| Heterozygous | Sites where the sample is heterozygous |
| Ts/Tv (per sample) | Per-sample transition-to-transversion ratio |

High variability in per-sample Ts/Tv within the same cohort may indicate technical differences between samples (e.g., different sequencing batches or coverage levels).
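To compare samples side by side, the per-sample ratio can be derived from the PSC (per-sample counts) lines of a saved report. This sketch rests on two assumptions: that the report contains PSC lines (when running bcftools by hand, that usually means passing -s - to include all samples), and that column 3 holds the sample name with transitions and transversions in columns 7 and 8. Check the "# PSC" header line of your own report to confirm the layout. The function name per_sample_tstv is hypothetical.

```shell
# Print "sample<TAB>Ts/Tv" for every PSC line in a saved
# bcftools stats report ($1).
per_sample_tstv() {
    awk -F '\t' '
        # Assumed columns: 3 = sample, 7 = nTransitions, 8 = nTransversions.
        # Samples with zero transversions are skipped.
        $1 == "PSC" && $8 > 0 {
            printf "%s\t%.2f\n", $3, $7 / $8
        }' "$1"
}

# Typical use (placeholder file names):
#   bcftools stats -s - cohort.vcf.gz > cohort.stats.txt
#   per_sample_tstv cohort.stats.txt | sort -k2,2n
```

Sorting by the ratio puts outlying samples at the top or bottom of the list, which makes batch effects easier to spot.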
