Welcome to VariantCentrifuge!
VariantCentrifuge is a Python-based command-line tool designed to filter, extract, and refine genetic variant data (VCF files) based on genes of interest, rarity criteria, and impact annotations. Built with modularity and extensibility in mind, VariantCentrifuge replaces the complexity of traditional Bash/R pipelines with a cleaner, more maintainable Python codebase.
Key Features
Gene-Centric Filtering: Extract variants from regions defined by genes of interest
Rare Variant Identification: Apply custom filters to isolate rare and moderate/high-impact variants
Flexible Field Extraction: Easily specify which fields to extract from the VCF
Genotype Replacement: Replace genotype fields with corresponding sample IDs
Phenotype Integration: Integrate phenotype data for enhanced variant analysis
Gene List Annotation: Annotate variant outputs with gene membership information
Comprehensive Analysis: Perform gene burden analyses and variant-level statistics
Rich Reporting: Generate interactive HTML reports with IGV.js integration
Quick Start
# Basic usage
variantcentrifuge \
    --gene-name BICC1 \
    --vcf-file input.vcf.gz \
    --output-file output.tsv

# With custom filters and Excel output
variantcentrifuge \
    --gene-name BICC1 \
    --vcf-file input.vcf.gz \
    --filters "((dbNSFP_gnomAD_exomes_AC[0] <= 2) & (ANN[ANY].IMPACT has 'HIGH'))" \
    --xlsx
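When the tool is driven from a script rather than typed by hand, the invocations above can be assembled as an argument list, which avoids shell-quoting problems with the filter expression. The following is a minimal sketch that uses only the flags shown in the Quick Start; the helper name `build_command` is hypothetical and is not part of VariantCentrifuge.

```python
import shlex
from typing import List, Optional


def build_command(
    gene: str,
    vcf: str,
    output: Optional[str] = None,
    filters: Optional[str] = None,
    xlsx: bool = False,
) -> List[str]:
    """Assemble a variantcentrifuge argument list (hypothetical helper).

    Only flags that appear in the Quick Start examples are used here.
    """
    cmd = ["variantcentrifuge", "--gene-name", gene, "--vcf-file", vcf]
    if output is not None:
        cmd += ["--output-file", output]
    if filters is not None:
        # Passed as a single argument, so no extra shell quoting is needed.
        cmd += ["--filters", filters]
    if xlsx:
        cmd.append("--xlsx")
    return cmd


if __name__ == "__main__":
    cmd = build_command(
        "BICC1",
        "input.vcf.gz",
        filters="((dbNSFP_gnomAD_exomes_AC[0] <= 2) & (ANN[ANY].IMPACT has 'HIGH'))",
        xlsx=True,
    )
    # Print a shell-safe rendering; to execute it, pass the list to
    # subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run` keeps the filter string intact as one argument, whereas a hand-built shell string would need its nested quotes escaped.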
Documentation Contents
User Guide
API Reference
External Dependencies
VariantCentrifuge requires these bioinformatics tools to be installed and available in your PATH:
bcftools - VCF manipulation
snpEff - Functional annotation and BED file generation
SnpSift - Variant filtering and field extraction
bedtools - BED file operations
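A quick way to confirm that these tools are reachable before running an analysis is to look each one up on the PATH. The sketch below assumes the executables are named exactly as listed above (`bcftools`, `snpEff`, `SnpSift`, `bedtools`); some installations expose snpEff and SnpSift only as JAR files, in which case the names would need adjusting. The function names are illustrative and not part of VariantCentrifuge.

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Executable names as listed in this section (assumed, may differ per install).
REQUIRED_TOOLS = ("bcftools", "snpEff", "SnpSift", "bedtools")


def find_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the names of tools that could not be found on PATH."""
    return [tool for tool, path in find_tools(tools).items() if path is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All external dependencies were found.")
```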
Getting Help
Issues: Report bugs and request features on GitHub Issues
Discussions: Join the conversation by creating an issue
Documentation: Browse the complete documentation on this site
License
This project is licensed under the MIT License.