Installation
Dependencies
External Tools
VariantCentrifuge requires several bioinformatics tools to be installed and available in your PATH:
bcftools - For variant extraction and manipulation
snpEff - For generating gene BED files and functional annotations
SnpSift - For filtering and field extraction
bedtools (specifically sortBed) - For sorting BED files
Installing via mamba/conda
mamba create -y -n annotation bcftools snpsift snpeff bedtools
mamba activate annotation
Ensure these tools are in your PATH before running VariantCentrifuge.
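The PATH requirement above can be checked programmatically. The following is a minimal sketch, not part of VariantCentrifuge itself; the executable names in `REQUIRED_TOOLS` are assumptions based on typical conda installs and may differ on your system.

```python
import shutil

# Executable names are assumptions based on typical conda installs;
# adjust them if your installation uses different wrapper names.
REQUIRED_TOOLS = ["bcftools", "snpEff", "SnpSift", "bedtools", "sortBed"]


def missing_tools(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools were found on PATH.")
```

Run the script inside the activated environment, since `shutil.which` only sees the PATH of the current shell session.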
Python Requirements
VariantCentrifuge requires Python 3.7+ and the following Python packages:
pandas - For XLSX conversion and data handling
jinja2 - For HTML template rendering
openpyxl - For XLSX creation
scipy - For Fisher exact test in variant analysis
statsmodels - For multiple testing correction in gene burden analysis
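To confirm that the Python requirements are met before installing, a short check along these lines may help. It is a sketch only: the helper names `python_is_supported` and `missing_packages` are hypothetical, and the check tests only whether each package can be located, not its version.

```python
import importlib.util
import sys

# Import names for the packages listed above; for these five packages the
# import name is the same as the package name.
REQUIRED_PACKAGES = ["pandas", "jinja2", "openpyxl", "scipy", "statsmodels"]
MIN_PYTHON = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_packages(names):
    """Return the package names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.7 or newer is required.")
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```

When installing with pip, these packages are normally pulled in as dependencies, so the check is mainly useful for diagnosing a broken environment.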
Installation Methods
Method 1: Install from PyPI (Recommended)
pip install variantcentrifuge
Method 2: Install from Source
Clone the repository:
git clone https://github.com/scholl-lab/variantcentrifuge.git
cd variantcentrifuge
Set up a virtual environment (recommended):
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
Install the package:
pip install .
For development (editable install):
pip install -e .
Method 3: Using conda environment
# Clone the repository
git clone https://github.com/scholl-lab/variantcentrifuge.git
cd variantcentrifuge
# Create and activate conda environment
mamba env create -f conda/environment.yml
mamba activate annotation
# Install the package
pip install -e .
Method 4: Docker (Recommended for Quick Setup)
The Docker image bundles all external tools (bcftools, snpEff, SnpSift, bedtools) and Python dependencies into a single container. No local installation of bioinformatics tools is required.
# Pull the latest image from GitHub Container Registry
docker pull ghcr.io/scholl-lab/variantcentrifuge:latest
# Verify the installation
docker run --rm ghcr.io/scholl-lab/variantcentrifuge:latest --version
Place your VCF files in a local directory and mount it into the container:
docker run --rm -v ./data:/data \
ghcr.io/scholl-lab/variantcentrifuge:latest \
--gene-name BRCA1 \
--vcf-file /data/input.vcf.gz \
--output-file /data/output.tsv
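If you launch the container from a script, the same command can be assembled as an argument list. The sketch below is illustrative only: `docker_run_command` is a hypothetical helper, and it resolves the host directory to an absolute path on the assumption that this is the most portable form for the bind mount. The image name and flags are the ones shown above.

```python
import shlex
from pathlib import Path

IMAGE = "ghcr.io/scholl-lab/variantcentrifuge:latest"


def docker_run_command(data_dir, gene_name, vcf_name, output_name):
    """Build the docker argument list; files are addressed under /data in the container."""
    host_dir = Path(data_dir).resolve()  # absolute host path for the bind mount
    return [
        "docker", "run", "--rm",
        "-v", f"{host_dir}:/data",
        IMAGE,
        "--gene-name", gene_name,
        "--vcf-file", f"/data/{vcf_name}",
        "--output-file", f"/data/{output_name}",
    ]


if __name__ == "__main__":
    command = docker_run_command("./data", "BRCA1", "input.vcf.gz", "output.tsv")
    # Print a shell-quoted version for inspection; pass the list itself
    # to subprocess.run(command) to execute it.
    print(" ".join(shlex.quote(part) for part in command))
```

Because the output path is inside the mounted directory, the result file appears in the local data directory after the container exits.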
A docker-compose.yml is included in the repository for convenience:
services:
  variantcentrifuge:
    image: ghcr.io/scholl-lab/variantcentrifuge:latest
    volumes:
      - ./data:/data
      # Mount snpEff databases (download once, reuse)
      # - /path/to/snpeff_data:/snpeff_data:ro
      # Override built-in scoring configs
      # - ./my_scoring:/app/scoring:ro
docker compose run --rm variantcentrifuge \
--gene-name BRCA1 --vcf-file /data/input.vcf.gz --output-file /data/output.tsv
The image runs as a non-root user, includes built-in scoring models at /app/scoring/, and is signed with cosign for supply chain security.
Verification
Verify your installation by running:
variantcentrifuge --version
And check that external tools are available:
bcftools --version
snpEff -version
java -jar "$SNPSIFT_JAR"
bedtools --version
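The version checks above can also be collected in one pass. This is a rough sketch under the assumption that the tools are invoked by the same command names as shown; `first_output_line` is a hypothetical helper, and SnpSift is left out because its invocation depends on the `SNPSIFT_JAR` variable.

```python
import subprocess


def first_output_line(command):
    """Run a command and return its first non-empty output line.

    Returns None if the executable cannot be started (for example, because
    it is not on PATH) and an empty string if it prints nothing.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some tools print versions to stderr
            text=True,
            timeout=60,
        )
    except OSError:
        return None
    for line in result.stdout.splitlines():
        if line.strip():
            return line.strip()
    return ""


if __name__ == "__main__":
    checks = {
        "bcftools": ["bcftools", "--version"],
        "snpEff": ["snpEff", "-version"],
        "bedtools": ["bedtools", "--version"],
    }
    for name, command in checks.items():
        line = first_output_line(command)
        print(f"{name}: {line if line is not None else 'NOT FOUND'}")
```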
Troubleshooting
Common Issues
External tools not found:
Ensure all external tools are installed and in your PATH
For conda installations, activate the correct environment
Permission errors:
Use the --user flag with pip: pip install --user variantcentrifuge
Or use a virtual environment
Version conflicts:
Use a fresh virtual environment or conda environment
Update pip:
pip install --upgrade pip
Environment Variables
You may need to set environment variables for some tools:
export SNPEFF_JAR=/path/to/snpEff.jar
export SNPSIFT_JAR=/path/to/SnpSift.jar
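A quick way to confirm that these variables are set and point to real files is sketched below. The helper name `jar_problems` is hypothetical, and the check assumes both variables are needed in your setup, which may not be the case if the tools are available as wrapper commands.

```python
import os


def jar_problems(env, names=("SNPEFF_JAR", "SNPSIFT_JAR")):
    """Map each problematic variable name to a short description.

    An empty dict means every variable is set and points to an existing file.
    """
    problems = {}
    for name in names:
        path = env.get(name)
        if not path:
            problems[name] = "not set"
        elif not os.path.isfile(path):
            problems[name] = "does not point to an existing file"
    return problems


if __name__ == "__main__":
    found = jar_problems(os.environ)
    for name, problem in found.items():
        print(f"{name}: {problem}")
    if not found:
        print("SNPEFF_JAR and SNPSIFT_JAR both point to existing files.")
```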
Next Steps
Once installed, proceed to the Usage Guide to learn how to configure and run VariantCentrifuge.