Gene BED Module
Gene coordinate processing and BED file generation
Gene BED extraction and gene normalization module.
This module provides:
- normalize_genes: For normalizing gene inputs (single gene, multiple genes, or file).
- get_gene_bed: For generating a BED file corresponding to specified genes via snpEff genes2bed.
- variantcentrifuge.gene_bed.normalize_genes(gene_name_str, gene_file_str, logger)
Normalize genes from either a single gene name, a list of genes, or a file.
If “all” is provided, or if no genes remain after filtering, returns “all”.
- Parameters:
gene_name_str (str or None) – The gene name(s) provided via the CLI: a single gene, or several genes separated by spaces or commas.
gene_file_str (str or None) – Path to a file containing gene names, one per line.
logger (logging.Logger) – Logger instance for logging messages.
- Returns:
A normalized, space-separated string of gene names, or “all”.
- Return type:
str
- Raises:
SystemExit – If no gene name or file is provided, or if the specified file does not exist.
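The normalization described above can be sketched roughly as follows. This is a minimal illustration and not the library's implementation: the helper name `normalize_gene_string` is hypothetical, it handles only the string input (file reading, logging, and the SystemExit behavior are omitted), and dropping duplicate names is an assumption rather than documented behavior.

```python
import re


def normalize_gene_string(gene_name_str):
    """Hypothetical sketch: turn a CLI gene string into a normalized form.

    Accepts names separated by commas and/or whitespace and returns a
    space-separated string, or "all" if nothing usable remains.
    """
    # Split on any run of commas or whitespace and drop empty tokens.
    tokens = [t for t in re.split(r"[,\s]+", gene_name_str.strip()) if t]

    # "all" (or an empty result after filtering) selects every gene.
    if not tokens or any(t.lower() == "all" for t in tokens):
        return "all"

    # Assumption: duplicates are dropped while keeping first-seen order.
    unique = list(dict.fromkeys(tokens))
    return " ".join(unique)
```

For example, `normalize_gene_string("BRCA1, TP53,BRCA1")` yields a single space-separated string containing each gene once, which matches the form that get_gene_bed expects for its gene_name argument.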
- variantcentrifuge.gene_bed.get_gene_bed(reference, gene_name, interval_expand=0, add_chr=True, output_dir='output')
Generate a BED file for the given gene(s) using snpEff genes2bed.
If gene_name == “all”, the command runs without specifying genes. If multiple genes are provided, they are passed as arguments.
- Parameters:
reference (str) – The reference genome name compatible with snpEff.
gene_name (str) – “all” or space-separated list of gene names.
interval_expand (int, optional) – Number of bases to expand upstream/downstream of the gene regions. Default is 0.
add_chr (bool, optional) – Whether to add a ‘chr’ prefix to chromosome names in the BED file. Default is True.
output_dir (str, optional) – Directory to store cached BED files. Default is “output”.
- Returns:
Path to the final BED file.
- Return type:
str
- Raises:
subprocess.CalledProcessError – If the snpEff genes2bed or sorting command fails.
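As a rough illustration of how the parameters above might map onto a snpEff invocation, the sketch below builds the argument list and applies the ‘chr’ prefix to a BED line. Both helper names (`build_genes2bed_command`, `add_chr_prefix`) are hypothetical and are not part of variantcentrifuge. The use of the `-ud` option for interval expansion and the bare `snpEff` executable name are assumptions about the local snpEff installation. Caching, sorting, and error handling are left out.

```python
def build_genes2bed_command(reference, gene_name, interval_expand=0):
    """Hypothetical sketch: assemble a snpEff genes2bed argument list."""
    cmd = ["snpEff", "genes2bed"]

    if interval_expand > 0:
        # Assumption: genes2bed accepts -ud <num> to expand each gene
        # interval upstream and downstream by <num> bases.
        cmd += ["-ud", str(interval_expand)]

    cmd.append(reference)

    # "all" means no gene arguments; otherwise each gene is passed
    # as its own argument.
    if gene_name != "all":
        cmd += gene_name.split()
    return cmd


def add_chr_prefix(bed_line):
    """Hypothetical sketch: prefix the chromosome column with 'chr'."""
    # Leave header/comment lines and already-prefixed lines untouched.
    if bed_line.startswith("#") or bed_line.startswith("chr"):
        return bed_line
    return "chr" + bed_line
```

A list built this way could be passed to `subprocess.run(cmd, check=True, ...)`, which raises subprocess.CalledProcessError on a non-zero exit status, in line with the exception documented above.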