Phenotype Module
Phenotype data loading and integration
Phenotype integration module.
This module loads a phenotype file containing sample-to-phenotype mappings and provides a function to aggregate phenotypes for a given list of samples.
The phenotype file must be .csv or .tsv (detected by extension).
The specified sample and phenotype columns must be present in the file.
Phenotypes are stored in a dictionary (sample -> set of phenotypes).
Given a list of samples, phenotypes are aggregated as follows:
- For each sample, join multiple phenotypes by “,”.
- For multiple samples, join each sample’s phenotype string by “;”.
- variantcentrifuge.phenotype.load_phenotypes(phenotype_file, sample_column, phenotype_column)
Load phenotypes from a .csv or .tsv file into a dictionary.
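- Parameters:
phenotype_file – Path to the phenotype file (.csv or .tsv).
sample_column – Name of the column containing sample identifiers.
phenotype_column – Name of the column containing phenotype values.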
- Returns:
A dictionary mapping each sample to a set of associated phenotypes.
- Return type:
dict of {str: set of str}
- Raises:
ValueError – If the file is not .csv or .tsv, or if the required columns are not found.
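The loading behavior described above can be sketched with the standard library. This is a minimal illustration, not the package's implementation: the name load_phenotypes_sketch, the use of csv.DictReader, the skipping of rows with an empty sample or phenotype, and the demo column names and values are all assumptions made for the example.

```python
import csv
import os
import tempfile
from collections import defaultdict


def load_phenotypes_sketch(phenotype_file, sample_column, phenotype_column):
    """Hypothetical stand-in for load_phenotypes; not the library's code."""
    # Choose the delimiter from the file extension, as the docs describe.
    ext = os.path.splitext(phenotype_file)[1].lower()
    if ext == ".csv":
        delimiter = ","
    elif ext == ".tsv":
        delimiter = "\t"
    else:
        raise ValueError("Phenotype file must be .csv or .tsv")

    phenotypes = defaultdict(set)
    with open(phenotype_file, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        fieldnames = reader.fieldnames or []
        # Both requested columns must exist in the header row.
        for column in (sample_column, phenotype_column):
            if column not in fieldnames:
                raise ValueError(f"Column {column!r} not found in file")
        for row in reader:
            sample = row[sample_column]
            phenotype = row[phenotype_column]
            # Assumption: rows with an empty sample or phenotype are skipped.
            if sample and phenotype:
                phenotypes[sample].add(phenotype)
    return dict(phenotypes)


# Demo with a throwaway TSV file; sample IDs and phenotype labels are made up.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "phenotypes.tsv")
    with open(demo_path, "w", encoding="utf-8") as out:
        out.write("sample\tphenotype\n")
        out.write("S1\tPhenoA\n")
        out.write("S1\tPhenoB\n")
        out.write("S2\tPhenoC\n")
    demo_phenotypes = load_phenotypes_sketch(demo_path, "sample", "phenotype")

print(demo_phenotypes)
```

Storing a set per sample means a phenotype listed twice for the same sample is kept only once.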
- variantcentrifuge.phenotype.aggregate_phenotypes_for_samples(samples, phenotypes)
Aggregate phenotypes for a given list of samples into a single string.
For each sample:
- Join multiple phenotypes with “,”.
For multiple samples:
- Join each sample’s phenotype string with “;”.
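- Parameters:
samples – List of sample identifiers to aggregate phenotypes for.
phenotypes – Dictionary mapping each sample to a set of phenotypes, as returned by load_phenotypes.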
- Returns:
A string aggregating all phenotypes for the given samples, with phenotypes comma-separated per sample, and samples separated by “;”.
- Return type:
str
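The joining rule can be illustrated with a short sketch. The function name aggregate_phenotypes_sketch is hypothetical, and two behaviors here are assumptions rather than documented facts: phenotypes are sorted within each sample so the output is reproducible, and a sample missing from the dictionary contributes an empty string.

```python
def aggregate_phenotypes_sketch(samples, phenotypes):
    """Hypothetical stand-in for aggregate_phenotypes_for_samples."""
    per_sample = []
    for sample in samples:
        # Assumption: a sample missing from the mapping yields an empty string.
        terms = phenotypes.get(sample, set())
        # Assumption: sort so the result does not depend on set ordering.
        per_sample.append(",".join(sorted(terms)))
    # Samples are separated by ";", phenotypes within a sample by ",".
    return ";".join(per_sample)


# Made-up sample IDs and phenotype labels, for illustration only.
demo = {"S1": {"PhenoB", "PhenoA"}, "S2": {"PhenoC"}}
result = aggregate_phenotypes_sketch(["S1", "S2"], demo)
print(result)  # PhenoA,PhenoB;PhenoC
```

Because the two separators differ, the result can be split back into per-sample groups on “;” and then into individual phenotypes on “,”, provided the phenotype values themselves contain neither character.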