cellink.tl.run_snpeff
- cellink.tl.run_snpeff(command='snpEff', genome_assembly='GRCh38.104', input_vcf='variants.vcf', output='variants_snpeff_annotated.txt', return_annos=True, **kwargs)
Annotates variants using the SnpEff command-line tool.
Requires SnpEff to be installed and a valid genome database to be specified in the config. If you install SnpEff via conda, command should be snpeff; if you install SnpEff by downloading the jar file, command should be the full Java invocation including the path to the SnpEff jar file, e.g. java -Xmx8g -jar /opt/miniconda3/envs/ENV_NAME/share/snpeff-5.2-1/snpEff.jar
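The two installation routes lead to two different values for command. The sketch below shows one way to build that value; the helper name snpeff_command and its parameters jar_path, java_heap and executable are hypothetical and not part of cellink, and the PATH check is an assumption about how you might want to fail early.

```python
import shutil


def snpeff_command(jar_path=None, java_heap="8g", executable="snpeff"):
    """Hypothetical helper that builds a value for the ``command`` argument."""
    if jar_path is not None:
        # Jar download: invoke SnpEff through Java with an explicit heap size.
        return f"java -Xmx{java_heap} -jar {jar_path}"
    # Conda install: the executable is expected to be on PATH.
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")
    return executable
```

The result could then be passed as command=snpeff_command(jar_path="/path/to/snpEff.jar"), where the path is a placeholder for your own installation.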
- Parameters:
command (str) – Command used to invoke SnpEff, as described above. Defaults to “snpEff”.
genome_assembly (str) – Name of the SnpEff genome database to annotate against. Defaults to “GRCh38.104”.
input_vcf (str) – Path to the input VCF file containing variants to annotate. Defaults to “variants.vcf”.
output (str) – Path to the file where the annotated variants will be written. Defaults to “variants_snpeff_annotated.txt”.
return_annos (bool) – Whether to return the annotations as a Pandas DataFrame after writing them to disk. Defaults to True.
**kwargs (dict) – Additional keyword arguments to be passed as command-line options to SnpEff. These are formatted as --key value.
- Returns:
None or pandas.DataFrame – Returns None if return_annos is False. If True, returns a DataFrame of the annotations loaded from the output file.
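A minimal usage sketch based on the signature above, assuming cellink and SnpEff are installed and that the input VCF exists. The names annotate_variants and preview_cli_options are hypothetical and not part of cellink. The preview helper only illustrates the documented "--key value" formatting of extra keyword arguments; it is not cellink's actual implementation.

```python
def preview_cli_options(**kwargs):
    """Hypothetical illustration of the documented ``--key value`` formatting.

    Not cellink's implementation; it only mirrors what the docs describe.
    """
    parts = []
    for key, value in kwargs.items():
        parts.extend([f"--{key}", str(value)])
    return parts


def annotate_variants(vcf_path="variants.vcf",
                      out_path="variants_snpeff_annotated.txt",
                      **snpeff_options):
    """Hypothetical wrapper: run the annotation and return the DataFrame."""
    # Imported lazily so this sketch can be loaded without cellink installed.
    from cellink.tl import run_snpeff

    return run_snpeff(
        command="snpEff",
        genome_assembly="GRCh38.104",
        input_vcf=vcf_path,
        output=out_path,
        return_annos=True,  # ask for the annotations back as a DataFrame
        **snpeff_options,   # forwarded to SnpEff as extra command-line options
    )
```

Calling annotate_variants() writes the annotations to out_path and returns them as a DataFrame; with return_annos=False the function would only write the file and return None.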