cellink.tl.external.run_tensorqtl
- cellink.tl.external.run_tensorqtl(dd, n_pcs=50, mode=None, permutations=10000, cis_output=None, interaction_df=None, susie_loci=None, window=1000000, pval_threshold=1e-05, logp=False, maf_threshold=0, maf_threshold_interaction=0.05, dosages=False, return_dense=False, return_r2=False, best_only=False, output_text=False, batch_size=20000, chunk_size=None, disable_beta_approx=False, warn_monomorphic=True, max_effects=10, fdr=0.05, qvalue_lambda=None, seed=None, prefix=None, encode_sex=True, encode_age=True, additional_covariates=None, dtype='float32', use_python_api=False, run=True, read_results=True, save_cmd_file=False, plink_export_kwargs={}, remove_intermediate_files=True, overwrite_covariates_export=True, overwrite_phenotype_export=True, overwrite_plink_export=True)
Run cis- or trans-QTL mapping using TensorQTL on donor-level aggregated expression and genotype data.
- Parameters:
dd (DonorData) – DonorData object containing single-cell gene expression (dd.C) and donor-level genotype data (dd.G).
mode ({'cis_nominal', 'cis_independent', 'cis', 'trans', 'cis_susie', 'trans_susie'}, optional) – Type of QTL analysis to perform.
prefix (str, optional) – File prefix used for generating intermediate input/output files. Required for most modes.
cis_output (str, optional) – Path to output file for cis_independent and cis_susie modes.
interaction_df (str, optional) – Path to interaction terms file required for cis_nominal mode.
susie_loci (str, optional) – Path to SuSiE loci file required for trans_susie mode.
permutations (int, default=10000) – Number of permutations used for empirical cis-QTL analysis.
fdr (float, default=0.05) – False Discovery Rate threshold for significant hits in empirical cis-QTL mode.
qvalue_lambda (float, optional) – Lambda parameter for q-value estimation in empirical mode.
window (int, default=1000000) – Genomic window (in base pairs) around phenotype for filtering cis effects.
pval_threshold (float, default=1e-5) – P-value threshold for reporting significant QTL associations.
maf_threshold (float, default=0) – Minimum minor allele frequency (MAF) a variant must have to be included in QTL analysis.
maf_threshold_interaction (float, default=0.05) – MAF threshold for interaction terms in cis_nominal mode.
best_only (bool, default=False) – If True, only report the best association per phenotype (only applies to some modes).
batch_size (int, default=20000) – Number of phenotype-variant pairs processed per batch (important for trans modes).
chunk_size (int or str, optional) – Size of variant chunks processed in cis modes. Can be string like “1M” or integer base pairs.
max_effects (int, default=10) – Maximum number of independent signals to detect in SuSiE-based modes (maps to L parameter).
seed (int, optional) – Random seed for reproducibility, especially for permutation testing.
logp (bool, default=False) – If True, output -log10(p-values) instead of raw p-values.
dosages (bool, default=False) – If True, use dosage data for association testing (if available).
return_dense (bool, default=False) – If True, return dense matrix results (applies to trans-QTL mode).
return_r2 (bool, default=False) – If True, include r² statistics in results.
output_text (bool, default=False) – If True, also output results as text files.
disable_beta_approx (bool, default=False) – If True, disables the Beta-distribution approximation of empirical p-values.
warn_monomorphic (bool, default=True) – If True, warnings are issued for monomorphic variants.
n_pcs (int, default=50) – Number of principal components to compute from single-cell expression data if PCA not already present.
encode_sex (bool, default=True) – If True, includes donor sex as a covariate.
encode_age (bool, default=True) – If True, includes donor age (z-normalized if needed) as a covariate.
additional_covariates (list of str, optional) – Additional covariates from dd.G.obs or dd.G.obsm to include in the model.
dtype (str, default="float32") – Data type to cast covariates and matrices for QTL model input.
use_python_api (bool, default=False) – If True, runs TensorQTL directly via its Python API without exporting intermediate files or invoking a subprocess. Genotypes are loaded from dd.G in memory. If False (default), the CLI-based workflow is used, which exports PLINK, phenotype, and covariate files and calls TensorQTL via subprocess.
run (bool, default=True) – If True, executes the TensorQTL command. If False, returns the constructed command as a string. Only applies when use_python_api=False.
read_results (bool, default=True) – If True, reads and returns the result files. If False, returns the paths to the output files. Only applies when use_python_api=False.
save_cmd_file (bool, default=False) – If True, saves the constructed TensorQTL command to a file instead of printing. Only applies when use_python_api=False.
plink_export_kwargs (dict, optional) – Additional keyword arguments for the to_plink function. Only applies when use_python_api=False.
remove_intermediate_files (bool, default=True) – If True, removes the intermediate files. Only applies when use_python_api=False.
overwrite_covariates_export (bool, default=True) – If True, overwrites the covariates export.
overwrite_phenotype_export (bool, default=True) – If True, overwrites the phenotype export.
overwrite_plink_export (bool, default=True) – If True, overwrites the plink export.
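The filtering and reporting parameters above (window, maf_threshold, logp) can be illustrated with a small, library-free sketch. This is not cellink's or TensorQTL's implementation: the helper names (`in_cis_window`, `minor_allele_frequency`, `neg_log10`) and the toy variants are hypothetical, the window is assumed to be measured symmetrically from a single phenotype position, and the MAF comparison is assumed to be inclusive.

```python
import math


def in_cis_window(variant_pos, phenotype_pos, window=1_000_000):
    """True if the variant lies within +/- `window` bp of the phenotype position."""
    return abs(variant_pos - phenotype_pos) <= window


def minor_allele_frequency(genotypes):
    """MAF from diploid alt-allele counts (0, 1 or 2), assuming no missing calls."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def neg_log10(pval):
    """What logp=True reports in place of a raw p-value."""
    return -math.log10(pval)


# Toy data (made up): variant name -> (position in bp, genotypes of four donors)
variants = {
    "var_near_common": (1_200_000, [0, 1, 2, 1]),
    "var_far": (5_000_000, [0, 1, 1, 0]),
    "var_monomorphic": (900_000, [0, 0, 0, 0]),
}
phenotype_pos = 1_000_000  # assumed reference position of the gene

# Keep variants inside the window whose MAF reaches an example threshold of 0.05.
kept = [
    name
    for name, (pos, genotypes) in variants.items()
    if in_cis_window(pos, phenotype_pos)
    and minor_allele_frequency(genotypes) >= 0.05
]
```

Under an inclusive comparison, the default maf_threshold=0 would not exclude the monomorphic variant (its MAF is 0), which is presumably why warn_monomorphic exists as a separate switch.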
- Return type:
Union[DataFrame, Tuple[DataFrame, DataFrame, DataFrame], Tuple[dict, DataFrame], str]
- Returns:
pd.DataFrame, tuple, str, or list[str], depending on mode and read_results:
  - If use_python_api=True, or run=True and read_results=True: pandas DataFrame(s) or a tuple of results.
  - If run=True and read_results=False: a list of output file paths.
  - If run=False: the constructed TensorQTL command as a string.
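Because the return value depends on use_python_api, run, and read_results, calling code may want to branch on its type. The sketch below uses a hypothetical helper, `describe_result`, and relies only on the return types listed above; it assumes that anything other than a string, list, or tuple is a single DataFrame-like result.

```python
def describe_result(result):
    """Classify a run_tensorqtl return value by its type (hypothetical helper)."""
    if isinstance(result, str):
        return "command"   # run=False: the constructed TensorQTL command
    if isinstance(result, list):
        return "paths"     # run=True, read_results=False: output file paths
    if isinstance(result, tuple):
        return "multiple"  # several result objects returned together
    return "table"         # assumed to be a single DataFrame-like result
```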
- Raises:
ImportError – If required dependencies (plink2, tensorqtl) are not found in system path.
ValueError – If required parameters (prefix, cis_output, susie_loci) are not provided for the selected mode.
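A minimal usage sketch, assuming cellink and TensorQTL are installed and that `dd` is an existing DonorData object. The keyword arguments are taken from the signature above; the wrapper name `build_cis_nominal_command`, the prefix value, and the re-raised error messages are illustrative only. With run=False the function is documented to return the command string instead of executing it, which makes it possible to inspect the call before running anything.

```python
def build_cis_nominal_command(dd, prefix="my_eqtl_run"):
    """Return the TensorQTL command for a cis-nominal scan without executing it."""
    # Imported lazily so this sketch can be loaded without cellink installed.
    from cellink.tl.external import run_tensorqtl

    try:
        return run_tensorqtl(
            dd,
            mode="cis_nominal",
            prefix=prefix,          # illustrative file prefix
            window=1_000_000,       # documented default, written out for clarity
            maf_threshold=0.05,     # example value; the documented default is 0
            run=False,              # return the command string, do not execute
        )
    except ImportError as err:
        # Documented to be raised when plink2 or tensorqtl cannot be found.
        raise RuntimeError("plink2 and tensorqtl must be available") from err
    except ValueError as err:
        # Documented to be raised when a parameter required by the mode is missing.
        raise RuntimeError(f"missing required parameter: {err}") from err
```

Switching to run=True (the default) with read_results=True would instead return the parsed results, as described in the Returns section.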