cellink.tl.external.run_jaxqtl
- cellink.tl.external.run_jaxqtl(dd, prefix=None, out=None, n_pcs=50, add_covar=None, covar_test=None, rm_covar=None, model='NB', mode='cis', ld_type=None, platform=None, test_method='score', window=500000, nperm=1000, max_iter=None, perm_seed=None, addpc=2, prop_cutoff=None, express_percent=None, offset=None, indlist=None, cond_snp=None, robust=False, rare_snp=False, autosomal_only=False, perm_pheno=False, qvalue=False, no_offset=False, standardize=True, statsmodel=False, verbose=False, encode_sex=True, encode_age=True, additional_covariates=None, dtype='float32', run=True, read_results=True, save_cmd_file=False, plink_export_kwargs={}, remove_intermediate_files=True, overwrite_covariates_export=True, overwrite_phenotype_export=True, overwrite_plink_export=True)
Run cis- or trans-eQTL mapping using jaxQTL on donor-level genotype and aggregated expression data.
This function prepares input files from a DonorData object, builds a command to invoke the jaxqtl binary, and optionally executes it. Covariates such as age, sex, and additional user-specified variables are encoded and included in the model. Supports multiple modes including nominal testing, permutation-based cis-QTL mapping, trans-QTL mapping, and LD estimation.
- Parameters:
dd (DonorData) – Object containing donor-level genotype (dd.G) and cell-level expression data (dd.C).
prefix (str, optional) – Prefix for temporary and output files. If not provided, defaults to “jaxqtl_temp”.
out (str, optional) – Output file prefix for jaxQTL results.
n_pcs (int, default=50) – Number of principal components to compute if not already present in dd.C.obsm["X_pca"].
add_covar (str, optional) – Path to file with additional covariates to include.
covar_test (str, optional) – Covariate to test for inclusion in the model.
rm_covar (str, optional) – Covariate to exclude from the model.
model ({'gaussian', 'poisson', 'NB'}, default='NB') – Statistical model used for QTL testing.
mode ({'nominal', 'cis', 'cis_acat', 'fitnull', 'covar', 'trans', 'estimate_ld_only'}, default='cis') – Analysis mode for jaxQTL.
ld_type ({'raw', 'glm_wt', 'no_glm_wt'}, optional) – Type of linkage disequilibrium estimation to use.
platform ({'cpu', 'gpu', 'tpu'}, optional) – Hardware backend for running jaxQTL.
test_method ({'wald', 'score'}, default='score') – Statistical test to use for association testing.
window (int, default=500000) – Genomic window (in base pairs) for cis or trans testing.
nperm (int, default=1000) – Number of permutations to perform for empirical FDR estimation.
max_iter (int, optional) – Maximum number of iterations for model fitting.
perm_seed (int, optional) – Seed for permutation reproducibility.
addpc (int, default=2) – Number of genotype PCs to include as covariates.
prop_cutoff (float, optional) – Minimum proportion of cells required to include a gene.
express_percent (float, optional) – Minimum percentage of donors expressing a gene for inclusion.
offset (str, optional) – Path to offset vector for GLM models.
indlist (str, optional) – File containing list of individual IDs to include.
cond_snp (str, optional) – File with SNPs to condition on in the analysis.
robust (bool, default=False) – If True, enables robust standard error estimation.
rare_snp (bool, default=False) – If True, includes rare variants in the analysis.
autosomal_only (bool, default=False) – If True, restricts analysis to autosomal chromosomes.
perm_pheno (bool, default=False) – If True, permutes phenotypes instead of genotypes.
qvalue (bool, default=False) – If True, calculates q-values for multiple testing correction.
no_offset (bool, default=False) – If True, disables model offset.
standardize (bool, default=True) – If True, standardizes phenotype and genotype data before analysis.
statsmodel (bool, default=False) – If True, uses statsmodels GLM implementation.
verbose (bool, default=False) – If True, prints detailed logging from jaxQTL.
encode_sex (bool, default=True) – If True, adds sex as a categorical covariate using dd.G.obs['sex'].
encode_age (bool, default=True) – If True, adds age as a numeric covariate from dd.G.obs['age'].
additional_covariates (list of str, optional) – Additional covariates to extract from dd.G.obs or dd.G.obsm and include in the model.
dtype (str, default='float32') – Data type for numerical covariate matrices.
run (bool, default=True) – If True, executes the jaxQTL command. If False, returns the constructed command as a string.
read_results (bool, default=True) – If True, reads and returns the result files as a pandas DataFrame. If False, returns the path(s) to the output files.
save_cmd_file (str, default=False) – If a file path is provided, saves the jaxQTL command to this file instead of printing it.
plink_export_kwargs (dict, optional) – Additional keyword arguments for the to_plink function.
remove_intermediate_files (bool, default=True) – If True, removes the intermediate files.
overwrite_covariates_export (bool, default=True) – If True, overwrites the covariates export.
overwrite_phenotype_export (bool, default=True) – If True, overwrites the phenotype export.
overwrite_plink_export (bool, default=True) – If True, overwrites the plink export.
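The enumerated options above can be checked before any files are exported. The following is a minimal sketch and not part of cellink: `check_jaxqtl_options` is a hypothetical helper, and the allowed values are copied from the parameter descriptions above rather than from the jaxQTL source.

```python
# Allowed values, copied from the parameter list above.
# None is accepted where the parameter is documented as optional.
ALLOWED = {
    "model": {"gaussian", "poisson", "NB"},
    "mode": {"nominal", "cis", "cis_acat", "fitnull", "covar", "trans", "estimate_ld_only"},
    "ld_type": {"raw", "glm_wt", "no_glm_wt", None},
    "platform": {"cpu", "gpu", "tpu", None},
    "test_method": {"wald", "score"},
}


def check_jaxqtl_options(**options):
    """Hypothetical helper: reject values that the documentation does not list.

    Options without an enumerated set (e.g. window) are passed through unchecked.
    Returns the options unchanged so they can be forwarded as keyword arguments.
    """
    for name, value in options.items():
        allowed = ALLOWED.get(name)
        if allowed is not None and value not in allowed:
            choices = sorted(str(v) for v in allowed)
            raise ValueError(f"{name}={value!r} is not one of {choices}")
    return options


# Example: options for a nominal pass with the documented defaults otherwise.
kwargs = check_jaxqtl_options(model="NB", mode="nominal", test_method="score", window=500000)
```

The validated dictionary could then be forwarded, for example as `run_jaxqtl(dd, run=False, **kwargs)`, which according to the description of `run` returns the command string without executing it.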
- Return type:
DataFrame|str
- Returns:
pd.DataFrame, str, or list[str]. If run=True and read_results=True, returns a pandas DataFrame of QTL mapping results. If run=True and read_results=False, returns a list of output file paths. If run=False, returns the constructed jaxQTL command as a string.
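Because the return value depends on `run` and `read_results`, calling code may want to branch on its type. The following is a minimal sketch that assumes only the three shapes listed above. `summarize_jaxqtl_output` is a hypothetical helper, not part of cellink. It treats a bare string as the command, and to avoid importing pandas it treats anything that is neither a string nor a list as the results table.

```python
def summarize_jaxqtl_output(result):
    """Hypothetical helper: describe what run_jaxqtl returned."""
    if isinstance(result, str):
        # run=False: the constructed command string.
        return f"command: {result}"
    if isinstance(result, list):
        # run=True, read_results=False: paths to the output files.
        return f"{len(result)} output file(s)"
    # run=True, read_results=True: assumed to be a DataFrame-like table.
    return f"results table with {len(result)} row(s)"
```

A call such as `summarize_jaxqtl_output(run_jaxqtl(dd, run=False))` would then report the command without executing it.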
- Raises:
ImportError – If jaxqtl is not installed or not found in the system PATH.
ValueError – If required covariates are not found in the DonorData object.
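Both errors can be anticipated before calling the function. The following is a minimal sketch under two assumptions: that the executable is named `jaxqtl`, as the ImportError description suggests, and that the required covariates are the `sex` and `age` columns mentioned for `encode_sex` and `encode_age`. `preflight` is a hypothetical helper and does not reproduce cellink's own checks.

```python
import shutil


def preflight(obs_columns, encode_sex=True, encode_age=True, additional_covariates=None):
    """Hypothetical pre-check; returns a list of problems (empty if none were found).

    obs_columns: column names of the donor-level table, e.g. dd.G.obs.columns.
    Covariates stored in dd.G.obsm are not covered by this sketch.
    """
    problems = []

    # Assumption: the binary is called "jaxqtl" and must be on the system PATH.
    if shutil.which("jaxqtl") is None:
        problems.append("jaxqtl executable not found on PATH")

    # Collect the covariate columns that the chosen flags would need.
    required = []
    if encode_sex:
        required.append("sex")
    if encode_age:
        required.append("age")
    required.extend(additional_covariates or [])

    missing = [name for name in required if name not in obs_columns]
    if missing:
        problems.append(f"missing covariates: {missing}")
    return problems
```

For example, `preflight(dd.G.obs.columns, additional_covariates=["batch"])` would list `batch` as missing if no such column exists, where `batch` is only an illustrative column name.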