Skip to main content
Ctrl+K

cellink

  • Tutorials
    • Tutorial: Pseudobulk eQTL Analysis with cellink
    • Tutorial: eQTL Analysis with JaxQTL and TensorQTL using cellink
    • Tutorial: Annotating Genetic Variants with cellink
    • Tutorial: Rare Variant Association Testing with cellink
    • Tutorial: LD Clumping and Identifying Independent Signals with cellink
    • Tutorial: Colocalization Analysis - Linking eQTLs to GWAS Signals with cellink
    • Tutorial: Integrating GWAS with Single-Cell Data using cellink
    • Tutorial: Spatially Resolved GWAS Mapping with gsMap
    • Tutorial: eQTL Analysis with SAIGE-QTL using cellink
    • Tutorial: Using EHR Data as Donor-Level Input in cellink
    • Tutorial: Using the MILDataset and PyTorch DataLoader in cellink
  • API
    • DonorData
      • cellink.DonorData
    • Preprocessing pp
      • cellink.pp.variant_qc
      • cellink.pp.cell_level_obs_filter
      • cellink.pp.donor_level_obs_filter
      • cellink.pp.donor_level_var_filter
      • cellink.pp.log_transform
      • cellink.pp.low_abundance_filter
      • cellink.pp.missing_values_filter
      • cellink.pp.normalize
    • Input-Output io
      • cellink.io.from_sgkit_dataset
      • cellink.io.read_plink
      • cellink.io.read_bgen
      • cellink.io.read_sgkit_zarr
      • cellink.io.read_pgen_zarr
      • cellink.io.stream_pgen_to_zarr
      • cellink.io.to_plink
      • cellink.io.write_variants_to_vcf
    • Tools tl
      • cellink.tl.get_snp_df
      • cellink.tl.run_favor
      • cellink.tl.run_snpeff
      • cellink.tl.run_vep
      • cellink.tl.add_vep_annos_to_gdata
      • cellink.tl.combine_annotations
      • cellink.tl.aggregate_annotations_for_varm
      • cellink.tl.run_burden_test
      • cellink.tl.run_skat_test
      • cellink.tl.beta_weighting
    • External tools tl.external
      • cellink.tl.external.calculate_ld
      • cellink.tl.external.run_jaxqtl
      • cellink.tl.external.read_jaxqtl_results
      • cellink.tl.external.run_mixmil
      • cellink.tl.external.calculate_pcs
      • cellink.tl.external.run_tensorqtl
      • cellink.tl.external.read_tensorqtl_results
      • cellink.tl.external.run_scdrs
      • cellink.tl.external.run_seismic
      • cellink.tl.external.run_magma_pipeline
      • cellink.tl.external.run_saigeqtl
      • cellink.tl.external.configure_saigeqtl_runner
      • cellink.tl.external.get_saigeqtl_runner
      • cellink.tl.external.make_group_file
      • cellink.tl.external.read_saigeqtl_results
      • cellink.tl.external.load_gsmap_results
      • cellink.tl.external.format_gsmap_sumstats
    • Plotting
      • cellink.pl.locus
      • cellink.pl.manhattan
      • cellink.pl.qq
      • cellink.pl.expression_by_genotype
      • cellink.pl.volcano
    • Machine Learning ml
      • cellink.ml.MILDataset
      • cellink.ml.mil_collate_fn
      • cellink.ml.DonorMILModel
    • Association Testing at
      • cellink.at.acat_test
      • cellink.at.compute_acat
      • cellink.at.GWAS
      • cellink.at.Skat
    • Utils
      • cellink.utils.column_normalize
      • cellink.utils.gaussianize
      • cellink.utils.one_hot_encode_genotypes
      • cellink.utils.dosage_per_strand
    • Resources
      • cellink.resources.get_1000genomes
      • cellink.resources.get_1000genomes_grch38
      • cellink.resources.get_dummy_onek1k
      • cellink.resources.get_onek1k
      • cellink.resources.get_eqtl_catalog_dataset_associations
      • cellink.resources.get_eqtl_catalog_datasets
      • cellink.resources.get_gwas_catalog_studies
      • cellink.resources.get_gwas_catalog_study
      • cellink.resources.get_gwas_catalog_study_summary_stats
      • cellink.resources.get_pgs_catalog_score
      • cellink.resources.get_pgs_catalog_scores
      • cellink.resources.get_1000genomes_ld_scores
      • cellink.resources.get_1000genomes_ld_weights
  • Changelog
  • Contributing guide
  • References
  • .ipynb

Tutorial: eQTL Analysis with JaxQTL and TensorQTL using cellink

Contents

  • Environment Setup
  • Load and Prepare Data
  • Data Preprocessing
  • JaxQTL Analysis
    • Running JaxQTL with Default Parameters
    • Advanced JaxQTL Analysis with Custom Parameters
    • JaxQTL Trans-QTL Analysis
  • TensorQTL Analysis
    • Basic Cis-QTL Analysis with TensorQTL
    • Nominal Cis-QTL Analysis
    • Cis-Independent QTL Analysis
    • Trans-QTL Analysis with TensorQTL
    • SuSiE Finemapping with TensorQTL
      • Cis-SuSiE Finemapping
  • Advanced Usage: Dry Run and Command Generation
    • JaxQTL Command Generation
    • TensorQTL Command Generation
      • Saving Commands to Files

Tutorial: eQTL Analysis with JaxQTL and TensorQTL using cellink#

This tutorial demonstrates how to perform eQTL analysis using external tools JaxQTL and TensorQTL through the cellink package. The cellink package provides a unified interface to these powerful QTL mapping tools, making it easier to perform comprehensive genetic analyses on single-cell datasets. These tools provide powerful statistical methods for detecting quantitative trait loci (QTLs) in genomic data, with JaxQTL offering fast GPU-accelerated analysis and TensorQTL providing comprehensive cis- and trans-QTL mapping capabilities.

This notebook assumes familiarity with single-cell data processing and basic statistical genetics concepts. The cellink package provides convenient wrapper functions that handle data preparation and formatting for these external tools. JaxQTL currently is not available via pip or conda. Please follow the instructions here. To use TensorQTL you can install it via pip install 'cellink[tensorqtl]'. TensorQTL also requires plink2. For visualization of QTL calling results, please consider checking out the Tutorial: Pseudobulk eQTL Analysis with cellink.

Environment Setup#

We begin by importing necessary libraries and defining key parameters for our analysis. The cellink package provides wrapper functions for JaxQTL and TensorQTL that automatically handle data formatting and preparation.

import numpy as np

from cellink.resources import get_dummy_onek1k
from cellink.tl.external import run_jaxqtl, run_tensorqtl

# Analysis parameters
n_gpcs = 20
n_epcs = 15
batch_e_pcs_n_top_genes = 2000
chrom = 22
cis_window = 500_000
cell_type = "CD8 Naive"
celltype_key = "predicted.celltype.l2"
original_donor_col = "donor_id"

Load and Prepare Data#

We load the OneK1K dataset, which contains both genotype and single-cell expression data. We also add gene annotations from Ensembl, which are essential for defining genomic positions and cis-windows. (This is a subset of the full OneK1K dataset, which can be downloaded, and prepared using get_onek1k())

# Load the dataset
dd = get_dummy_onek1k(config_path="../../src/cellink/resources/config/dummy_onek1k.yaml", verify_checksum=False)
print(f"Dataset shape: {dd.shape}")

dd.G.obsm["gPCs"] = dd.G.obsm["gPCs"][dd.G.obsm["gPCs"].columns[:n_gpcs]]
[2025-12-29 03:06:35,629] INFO:root: /Users/larnoldt/cellink_data/dummy_onek1k/dummy_onek1k.dd.h5 already exists
[2025-12-29 03:06:35,629] WARNING:root: No checksum provided, skipping verification
[2025-12-29 03:06:36,994] INFO:root: Loaded dummy OneK1K dataset: (100, 146939, 125366, 34073)
Dataset shape: (100, 146939, 125366, 34073)

Data Preprocessing#

We filter the dataset to focus on a specific cell type and prepare the data for eQTL analysis.

dd.aggregate(obs=["donor_id", "sex", "age"], func="first", add_to_obs=True)
# Filter to specific cell type
dd = dd[..., dd.C.obs[celltype_key] == cell_type, :].copy()
print(f"After cell type filtering: {dd.shape}")

# Add donor-level metadata
dd.G.obs["donor_sex"] = dd.G.obs["sex"]
dd.G.obs["donor_age"] = dd.G.obs["age"]

# Generate random labels for demonstration (replace with real phenotypes)
dd.G.obs["donor_labels"] = np.random.randint(2, size=len(dd.G.obs))

# Filter to specific chromosome for faster analysis
dd = dd.sel(G_var=dd.G.var.chrom == str(chrom), C_var=dd.C.var.chrom == str(chrom)).copy()
print(f"After chromosome {chrom} filtering: {dd.shape}")
After cell type filtering: (100, 146939, 4756, 34073)
After chromosome 22 filtering: (100, 136776, 4756, 871)

To speed up the computation we also filter for the number of SNPs.

dd = dd[:, dd.G.var["pos"] < 17584955, :, :].copy()

JaxQTL Analysis#

JaxQTL is a fast, GPU-accelerated tool for QTL mapping. It supports various statistical models and can handle large-scale genomic data efficiently. JaxQTL can be run in various modes: ["nominal", "cis", "cis_acat", "fitnull", "covar", "trans", "estimate_ld_only"].

Running JaxQTL with Default Parameters#

The basic cis mode performs permutation-based cis-QTL mapping with empirical FDR estimation. This is the standard approach for identifying cis-eQTLs with appropriate multiple testing correction.

# Basic JaxQTL analysis
results_jaxqtl_basic = run_jaxqtl(
    dd,
    prefix="jaxqtl_basic",
    mode="cis",
    model="NB",  # Negative binomial model
    window=cis_window,
    additional_covariates=["gPCs"],
    run=True,
)
results_jaxqtl_basic
[2025-12-29 03:02:45,533] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:02:45,533] INFO:cellink._core.donordata: Observation found for 100 donors.
Writing FAM... done.
Writing BIM... done.
phenotype_id chrom num_var variant_id pos tss_distance ma_count af beta_shape1 beta_shape2 beta_converged opt_status true_nc pval_nominal slope slope_se pval_beta alpha_cov model_converged
0 ENSG00000177663 22 4161 22_17025474_T_C 17025474 -59481.0 35.0 0.175 52.109502 54.426431 1.0 True 1.942890e-16 0.482593 -0.744504 1.098446 0.447015 1.871933e-07 1.0
1 ENSG00000069998 22 4114 22_17212956_A_ATTTT 17212956 75444.0 5.0 0.975 21.068593 33.666325 1.0 True 1.942890e-16 0.422720 -1.353229 1.771991 0.721527 5.184935e-07 1.0
2 ENSG00000185837 22 4096 22_17035794_T_G 17035794 -123545.0 14.0 0.930 NaN NaN 0.0 True 3.267188e+00 0.999750 -23.416329 348.036692 NaN 1.780708e-06 1.0
3 ENSG00000093072 22 4081 22_16933780_G_A 16933780 -245011.0 1.0 0.995 21.659233 31.581654 1.0 True 1.942890e-16 0.391825 -2.492299 3.536479 0.418729 4.305565e-08 1.0
4 ENSG00000182902 22 2647 22_17577481_A_C 17577481 14030.0 1.0 0.995 2179.394106 3.329034 1.0 True 1.942890e-16 0.998652 -7.683877 333.268516 0.514215 2.774556e-03 1.0
5 ENSG00000236754 22 2608 22_17239813_C_T 17239813 -339574.0 7.0 0.965 1512.785525 2.894406 1.0 True 1.942890e-16 0.998596 -3.354731 201.541181 0.616794 1.314982e-05 1.0
6 ENSG00000131100 22 2563 22_17563390_G_C 17563390 -28747.0 69.0 0.345 20.036668 37.267971 1.0 True 1.942890e-16 0.282472 0.517867 0.489794 0.141640 9.334170e-05 1.0
7 ENSG00000099968 22 2407 22_17219920_C_T 17219920 -408936.0 59.0 0.295 13.350913 24.180304 1.0 True 1.942890e-16 0.334455 0.638422 0.675893 0.405157 7.940555e-05 1.0
8 ENSG00000015475 22 1724 22_17407251_T_C 17407251 -326888.0 3.0 0.985 36.027044 41.509726 1.0 True 1.942890e-16 0.537514 -1.237648 2.082027 0.900735 8.893963e-05 1.0
9 ENSG00000269220 22 1525 22_17351364_TTC_T 17351364 -425954.0 2.0 0.990 17.517949 19.103963 1.0 True 1.942890e-16 0.408392 -2.148595 2.861632 0.199174 1.000000e-08 1.0
10 ENSG00000243156 22 1494 22_17577481_A_C 17577481 -210169.0 1.0 0.995 9.233605 10.043320 1.0 True 1.942890e-16 0.569340 -3.940565 10.120073 0.786017 1.042847e-03 1.0
11 ENSG00000286990 22 1477 22_17550982_C_T 17550982 -249741.0 26.0 0.870 0.515312 0.087452 1.0 True 7.197109e+00 0.998998 4.655247 302.113200 0.640512 9.309502e-07 1.0
12 ENSG00000235445 22 490 22_17562393_A_T 17562393 -420813.0 35.0 0.175 1711.765504 2.414490 1.0 True 1.942890e-16 0.998003 2.431304 115.453803 0.216040 1.168994e-04 1.0
13 ENSG00000234913 22 393 22_17524692_CTTTTTTTTTTTTTT_C 17524692 -479579.0 1.0 0.995 4594.708261 4.565273 1.0 True 1.942890e-16 0.999388 8.537019 7763.970539 0.786336 4.152501e-06 1.0
14 ENSG00000225335 22 33 22_17582867_C_T 17582867 -492272.0 3.0 0.985 NaN NaN 0.0 True 1.058281e+00 0.999336 7.377999 726.025083 NaN 1.550050e-06 1.0
15 ENSG00000215193 22 17 22_17578563_A_G 17578563 -499361.0 65.0 0.675 12.825293 7.390283 1.0 True 1.942890e-16 0.634838 0.510581 1.085020 0.484888 5.840863e-07 1.0

Advanced JaxQTL Analysis with Custom Parameters#

The cis_acat mode uses the Aggregated Cauchy Association Test to combine p-values across multiple variants, providing a powerful approach for detecting associations when there are multiple causal variants in a region.

# Advanced JaxQTL analysis with more parameters
results_jaxqtl_advanced = run_jaxqtl(
    dd,
    prefix="jaxqtl_advanced",
    mode="cis_acat",  # ACAT-combined p-values
    model="gaussian",  # Gaussian model
    window=1000000,  # 1Mb window
    nperm=10000,  # Number of permutations
    test_method="wald",  # Wald test
    additional_covariates=["gPCs"],
    addpc=5,  # Number of genotype PCs to add
    standardize=True,
    verbose=True,
    run=True,
)
results_jaxqtl_advanced
[2025-12-29 03:03:39,828] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:03:39,829] INFO:cellink._core.donordata: Observation found for 100 donors.
Writing FAM... done.
Writing BIM... done.
chrom snp pos a0 a1 i phenotype_id tss tss_distance af ma_count pval_nominal slope slope_se converged alpha pval_acat
0 22 22_17009000_T_C 17009000 T C 1345 ENSG00000015475 17734139.0 -725139.0 0.920 16.0 2.877574e-03 -0.060722 0.019666 True 0.0 5.803468e-01
1 22 22_17524093_G_A 17524093 G A 3994 ENSG00000069998 17137512.0 386581.0 0.985 3.0 1.970923e-04 -0.144284 0.036742 True 0.0 5.543843e-01
2 22 22_17545374_C_CAAA 17545374 C CAAA 4109 ENSG00000093072 17178791.0 366583.0 0.960 8.0 1.307348e-05 -0.075339 0.016078 True 0.0 5.631898e-02
3 22 22_16849850_C_CA 16849850 C CA 734 ENSG00000099968 17628856.0 -779006.0 0.980 4.0 1.302300e-04 -0.146788 0.036267 True 0.0 3.125808e-02
4 22 22_17019092_T_A 17019092 T A 1388 ENSG00000131100 17592137.0 -573045.0 0.995 1.0 7.546834e-05 -0.397131 0.094481 True 0.0 1.132195e-01
5 22 22_17349688_C_T 17349688 C T 3007 ENSG00000177663 17084955.0 264733.0 0.955 9.0 9.499886e-04 0.060594 0.017568 True 0.0 5.397122e-01
6 22 22_16702145_A_G 16702145 A G 236 ENSG00000182902 17563451.0 -861306.0 0.980 4.0 4.909324e-09 -0.004212 0.000633 True 0.0 8.408108e-06
7 22 22_17547967_G_A 17547967 G A 4129 ENSG00000183785 18110101.0 -562134.0 0.985 3.0 1.428733e-09 -0.036549 0.005256 True 0.0 3.574742e-06
8 22 22_17225888_G_A 17225888 G A 2512 ENSG00000184979 18149844.0 -923956.0 0.995 1.0 1.143822e-14 -0.187680 0.019322 True 0.0 2.643713e-11
9 22 22_17031458_ATT_A 17031458 ATT A 1450 ENSG00000185837 17159339.0 -127881.0 0.995 1.0 8.238284e-16 -0.062261 0.006021 True 0.0 3.682221e-12
10 22 22_17165668_G_A 17165668 G A 2054 ENSG00000215193 18077924.0 -912256.0 0.980 4.0 1.082902e-04 -0.114595 0.027947 True 0.0 3.421433e-02
11 22 22_17393483_G_A 17393483 G A 3181 ENSG00000225335 18075139.0 -681656.0 0.990 2.0 2.106266e-71 -0.248278 0.003095 True 0.0 5.514205e-68
12 22 22_17413663_G_GATT 17413663 G GATT 3273 ENSG00000234913 18004271.0 -590608.0 0.995 1.0 1.423345e-20 -0.030281 0.002317 True 0.0 4.159015e-17
13 22 22_17335813_T_TA 17335813 T TA 2979 ENSG00000235445 17983206.0 -647393.0 0.975 5.0 2.441747e-07 -0.002361 0.000413 True 0.0 4.283879e-04
14 22 22_17238021_G_C 17238021 G C 2557 ENSG00000236754 17579387.0 -341366.0 0.965 7.0 7.724801e-08 -0.001254 0.000209 True 0.0 1.892409e-04
15 22 22_17031458_ATT_A 17031458 ATT A 1450 ENSG00000243156 17787650.0 -756192.0 0.995 1.0 1.859829e-07 -0.112233 0.019426 True 0.0 6.952086e-04
16 22 22_17351364_TTC_T 17351364 TTC T 3020 ENSG00000269220 17777318.0 -425954.0 0.990 2.0 4.115031e-08 -0.136962 0.022279 True 0.0 3.119761e-05
17 22 22_17552774_G_GTTTTTTT 17552774 G GTTTTTTT 4149 ENSG00000278558 18527803.0 -975029.0 0.660 68.0 8.157338e-03 0.013569 0.004985 True 0.0 5.006843e-01
18 22 22_17421100_T_A 17421100 T A 3324 ENSG00000280007 18110760.0 -689660.0 0.995 1.0 9.481115e-11 -0.094931 0.012502 True 0.0 7.866920e-08
19 22 22_17435684_T_TAA 17435684 T TAA 3466 ENSG00000286990 17800723.0 -365039.0 0.980 4.0 9.338400e-15 -0.025140 0.002575 True 0.0 3.486283e-11

JaxQTL Trans-QTL Analysis#

Trans-QTL analysis identifies associations between variants and genes on different chromosomes. These typically have smaller effect sizes than cis-QTLs and require larger sample sizes to detect.

# Trans-QTL analysis with JaxQTL
results_jaxqtl_trans = run_jaxqtl(
    dd, prefix="jaxqtl_trans", mode="trans", model="gaussian", additional_covariates=["gPCs"], run=True
)
results_jaxqtl_trans
[2025-12-29 03:04:00,048] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:04:00,048] INFO:cellink._core.donordata: Observation found for 100 donors.
Writing FAM... done.
Writing BIM... done.
chrom snp pos a0 a1 i phenotype_id tss tss_distance af ma_count pval_nominal slope slope_se converged alpha
0 22 22_16388891_G_A 16388891 G A 0 ENSG00000177663 17084955.0 -696064.0 0.945 11.0 0.296205 -0.021325 0.020414 1.0 0.0
1 22 22_16388968_C_T 16388968 C T 1 ENSG00000177663 17084955.0 -695987.0 0.945 11.0 0.296205 -0.021325 0.020414 1.0 0.0
2 22 22_16389525_A_G 16389525 A G 2 ENSG00000177663 17084955.0 -695430.0 0.945 11.0 0.296205 -0.021325 0.020414 1.0 0.0
3 22 22_16390411_G_A 16390411 G A 3 ENSG00000177663 17084955.0 -694544.0 0.945 11.0 0.296205 -0.021325 0.020414 1.0 0.0
4 22 22_16391555_G_C 16391555 G C 4 ENSG00000177663 17084955.0 -693400.0 0.945 11.0 0.296205 -0.021325 0.020414 1.0 0.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2105019 22 22_17583056_C_CA 17583056 C CA 4272 ENSG00000079974 50767502.0 -33184446.0 0.830 34.0 0.504663 0.009504 0.014245 1.0 0.0
2105020 22 22_17583078_A_G 17583078 A G 4273 ENSG00000079974 50767502.0 -33184424.0 0.830 34.0 0.504663 0.009504 0.014245 1.0 0.0
2105021 22 22_17584465_GAGAGAGAA_G 17584465 GAGAGAGAA G 4274 ENSG00000079974 50767502.0 -33183037.0 0.995 1.0 0.761113 -0.022471 0.073914 1.0 0.0
2105022 22 22_17584467_GAGAGAA_G 17584467 GAGAGAA G 4275 ENSG00000079974 50767502.0 -33183035.0 0.970 6.0 0.649968 -0.014370 0.031666 1.0 0.0
2105023 22 22_17584471_GAA_G 17584471 GAA G 4276 ENSG00000079974 50767502.0 -33183031.0 0.980 4.0 0.336810 -0.033426 0.034802 1.0 0.0

2105024 rows × 16 columns

TensorQTL Analysis#

TensorQTL provides comprehensive QTL mapping capabilities with support for various analysis modes and statistical approaches. TensorQTL can be run in various modes: ["cis_nominal", "cis_independent", "cis", "trans", "cis_susie", "trans_susie"].

Basic Cis-QTL Analysis with TensorQTL#

The basic cis mode in TensorQTL performs permutation-based cis-QTL mapping, similar to JaxQTL but with TensorFlow-based GPU acceleration.

# Basic cis-QTL mapping
results_tensorqtl_cis = run_tensorqtl(
    dd,
    prefix="tensorqtl_cis",
    mode="cis",
    window=cis_window,
    additional_covariates=["gPCs"],
    permutations=10000,
    run=True,
)
results_tensorqtl_cis
[2025-12-29 03:06:59,460] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:06:59,461] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:06:59,715] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cis.log.
Options in effect:
  --bfile tensorqtl_cis
  --make-pgen
  --out tensorqtl_cis

Start time: Mon Dec 29 03:06:59 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cis.fam.
4277 variants loaded from tensorqtl_cis.bim.
Note: No phenotype data present.
Writing tensorqtl_cis.psam ... done.
Writing tensorqtl_cis.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cis.pgen ... done.
End time: Mon Dec 29 03:06:59 2025
[Dec 29 03:07:06] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_cis_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_cis_donor_features.tsv)
  * loading genotypes
cis-QTL mapping: empirical p-values for phenotypes
  * 100 samples
  * 871 phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
    ** dropping 375 constant phenotypes
  * checking phenotypes: 496/496
    ** dropping 480 phenotypes without variants in cis-window
  * computing permutations
    processing phenotype 8/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 9/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 10/16    * WARNING: excluding 32 monomorphic variants
    processing phenotype 11/16    * WARNING: excluding 31 monomorphic variants
    processing phenotype 12/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 13/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 14/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 15/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 16/16
    * WARNING: excluding 10 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
  Time elapsed: 0.04 min
done.
  * writing output
Computing q-values
  * Number of phenotypes tested: 16
  * Correlation between Beta-approximated and empirical p-values: nan
  * Proportion of significant phenotypes (1-pi0): 0.00
  * QTL phenotypes @ FDR 0.05: 0
[Dec 29 03:07:09] Finished mapping
phenotype_id num_var beta_shape1 beta_shape2 true_df pval_true_df variant_id start_distance end_distance ma_samples ma_count af pval_nominal slope slope_se pval_perm pval_beta qval
0 ENSG00000177663 4161 0.997280 407.17500 61.42510 7.592840e-04 22_17404016_TTGGGAGATG_T 319061 288323 71 87 0.565 1.965230e-04 0.031859 0.008135 0.273173 0.267237 0.728300
1 ENSG00000069998 4114 1.015660 138.48000 38.93290 3.588760e-03 22_17524093_G_A 386581 358806 3 3 0.985 5.029320e-05 -0.174374 0.040528 0.382562 0.384575 0.769149
2 ENSG00000185837 4096 0.797147 12.47490 9.12607 4.709830e-03 22_17031458_ATT_A -127881 -133993 1 1 0.995 1.200610e-16 -0.061286 0.005758 0.104890 0.108845 0.728300
3 ENSG00000093072 4081 0.926989 47.02260 25.15310 4.476540e-03 22_17560607_G_GCTCTCC 381816 302372 3 3 0.985 7.816000e-07 -0.199937 0.037089 0.259374 0.220129 0.728300
4 ENSG00000182902 2647 NaN NaN 75.00000 NaN 22_17577481_A_C 14030 -13514 1 1 0.995 NaN -0.012346 NaN 0.052395 NaN NaN
5 ENSG00000236754 2608 NaN NaN 75.00000 4.741170e-08 22_17239813_C_T -339574 -349379 7 7 0.965 4.741170e-08 -0.001256 0.000207 0.787621 NaN NaN
6 ENSG00000131100 2563 1.003070 119.93700 43.16060 8.749770e-03 22_17360965_T_C -231172 -267784 30 30 0.850 5.315510e-04 -0.088390 0.024414 0.652235 0.650188 0.962637
7 ENSG00000099968 2407 0.763298 18.40660 19.73950 9.908240e-02 22_17219920_C_T -408936 -510935 49 59 0.295 1.175950e-03 0.039427 0.011686 0.909409 0.903363 0.962637
8 ENSG00000015475 1724 1.038670 146.92700 55.16530 2.263510e-02 22_17345140_C_T -388999 -429630 2 2 0.990 7.788650e-03 -0.137268 0.050196 0.961504 0.962637 0.962637
9 ENSG00000269220 1525 0.664014 9.55225 16.92340 7.565680e-03 22_17351364_TTC_T -425954 -429155 2 2 0.990 1.309410e-08 -0.144193 0.022597 0.244576 0.186541 0.728300
10 ENSG00000243156 1494 0.898013 14.83950 18.30200 1.113210e-01 22_17566826_C_G -220824 -457735 3 3 0.985 1.128620e-03 -0.041466 0.012243 0.830717 0.851997 0.962637
11 ENSG00000286990 1477 0.528184 4.81984 8.79398 2.015620e-02 22_17435684_T_TAA -365039 -366351 3 4 0.980 3.720230e-12 -0.020749 0.002510 0.239476 0.312129 0.728300
12 ENSG00000235445 490 1.000000 3.85027 6.99940 3.334120e-01 22_17523014_A_AGGGG -460192 -460605 30 32 0.840 1.080330e-03 -0.000655 0.000193 0.823418 0.790201 0.962637
13 ENSG00000234913 393 0.459678 4.05765 14.42780 9.079320e-03 22_17560607_G_GCTCTCC -443664 -446701 3 3 0.985 1.652040e-09 -0.015233 0.002218 0.133187 0.238033 0.728300
14 ENSG00000225335 33 0.206226 1.35911 41.19860 5.304390e-01 22_17575315_G_A -499824 -503569 4 4 0.980 3.960240e-01 -0.012890 0.015100 0.975702 0.922256 0.962637
15 ENSG00000215193 17 0.575820 4.20176 58.76220 1.249960e-01 22_17578563_A_G -499361 -526833 54 65 0.675 8.277940e-02 0.016903 0.009613 0.572643 0.651453 0.962637

Nominal Cis-QTL Analysis#

Nominal mode tests all variant-gene pairs within the specified window and returns comprehensive association statistics. This mode can also output top associations per gene and significant pairs based on a p-value threshold.

# Nominal cis-QTL analysis (all variant-gene pairs)
results_tensorqtl_nominal = run_tensorqtl(
    dd,
    prefix="tensorqtl_nominal",
    mode="cis_nominal",
    window=cis_window,
    pval_threshold=1e-5,
    additional_covariates=["gPCs"],
    batch_size=20000,
    run=True,
)

# Results contain multiple outputs for nominal mode
cis_qtl_pairs, cis_qtl_signif_pairs, cis_qtl_top_assoc = results_tensorqtl_nominal
print(f"Total variant-gene pairs tested: {len(cis_qtl_pairs)}")
print(f"Significant pairs (p < {1e-5}): {len(cis_qtl_signif_pairs) if cis_qtl_signif_pairs is not None else 0}")
cis_qtl_pairs
[2025-12-29 03:07:09,832] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:09,834] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:10,153] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_nominal.log.
Options in effect:
  --bfile tensorqtl_nominal
  --make-pgen
  --out tensorqtl_nominal

Start time: Mon Dec 29 03:07:10 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_nominal.fam.
4277 variants loaded from tensorqtl_nominal.bim.
Note: No phenotype data present.
Writing tensorqtl_nominal.psam ... done.
Writing tensorqtl_nominal.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_nominal.pgen ... done.
End time: Mon Dec 29 03:07:10 2025
[Dec 29 03:07:14] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_nominal_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_nominal_donor_features.tsv)
  * loading genotypes
cis-QTL mapping: nominal associations for all variant-phenotype pairs
  * 100 samples
  * 871 phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
    ** dropping 375 constant phenotypes
  * checking phenotypes: 496/496
    ** dropping 480 phenotypes without variants in cis-window
  * Computing associations
    Mapping chromosome 22
    processing phenotype 16/16
    time elapsed: 0.00 min
    * writing output
done.
[Dec 29 03:07:14] Finished mapping
Total variant-gene pairs tested: 34114
Significant pairs (p < 1e-05): 0
phenotype_id variant_id start_distance end_distance af ma_samples ma_count pval_nominal slope slope_se
0 ENSG00000177663 22_16585130_T_C -499825 -530563 0.980 4 4 0.339680 0.032829 0.034164
1 ENSG00000177663 22_16585144_G_A -499811 -530549 0.905 17 19 0.402192 0.011152 0.013236
2 ENSG00000177663 22_16585510_C_G -499445 -530183 0.500 71 100 0.109400 -0.012162 0.007507
3 ENSG00000177663 22_16585603_G_A -499352 -530090 0.740 44 52 0.395744 0.007709 0.009025
4 ENSG00000177663 22_16585810_T_C -499145 -529883 0.730 43 54 0.478267 0.006274 0.008803
... ... ... ... ... ... ... ... ... ... ...
34109 ENSG00000215193 22_17583056_C_CA -494868 -522340 0.830 30 34 0.274086 0.013370 0.012135
34110 ENSG00000215193 22_17583078_A_G -494846 -522318 0.830 30 34 0.274086 0.013370 0.012135
34111 ENSG00000215193 22_17584465_GAGAGAGAA_G -493459 -520931 0.995 1 1 0.649537 -0.028478 0.062419
34112 ENSG00000215193 22_17584467_GAGAGAA_G -493457 -520929 0.970 6 6 0.833054 0.005817 0.027500
34113 ENSG00000215193 22_17584471_GAA_G -493453 -520925 0.980 4 4 0.093625 0.049907 0.029389

34114 rows × 10 columns

Cis-Independent QTL Analysis#

The cis_independent mode identifies multiple independent association signals for each gene by performing conditional analysis. This is crucial for understanding the genetic architecture when multiple causal variants affect the same gene.

# First run basic cis mode to get top associations
results_tensorqtl_cis = run_tensorqtl(
    dd,
    prefix="tensorqtl_cis",
    mode="cis",
    window=cis_window,
    additional_covariates=["gPCs"],
    permutations=10000,
    fdr=0.9,  # Very permissive for demo
    run=True,
)

# Now identify independent signals conditioning on the top hits
results_tensorqtl_independent = run_tensorqtl(
    dd,
    prefix="tensorqtl_independent",
    mode="cis_independent",
    window=cis_window,
    cis_output="tensorqtl_cis.cis_qtl.txt.gz",  # Use output from cis mode
    additional_covariates=["gPCs"],
    fdr=0.9,  # Very permissive for demo
    pval_threshold=0.1,  # Very permissive for demo
    run=True,
)

print(f"Independent signals identified: {len(results_tensorqtl_independent)}")
results_tensorqtl_independent
[2025-12-29 03:07:14,748] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:14,749] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:14,916] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cis.log.
Options in effect:
  --bfile tensorqtl_cis
  --make-pgen
  --out tensorqtl_cis

Start time: Mon Dec 29 03:07:15 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cis.fam.
4277 variants loaded from tensorqtl_cis.bim.
Note: No phenotype data present.
Writing tensorqtl_cis.psam ... done.
Writing tensorqtl_cis.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cis.pgen ... done.
End time: Mon Dec 29 03:07:15 2025
[Dec 29 03:07:19] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_cis_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_cis_donor_features.tsv)
  * loading genotypes
cis-QTL mapping: empirical p-values for phenotypes
  * 100 samples
  * 871 phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
    ** dropping 375 constant phenotypes
  * checking phenotypes: 496/496
    ** dropping 480 phenotypes without variants in cis-window
  * computing permutations
    processing phenotype 8/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 9/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 10/16    * WARNING: excluding 32 monomorphic variants
    processing phenotype 11/16    * WARNING: excluding 31 monomorphic variants
    processing phenotype 12/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 13/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 14/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 15/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 16/16
    * WARNING: excluding 10 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
  Time elapsed: 0.04 min
done.
  * writing output
Computing q-values
  * Number of phenotypes tested: 16
  * Correlation between Beta-approximated and empirical p-values: nan
  * Proportion of significant phenotypes (1-pi0): 0.00
  * QTL phenotypes @ FDR 0.90: 7
  * min p-value threshold @ FDR 0.9: 0.518721
[Dec 29 03:07:22] Finished mapping
[2025-12-29 03:07:22,502] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:22,502] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:22,668] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_independent.log.
Options in effect:
  --bfile tensorqtl_independent
  --make-pgen
  --out tensorqtl_independent

Start time: Mon Dec 29 03:07:22 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_independent.fam.
4277 variants loaded from tensorqtl_independent.bim.
Note: No phenotype data present.
Writing tensorqtl_independent.psam ... done.
Writing tensorqtl_independent.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_independent.pgen ... done.
End time: Mon Dec 29 03:07:22 2025
[Dec 29 03:07:26] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_independent_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_independent_donor_features.tsv)
  * loading genotypes
cis-QTL mapping: conditionally independent variants
  * 100 samples
  * 7/16 significant phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
  * checking phenotypes: 7/7
  * computing independent QTLs
    processing phenotype 7/7
  Time elapsed: 0.04 min
done.
  * writing output
[Dec 29 03:07:29] Finished mapping
Independent signals identified: 9
phenotype_id num_var beta_shape1 beta_shape2 true_df pval_true_df variant_id start_distance end_distance ma_samples ma_count af pval_nominal slope slope_se pval_perm pval_beta rank
0 ENSG00000177663 4194 1.015110 423.35200 60.11670 0.000004 22_17404016_TTGGGAGATG_T 319061 288323 71 87 0.565 3.452950e-07 0.037838 0.006747 0.001800 0.001449 1
1 ENSG00000177663 4194 1.003600 440.10500 60.74600 0.000062 22_17546485_C_CA 461530 430792 2 2 0.990 1.124440e-05 -0.161322 0.034204 0.023898 0.026704 2
2 ENSG00000177663 4194 1.008040 429.60400 60.42130 0.000533 22_17214059_A_G 129104 98366 27 28 0.860 1.389950e-04 0.039808 0.009897 0.200780 0.201398 3
3 ENSG00000069998 4114 1.039220 127.93900 37.89910 0.004069 22_17524093_G_A 386581 358806 3 3 0.985 5.029320e-05 -0.174374 0.040528 0.390561 0.387522 1
4 ENSG00000185837 4096 0.773050 13.02420 9.41844 0.004061 22_17031458_ATT_A -127881 -133993 1 1 0.995 1.200610e-16 -0.061286 0.005758 0.104590 0.108415 1
5 ENSG00000093072 4081 0.938737 45.70470 24.66020 0.004894 22_17560607_G_GCTCTCC 381816 302372 3 3 0.985 7.816000e-07 -0.199937 0.037089 0.265473 0.226310 1
6 ENSG00000269220 1525 0.655935 9.81804 17.30170 0.006905 22_17351364_TTC_T -425954 -429155 2 2 0.990 1.309410e-08 -0.144193 0.022597 0.237676 0.183322 1
7 ENSG00000286990 1477 0.522711 4.71656 8.67813 0.021041 22_17435684_T_TAA -365039 -366351 3 4 0.980 3.720230e-12 -0.020749 0.002510 0.247375 0.319513 1
8 ENSG00000234913 393 0.473676 4.08297 14.34160 0.009296 22_17560607_G_GCTCTCC -443664 -446701 3 3 0.985 1.652040e-09 -0.015233 0.002218 0.133487 0.230412 1

Trans-QTL Analysis with TensorQTL#

Trans-QTL mapping with TensorQTL tests associations between variants and distant genes. Due to the large number of tests, stricter p-value thresholds are typically required.

# Trans-QTL mapping
results_tensorqtl_trans = run_tensorqtl(
    dd,
    prefix="tensorqtl_trans",
    mode="trans",
    pval_threshold=1e-6,  # Stricter threshold for trans
    additional_covariates=["gPCs"],
    batch_size=10000,  # Smaller batches for trans analysis
    return_r2=True,  # Include R² statistics
    run=True,
)
results_tensorqtl_trans
[2025-12-29 03:07:29,508] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:29,508] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:29,677] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_trans.log.
Options in effect:
  --bfile tensorqtl_trans
  --make-pgen
  --out tensorqtl_trans

Start time: Mon Dec 29 03:07:29 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_trans.fam.
4277 variants loaded from tensorqtl_trans.bim.
Note: No phenotype data present.
Writing tensorqtl_trans.psam ... done.
Writing tensorqtl_trans.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_trans.pgen ... done.
End time: Mon Dec 29 03:07:29 2025
[Dec 29 03:07:34] Running TensorQTL v1.0.10: trans-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_trans_phenotype.bed.gz)
  * reading covariates (tensorqtl_trans_donor_features.tsv)
  * loading genotypes
  * p-value threshold: 1e-06
trans-QTL mapping
  * 100 samples
  * 871 phenotypes
  * 23 covariates
  * 4277 variants
    processing batch 1/1
    elapsed time: 0.00 min
done.
  * filtering out cis-QTLs (within +/-5Mb)
  * writing sparse output
[Dec 29 03:07:34] Finished mapping
variant_id phenotype_id pval b b_se r2 af
0 22_16393930_G_T ENSG00000273350 2.124748e-07 -0.002581 0.000452 0.303167 0.980
1 22_16393939_G_C ENSG00000273350 2.124748e-07 -0.002581 0.000452 0.303167 0.980
3 22_16414733_G_A ENSG00000133460 6.571755e-08 -0.264812 0.044160 0.324078 0.985
4 22_16414733_G_A ENSG00000099999 1.474357e-07 -0.062907 0.010844 0.309742 0.985
5 22_16414733_G_A ENSG00000273076 1.771229e-07 -0.062787 0.010907 0.306448 0.985
... ... ... ... ... ... ... ...
2475 22_17582333_G_A ENSG00000128165 7.672292e-07 -0.053877 0.009986 0.279612 0.995
2477 22_17582867_C_T ENSG00000100156 1.227405e-09 -0.002649 0.000382 0.390730 0.985
2478 22_17582867_C_T ENSG00000286381 2.763245e-07 -0.054595 0.009666 0.298404 0.985
2480 22_17584465_GAGAGAGAA_G ENSG00000128165 7.672292e-07 -0.053877 0.009986 0.279612 0.995
2481 22_17584471_GAA_G ENSG00000213888 8.686729e-08 -0.039012 0.006579 0.319159 0.980

2139 rows × 7 columns

SuSiE Finemapping with TensorQTL#

SuSiE (Sum of Single Effects) is a powerful finemapping method that identifies credible sets of causal variants for each independent association signal. It accounts for LD structure and provides posterior probabilities for each variant being causal.

Cis-SuSiE Finemapping#

Cis-SuSiE performs finemapping on the top cis-QTL signals to identify credible sets of potentially causal variants. This requires the output from a previous cis-QTL run.

# First, run standard cis-QTL analysis if not already done
results_tensorqtl_cis = run_tensorqtl(
    dd,
    prefix="tensorqtl_cis_for_susie",
    mode="cis",
    window=cis_window,
    additional_covariates=["gPCs"],
    permutations=10000,
    run=True,
)

# Perform SuSiE finemapping on cis-QTL signals
results_tensorqtl_cis_susie = run_tensorqtl(
    dd,
    prefix="tensorqtl_cis_susie",
    mode="cis_susie",
    window=cis_window,
    cis_output="tensorqtl_cis_for_susie.cis_qtl.txt.gz",  # Input from cis mode
    additional_covariates=["gPCs"],
    max_effects=10,  # Maximum number of causal effects per gene
    fdr=0.9,  # Very permissive for demo, FDR threshold for selecting genes to finemapping
    run=True,
)

# Results contain SuSiE model objects and summary statistics
susie_models, susie_summary = results_tensorqtl_cis_susie

print(f"Genes with finemapping results: {len(susie_models)}")
print("\nSummary statistics columns:")
print(susie_summary.columns.tolist())
print(f"\nTotal credible sets identified: {len(susie_summary)}")

# Examine credible sets for a specific gene
if len(susie_summary) > 0:
    example_gene = susie_summary["phenotype_id"].iloc[0]
    gene_credible_sets = susie_summary[susie_summary["phenotype_id"] == example_gene]
    print(f"\nCredible sets for {example_gene}:")
    print(gene_credible_sets)
[2025-12-29 03:07:34,596] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:34,596] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:34,861] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cis_for_susie.log.
Options in effect:
  --bfile tensorqtl_cis_for_susie
  --make-pgen
  --out tensorqtl_cis_for_susie

Start time: Mon Dec 29 03:07:34 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cis_for_susie.fam.
4277 variants loaded from tensorqtl_cis_for_susie.bim.
Note: No phenotype data present.
Writing tensorqtl_cis_for_susie.psam ... done.
Writing tensorqtl_cis_for_susie.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cis_for_susie.pgen ... done.
End time: Mon Dec 29 03:07:34 2025
[Dec 29 03:07:38] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_cis_for_susie_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_cis_for_susie_donor_features.tsv)
  * loading genotypes
cis-QTL mapping: empirical p-values for phenotypes
  * 100 samples
  * 871 phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
    ** dropping 375 constant phenotypes
  * checking phenotypes: 496/496
    ** dropping 480 phenotypes without variants in cis-window
  * computing permutations
    processing phenotype 8/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 9/16    * WARNING: excluding 33 monomorphic variants
    processing phenotype 10/16    * WARNING: excluding 32 monomorphic variants
    processing phenotype 11/16    * WARNING: excluding 31 monomorphic variants
    processing phenotype 12/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 13/16    * WARNING: excluding 30 monomorphic variants
WARNING: scipy.optimize.newton failed to converge (running scipy.optimize.minimize)
    processing phenotype 14/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 15/16    * WARNING: excluding 30 monomorphic variants
    processing phenotype 16/16
    * WARNING: excluding 10 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 7 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
    * WARNING: excluding 2 monomorphic variants
  Time elapsed: 0.04 min
done.
  * writing output
Computing q-values
  * Number of phenotypes tested: 16
  * Correlation between Beta-approximated and empirical p-values: nan
  * Proportion of significant phenotypes (1-pi0): 0.00
  * QTL phenotypes @ FDR 0.05: 0
[Dec 29 03:07:41] Finished mapping
[2025-12-29 03:07:42,336] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:07:42,336] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:07:42,499] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cis_susie.log.
Options in effect:
  --bfile tensorqtl_cis_susie
  --make-pgen
  --out tensorqtl_cis_susie

Start time: Mon Dec 29 03:07:42 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cis_susie.fam.
4277 variants loaded from tensorqtl_cis_susie.bim.
Note: No phenotype data present.
Writing tensorqtl_cis_susie.psam ... done.
Writing tensorqtl_cis_susie.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cis_susie.pgen ... done.
End time: Mon Dec 29 03:07:42 2025
[Dec 29 03:07:46] Running TensorQTL v1.0.10: cis-QTL mapping
  * WARNING: using CPU!
  * reading phenotypes (tensorqtl_cis_susie_phenotype.bed.gz)
  * cis-window detected as [start - 500,000, end + 500,000]
  * reading covariates (tensorqtl_cis_susie_donor_features.tsv)
  * loading genotypes
SuSiE fine-mapping
  * 100 samples
  * 7 phenotypes
  * 23 covariates
  * 4277 variants
  * cis-window: ±500,000
  * checking phenotypes: 7/7
  * fine-mapping
    processing phenotype 7/7
  Time elapsed: 0.01 min
done.
[Dec 29 03:07:47] Finished mapping
Genes with finemapping results: 5

Summary statistics columns:
['phenotype_id', 'variant_id', 'pip', 'af', 'cs_id']

Total credible sets identified: 12

Credible sets for ENSG00000185837:
      phenotype_id         variant_id  pip     af cs_id
0  ENSG00000185837  22_17031458_ATT_A  1.0  0.995     1

Advanced Usage: Dry Run and Command Generation#

Both JaxQTL and TensorQTL support generating commands without execution, which is useful for debugging or running on compute clusters. This is controlled by the run argument.

JaxQTL Command Generation#

# Generate JaxQTL command without running
jaxqtl_command = run_jaxqtl(
    dd,
    prefix="jaxqtl_cluster",
    mode="cis",
    additional_covariates=["gPCs"],
    run=False,  # Don't execute, just return command
)

print("JaxQTL command:")
print(jaxqtl_command)
[2025-12-29 03:05:56,865] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:05:56,866] INFO:cellink._core.donordata: Observation found for 100 donors.
Writing FAM... done.
Writing BIM... done.
JaxQTL command:
jaxqtl --geno jaxqtl_cluster --covar jaxqtl_cluster_donor_features.tsv --pheno jaxqtl_cluster_phenotype.bed.gz --model NB --mode cis --test-method score --window 500000 --nperm 1000 --addpc 2 --out jaxqtl_cluster --standardize

TensorQTL Command Generation#

# Generate TensorQTL command without running
tensorqtl_command = run_tensorqtl(
    dd,
    prefix="tensorqtl_cluster",
    mode="cis",
    additional_covariates=["gPCs"],
    run=False,  # Don't execute, just return command
)

print("\nTensorQTL command:")
print(tensorqtl_command)
[2025-12-29 03:08:06,239] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:08:06,240] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:08:06,429] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cluster.log.
Options in effect:
  --bfile tensorqtl_cluster
  --make-pgen
  --out tensorqtl_cluster

Start time: Mon Dec 29 03:08:06 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cluster.fam.
4277 variants loaded from tensorqtl_cluster.bim.
Note: No phenotype data present.
Writing tensorqtl_cluster.psam ... done.
Writing tensorqtl_cluster.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cluster.pgen ... done.
End time: Mon Dec 29 03:08:06 2025

TensorQTL command:
python -m tensorqtl tensorqtl_cluster tensorqtl_cluster_phenotype.bed.gz tensorqtl_cluster --covariates tensorqtl_cluster_donor_features.tsv --mode cis --permutations 10000 --window 1000000 --pval_threshold 1e-05 --maf_threshold 0 --maf_threshold_interaction 0.05 --batch_size 20000 --warn_monomorphic --max_effects 10 --fdr 0.05

Saving Commands to Files#

You can also save the generated commands to files for batch submission to compute clusters:

# Save JaxQTL command to file
jaxqtl_command = run_jaxqtl(
    dd,
    prefix="jaxqtl_cluster",
    mode="cis_acat",
    model="NB",
    window=cis_window,
    additional_covariates=["gPCs"],
    run=False,
    save_cmd_file="jaxqtl_job.sh",
)
[2025-12-29 03:06:08,165] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:06:08,166] INFO:cellink._core.donordata: Observation found for 100 donors.
Writing FAM... done.
Writing BIM... done.
# Save TensorQTL command to file
tensorqtl_command = run_tensorqtl(
    dd,
    prefix="tensorqtl_cluster",
    mode="cis_susie",
    cis_output="previous_cis_results.txt.gz",
    additional_covariates=["gPCs"],
    run=False,
    save_cmd_file="tensorqtl_susie_job.sh",
)
[2025-12-29 03:08:11,429] INFO:cellink._core.donordata: Aggregated X to PB
[2025-12-29 03:08:11,431] INFO:cellink._core.donordata: Observation found for 100 donors.
[2025-12-29 03:08:11,735] INFO:cellink.tl.external._tensorqtl: Performing z-normalization of age.
Writing FAM... done.
Writing BIM... done.
PLINK v2.0.0-a.6.9 64-bit (29 Jan 2025)            cog-genomics.org/plink/2.0/
(C) 2005-2025 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to tensorqtl_cluster.log.
Options in effect:
  --bfile tensorqtl_cluster
  --make-pgen
  --out tensorqtl_cluster

Start time: Mon Dec 29 03:08:11 2025
24576 MiB RAM detected; reserving 12288 MiB for main workspace.
Using up to 8 compute threads.
100 samples (0 females, 40 males, 60 ambiguous; 100 founders) loaded from
tensorqtl_cluster.fam.
4277 variants loaded from tensorqtl_cluster.bim.
Note: No phenotype data present.
Writing tensorqtl_cluster.psam ... done.
Writing tensorqtl_cluster.pvar ... 1010111112121313141415151616171718181919202021212222232324242525262627272828292930303131323233333434353536363737383839394040414142424343444445454646474748484949505051515252535354545555565657575858595960606161626263636464656566666767686869697070717172727373747475757676777778787979808081818282838384848585868687878888898990909191929293939494959596969797989899done.
Writing tensorqtl_cluster.pgen ... done.
End time: Mon Dec 29 03:08:11 2025

previous

Tutorial: Pseudobulk eQTL Analysis with cellink

next

Tutorial: Annotating Genetic Variants with cellink

Contents
  • Environment Setup
  • Load and Prepare Data
  • Data Preprocessing
  • JaxQTL Analysis
    • Running JaxQTL with Default Parameters
    • Advanced JaxQTL Analysis with Custom Parameters
    • JaxQTL Trans-QTL Analysis
  • TensorQTL Analysis
    • Basic Cis-QTL Analysis with TensorQTL
    • Nominal Cis-QTL Analysis
    • Cis-Independent QTL Analysis
    • Trans-QTL Analysis with TensorQTL
    • SuSiE Finemapping with TensorQTL
      • Cis-SuSiE Finemapping
  • Advanced Usage: Dry Run and Command Generation
    • JaxQTL Command Generation
    • TensorQTL Command Generation
      • Saving Commands to Files

By Jan Engelmann, Lucas Arnoldt, Eva Holtkamp

© Copyright 2026, Theislab..