Single-cell Genetics Package (Cellink)#
Welcome to the official documentation for cellink—the toolkit designed to bridge the gap between single-cell data and individual-level genetic analysis.
Motivation#
Integrating genetic data with cellular heterogeneity is crucial for advancing personalized medicine. cellink provides the missing framework for efficiently handling and analyzing genetic variation alongside complex single-cell omics data at scale.
✨ Key Features & Structure#
cellink introduces the DonorData class, unifying individual-level and single-cell data. It extends standard formats (AnnData, MuData) with GenoAnnData for efficient genotype (via dask) and phenotype (via ehrapy) handling.

Donor-level Data (G): GenoAnnData. Stores individual-level data such as genotypes.
Cell-level Data (C): AnnData/MuData. Stores single-cell omics data such as gene expression.
Crucially, DonorData ensures that genetic data and single-cell modalities remain synchronized, preserving their donor-cell pairing even through complex filtering operations (e.g., selecting specific cell types or patient subsets).
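The synchronization idea can be illustrated with a small, standard-library-only sketch. ToyDonorData, its fields, and its methods are hypothetical names invented for this illustration; they are not cellink's DonorData API, and the donor and cell records are made up. The sketch only shows the invariant described above: after any filter, the donor-level and cell-level sides refer to the same set of donors.

```python
from dataclasses import dataclass


@dataclass
class ToyDonorData:
    """Hypothetical stand-in for a donor/cell container (not cellink's class)."""

    genotypes: dict  # donor id -> list of genotype dosages (donor-level data)
    cells: list      # one dict per cell, each with a "donor" key (cell-level data)

    def __post_init__(self):
        self._sync()

    def _sync(self):
        # Keep only donors present on both sides, so the pairing stays intact.
        donors_with_cells = {c["donor"] for c in self.cells}
        shared = donors_with_cells.intersection(self.genotypes)
        self.genotypes = {d: g for d, g in self.genotypes.items() if d in shared}
        self.cells = [c for c in self.cells if c["donor"] in shared]

    def filter_cells(self, predicate):
        # Filtering cells may remove every cell of a donor; the new object
        # re-synchronizes and drops that donor's genotypes as well.
        kept = [c for c in self.cells if predicate(c)]
        return ToyDonorData(dict(self.genotypes), kept)

    def filter_donors(self, donor_ids):
        # Filtering donors drops the cells that belong to removed donors.
        keep = set(donor_ids)
        kept = {d: g for d, g in self.genotypes.items() if d in keep}
        return ToyDonorData(kept, list(self.cells))


# Made-up example data: D3 has no cells, and cell c4 has no genotyped donor.
dd = ToyDonorData(
    genotypes={"D1": [0, 1, 2], "D2": [1, 1, 0], "D3": [2, 0, 0]},
    cells=[
        {"id": "c1", "donor": "D1", "cell_type": "T cell"},
        {"id": "c2", "donor": "D1", "cell_type": "B cell"},
        {"id": "c3", "donor": "D2", "cell_type": "B cell"},
        {"id": "c4", "donor": "D4", "cell_type": "T cell"},
    ],
)

t_cells = dd.filter_cells(lambda c: c["cell_type"] == "T cell")
only_d2 = dd.filter_donors(["D2"])
```

In this toy example, selecting T cells leaves only donor D1 on both sides, and selecting donor D2 leaves only cell c3. The real DonorData class works on GenoAnnData and AnnData/MuData objects rather than plain dictionaries; see the API Reference for its actual interface.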
2. Comprehensive Toolkit#
cellink offers a streamlined suite of tools for the entire analysis workflow:
Variant Preprocessing & Annotation: Tools for quality control, annotation (VCF export/import), and selection of genetic variants.
Specialized Downstream Analysis: Easily perform complex genetic analyses on single-cell expression data, including:
Interoperability: cellink fits into standard workflows by exporting data in formats compatible with common genetic analysis tools (e.g., jaxqtl, tensorqtl, or SAIGEQTL for eQTL analysis), and it includes built-in dataloaders for deep learning.
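As one illustration of the kind of variant quality control listed above, the following standard-library sketch filters variants by minor allele frequency (MAF). This is a generic, commonly used QC step, not cellink's implementation: the function names, the 0.05 threshold, and the variant IDs are assumptions made for the example.

```python
def minor_allele_frequency(dosages):
    """MAF of one biallelic variant from 0/1/2 alt-allele dosages.

    Missing calls are given as None and are ignored.
    """
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0
    # Each diploid donor contributes two alleles.
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def filter_variants(genotypes, min_maf=0.05):
    """Keep variants whose MAF is at least min_maf (hypothetical threshold)."""
    return {
        variant: dosages
        for variant, dosages in genotypes.items()
        if minor_allele_frequency(dosages) >= min_maf
    }


# Made-up dosages for four donors at three hypothetical variants.
genotypes = {
    "rs1": [0, 1, 2, 1],     # polymorphic
    "rs2": [0, 0, 0, 0],     # monomorphic for the reference allele
    "rs3": [2, 2, None, 2],  # monomorphic for the alternate allele, one missing call
}

kept = filter_variants(genotypes)
```

Here only rs1 passes the filter, because the other two variants show no variation among the called donors.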
🚀 Getting Started#
Check out the Tutorials section for step-by-step guides on analysis workflows.
Explore the API Reference for detailed documentation.
Install the latest development version directly from GitHub:
pip install git+https://github.com/theislab/cellink.git@main
Contact#
If you find a bug, please report it via the issue tracker.