cellink.tl.external.run_seismic
- cellink.tl.external.run_seismic(adata, magma_file, cell_type_col, n_pcs=50, species='human', min_genes=250, min_cells=50, influential_genes=False, influential_cell_types=None, top_n_associations=20, prefix=None, save_results=True, plot_associations=True, plot_influential=True)
Run seismic analysis to link cell types with GWAS traits using single-cell data.
seismic (Single-cEll dIsease-relevance statIstical testing via Multi-resolution Cell-type specificity) identifies cell type-trait associations and influential genes driving these associations.
- Parameters:
adata (AnnData) – AnnData object containing single-cell expression data.
magma_file (str or Path) – Path to MAGMA gene-level summary statistics file.
cell_type_col (str) – Column name in adata.obs containing cell type annotations.
n_pcs (int, default=50) – Number of principal components to compute if not already present.
species ({'human', 'mouse'}, default='human') – Species of the single-cell data.
min_genes (int, default=250) – Minimum number of expressed genes a cell must have to pass cell filtering.
min_cells (int, default=50) – Minimum number of cells a gene must be expressed in to pass gene filtering.
influential_genes (bool, default=False) – Whether to compute influential gene analysis for significant cell type-trait pairs.
influential_cell_types (list of str, optional) – Specific cell types to analyze for influential genes. If None and influential_genes=True, analyzes all significant cell types.
top_n_associations (int, default=20) – Number of top associations to plot.
prefix (str, optional) – Prefix for output files. If None, “seismic” is used.
save_results (bool, default=True) – Whether to save results to files.
plot_associations (bool, default=True) – Whether to plot top associations.
plot_influential (bool, default=True) – Whether to plot influential genes (if computed).
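The effect of the min_genes and min_cells thresholds can be illustrated with a small, pure-Python sketch. This is not the code run_seismic executes; the helper name filter_counts, the inclusive thresholds, and the cells-before-genes order are assumptions made for illustration only.

```python
def filter_counts(counts, min_genes=250, min_cells=50):
    """Filter a cells x genes count matrix given as a list of rows.

    Assumptions (not taken from the cellink source): thresholds are
    inclusive, and cells are filtered before genes.
    """
    # Keep cells that express at least `min_genes` genes.
    kept_cells = [
        i for i, row in enumerate(counts)
        if sum(1 for v in row if v > 0) >= min_genes
    ]
    n_genes = len(counts[0]) if counts else 0
    # Among the remaining cells, keep genes seen in at least `min_cells` cells.
    kept_genes = [
        j for j in range(n_genes)
        if sum(1 for i in kept_cells if counts[i][j] > 0) >= min_cells
    ]
    filtered = [[counts[i][j] for j in kept_genes] for i in kept_cells]
    return filtered, kept_cells, kept_genes


# Toy matrix: 3 cells x 4 genes, with tiny thresholds so the effect is visible.
toy = [
    [1, 0, 3, 0],  # 2 genes expressed
    [0, 0, 0, 5],  # 1 gene expressed, dropped at min_genes=2
    [2, 1, 0, 4],  # 3 genes expressed
]
filtered, kept_cells, kept_genes = filter_counts(toy, min_genes=2, min_cells=2)
```

With the defaults (250 and 50), small or shallowly sequenced datasets may lose many cells or genes, so it can be worth checking the shape of the data before running the analysis.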
- Return type:
pd.DataFrame or tuple
- Returns:
If influential_genes=False: DataFrame with cell type-trait associations.
If influential_genes=True: tuple of (associations_df, dict of influential_genes_dfs).
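Because the return shape depends on influential_genes, calling code may want to normalize it. A minimal sketch, assuming only the two shapes documented here; unpack_seismic_result is a hypothetical helper and not part of cellink.

```python
def unpack_seismic_result(result):
    """Return (associations, influential) for either documented return shape.

    Hypothetical helper: if only the associations table was returned,
    the influential-genes mapping is reported as an empty dict.
    """
    if isinstance(result, tuple):
        associations, influential = result
        return associations, influential
    return result, {}


# Stand-in objects are used here so the sketch runs without cellink or R.
assoc_only = unpack_seismic_result("associations_df")
assoc_and_genes = unpack_seismic_result(("associations_df", {"some_key": "genes_df"}))
```

Downstream code can then always unpack two values, regardless of how run_seismic was called.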
- Raises:
RuntimeError – If R or required R packages are not available.
ValueError – If cell_type_col is not in adata.obs.
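Both errors can be anticipated with a quick check before starting a long run. The sketch below is an assumption about how a caller might do this, not how cellink does it: preflight is a hypothetical helper, it only looks for an Rscript executable on PATH, and it does not verify that the required R packages are installed.

```python
import shutil


def preflight(obs_columns, cell_type_col, require_r=True):
    """Raise early for the two documented failure modes.

    Hypothetical helper. `obs_columns` is any collection of column
    names, for example list(adata.obs.columns).
    """
    if cell_type_col not in obs_columns:
        raise ValueError(
            f"{cell_type_col!r} not found in obs columns: {sorted(obs_columns)}"
        )
    # Assumption: R is discoverable as an `Rscript` executable on PATH.
    if require_r and shutil.which("Rscript") is None:
        raise RuntimeError("Rscript was not found on PATH")


# The R check is skipped here so the example runs on machines without R.
preflight(["donor", "cell_type"], "cell_type", require_r=False)
```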
Examples
>>> # Basic seismic analysis
>>> associations = run_seismic(
...     dd,
...     magma_file="trait.genes.out",
...     cell_type_col="cell_type",
... )
>>> # With influential gene analysis
>>> associations, influential = run_seismic(
...     dd,
...     magma_file="trait.genes.out",
...     cell_type_col="cell_type",
...     influential_genes=True,
... )