cellink.tl.run_burden_test
- cellink.tl.run_burden_test(G, Y, F, gene, annotation_cols, burden_agg_fct='sum', run_lrt=True)
Perform a burden test for genetic association analysis.
- Parameters:
G (AnnData) – An AnnData object containing genotype data. The X attribute should contain the genotype matrix, and the varm["variant_annotation"] attribute should contain variant annotations.
Y (np.ndarray) – Phenotype data for the samples.
F (np.ndarray) – Covariate matrix for the samples.
gene (str) – The name of the gene being tested.
annotation_cols (list of str) – List of column names in G.varm["variant_annotation"] to use for calculating variant scores.
burden_agg_fct (str (default="sum")) – Aggregation function used to compute the burden score. Options include “sum”, “mean”, “max”.
run_lrt (bool (default=True)) – Whether to compute the likelihood ratio test (LRT) for the burden test.
- Returns:
rdf (DataFrame) – A DataFrame containing the results of the burden test, with the following columns:
- “burden_gene”: The gene name whose burden was used.
- “egene”: The gene name that was tested (expression, Y).
- “weight_col”: The annotation columns used for the burden test.
- “burden_agg_fct”: The aggregation function used.
- “pv”: P-values from the GWAS analysis.
- “beta”: Effect sizes from the GWAS analysis.
- “betaste”: Standard errors of the effect sizes.
- “lrt” (if run_lrt is True): LRT statistics from the GWAS analysis.
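
To make the burden_agg_fct options concrete, here is a small NumPy sketch of how per-variant genotypes might be collapsed into one burden score per sample. It is not cellink's implementation: the helper name burden_scores, the toy data, and the weighting scheme (allele count multiplied by an annotation score, then aggregated across variants) are assumptions chosen for illustration only.

```python
import numpy as np


def burden_scores(genotypes, weights, agg="sum"):
    """Collapse per-variant genotypes into one burden score per sample.

    Hypothetical helper, not part of cellink.

    genotypes: (n_samples, n_variants) array of allele counts.
    weights:   (n_variants,) array of per-variant annotation scores.
    agg:       one of "sum", "mean", "max".
    """
    agg_fns = {"sum": np.sum, "mean": np.mean, "max": np.max}
    if agg not in agg_fns:
        raise ValueError(f"unknown aggregation: {agg!r}")
    # Weight each variant's allele count by its annotation score (assumed scheme).
    weighted = np.asarray(genotypes, dtype=float) * np.asarray(weights, dtype=float)
    # Aggregate across variants (axis=1) to get one value per sample.
    return agg_fns[agg](weighted, axis=1)


# Toy data: 3 samples, 3 variants in the gene being tested.
genotypes = np.array([[0, 1, 2],
                      [1, 0, 0],
                      [0, 0, 0]])
weights = np.array([0.5, 1.0, 0.25])

burden_sum = burden_scores(genotypes, weights, "sum")
burden_mean = burden_scores(genotypes, weights, "mean")
burden_max = burden_scores(genotypes, weights, "max")
```

Under these assumptions, “sum” rewards carrying many weighted alleles, “mean” normalizes by the number of variants in the gene, and “max” keeps only the single highest-scoring variant a sample carries.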