cellink.resources.get_1000genomes_ld_weights
- cellink.resources.get_1000genomes_ld_weights(config_path='./cellink/resources/config/1000genomes.yaml', population='EUR', data_home=None, return_path=False, refresh=False)
Download, extract, and load precomputed 1000 Genomes LD weights.
This function downloads population-specific LD weights from the 1000 Genomes project, extracts them to a local directory, and concatenates chromosome-wise weight files into a single pandas DataFrame.
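The concatenation step described above can be sketched with pandas. This is a minimal illustration, not cellink's implementation: the helper name `concat_chromosome_weights`, the LDSC-style file pattern `<prefix><chrom>.l2.ldscore.gz`, and the column names are all assumptions. Synthetic files are written to a temporary directory so the snippet runs without any download.

```python
import tempfile
from pathlib import Path

import pandas as pd


def concat_chromosome_weights(data_dir, prefix, chromosomes=range(1, 23)):
    """Stack per-chromosome weight tables into one DataFrame.

    Assumes files named `<prefix><chrom>.l2.ldscore.gz` (an LDSC-style
    pattern chosen for illustration, not confirmed for cellink).
    """
    frames = []
    for chrom in chromosomes:
        path = Path(data_dir) / f"{prefix}{chrom}.l2.ldscore.gz"
        if path.exists():  # tolerate chromosomes with no file
            frames.append(pd.read_csv(path, sep="\t"))
    if not frames:
        raise FileNotFoundError(f"no weight files matching {prefix}* in {data_dir}")
    # ignore_index gives one continuous row index across chromosomes
    return pd.concat(frames, ignore_index=True)


# Write two tiny synthetic files so the sketch runs without any download.
with tempfile.TemporaryDirectory() as tmp:
    toy_snps = {1: ["rs1", "rs2"], 2: ["rs3"]}
    for chrom, snps in toy_snps.items():
        toy = pd.DataFrame({"CHR": chrom, "SNP": snps, "L2": 1.0})
        toy.to_csv(Path(tmp) / f"toy.{chrom}.l2.ldscore.gz", sep="\t", index=False)
    weights = concat_chromosome_weights(tmp, "toy.")

print(weights)
```

Skipping missing chromosomes rather than failing is a design choice of this sketch; a stricter loader might raise as soon as any expected file is absent.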
- Parameters:
config_path (str or pathlib.Path, default='./cellink/resources/config/1000genomes.yaml') – Path to YAML configuration file specifying URLs and file names for LD weights.
population (str, default='EUR') – Population code for LD weights. Must be one of {'EUR', 'EAS'}.
data_home (str or pathlib.Path, optional) – Root directory where data will be stored. Defaults to a user-specific cache directory.
return_path (bool, default=False) – If True, returns the path to the extracted files and file prefix instead of a DataFrame.
refresh (bool, default=False) – If True, re-downloads and re-extracts files even if they already exist locally.
- Return type:
tuple
- Returns:
If return_path=False, returns (None, weights):
None – Placeholder for compatibility with the LD scores interface.
weights (pd.DataFrame) – Concatenated LD weight files for all chromosomes.
If return_path=True, returns (DATA, prefix):
DATA (pathlib.Path) – Path to the directory containing the extracted files.
prefix (str) – File name prefix used in the extracted data.
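Because both modes return a 2-tuple, callers normally unpack the result. The sketch below uses a hypothetical stand-in, `get_ld_weights_stub`, that only mirrors the documented return contract so the snippet runs offline. The directory name and prefix it returns are invented placeholders, and it returns a plain list where the real function returns a pandas DataFrame.

```python
from pathlib import Path


def get_ld_weights_stub(population="EUR", return_path=False):
    """Hypothetical stand-in mirroring the documented return contract."""
    if return_path:
        # Placeholder directory and prefix; the real values come from the config.
        return Path("ld_weights") / population, "weights."
    # The real function returns a pandas DataFrame here, not a list.
    weights = [{"SNP": "rs1", "L2": 1.0}]
    return None, weights


# return_path=False: the first element is always None, so discard it.
_, weights = get_ld_weights_stub(population="EUR")

# return_path=True: a directory and a file name prefix come back instead.
data_dir, prefix = get_ld_weights_stub(population="EUR", return_path=True)

print(weights)
print(data_dir, prefix)
```

With cellink installed, the same two unpacking patterns should apply to `cellink.resources.get_1000genomes_ld_weights` itself.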
- Raises:
ValueError – If population is not one of the populations listed in the configuration.
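A minimal sketch of the kind of check that would raise this error, assuming the parsed configuration is a mapping keyed by population code. The helper `validate_population`, the `toy_config` layout, and its file names are hypothetical and may not match the actual 1000genomes.yaml.

```python
def validate_population(population, config):
    """Return the config entry for `population`, or raise ValueError."""
    if population not in config:
        available = sorted(config)
        raise ValueError(
            f"Unknown population {population!r}; expected one of {available}"
        )
    return config[population]


# Hypothetical parsed configuration; the real YAML layout may differ.
toy_config = {"EUR": "eur_weights.tgz", "EAS": "eas_weights.tgz"}

entry = validate_population("EUR", toy_config)

try:
    validate_population("AFR", toy_config)
except ValueError as err:
    message = str(err)

print(entry)
print(message)
```

Listing the accepted codes in the message lets a caller correct the argument without opening the configuration file.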