ResearchHub | Open Science Community

Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations

Madeline Kowalski et al., Dec 23, 2019

Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11–34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10⁻¹⁵] in African populations, rs11549407 with lower HGB [p = 1.5 × 10⁻¹²] and HCT [p = 8.8 × 10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.

Genetics

Biology

0

A multi-layer functional genomic analysis to understand noncoding genetic variation in lipids

Shweta Ramdas et al., Dec 8, 2021

Abstract: A major challenge of genome-wide association studies (GWAS) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations, and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels, and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. Two prioritized genes, CREBRF and RRBP1, show convergent evidence across functional datasets supporting their roles in lipid biology.

Genetics

Molecular Biology

57

Single-Ancestry versus Multi-Ancestry Polygenic Risk Scores for CKD in Black American Populations

Alana Jones et al., Jul 29, 2024

Key Points: The predictive performance of an African ancestry–specific polygenic risk score (PRS) was comparable to a European ancestry–derived PRS for kidney traits. However, multi-ancestry PRSs outperform single-ancestry PRSs in Black American populations. Predictive accuracy of PRSs for CKD was improved with the use of race-free eGFR.
Background: CKD is a risk factor for cardiovascular disease and early death. Recently, polygenic risk scores (PRSs) have been developed to quantify risk for CKD. However, African ancestry populations are underrepresented in both CKD genetic studies and PRS development overall. Moreover, European ancestry–derived PRSs demonstrate diminished predictive performance in African ancestry populations.
Methods: This study aimed to develop a PRS for CKD in Black American populations. We obtained score weights from a meta-analysis of genome-wide association studies for eGFR in the Million Veteran Program and Reasons for Geographic and Racial Differences in Stroke Study to develop an eGFR PRS. We optimized the PRS risk model in a cohort of participants from the Hypertension Genetic Epidemiology Network. Validation was performed in subsets of Black participants of the Trans-Omics for Precision Medicine Consortium and Genetics of Hypertension Associated Treatment Study.
Results: The prevalence of CKD (defined as stage 3 or higher) was associated with the PRS as a continuous predictor (odds ratio [95% confidence interval]: 1.35 [1.08 to 1.68]) and in a threshold-dependent manner. Furthermore, including APOL1 risk status (a putative variant for CKD with higher prevalence among those of sub-Saharan African descent) improved the score's accuracy. PRS associations were robust to sensitivity analyses accounting for traditional CKD risk factors, as well as CKD classification based on prior eGFR equations. Compared with previously published PRSs, the predictive performance of our PRS was comparable with a European ancestry–derived PRS for kidney traits. However, single-ancestry PRSs were less predictive than multi-ancestry–derived PRSs.
Conclusions: In this study, we developed a PRS that was significantly associated with CKD, with improved predictive accuracy when including APOL1 risk status. However, PRSs generated from multi-ancestry populations outperformed single-ancestry PRSs in our study.

Genetics

Epidemiology

0

A 6-CpG Validated Methylation Risk Score Model for Metabolic Syndrome: The HyperGEN and GOLDN Studies

Bertha Hidalgo et al., Oct 25, 2021

Abstract: There has been great interest in genetic risk prediction using risk scores in recent years; however, scores developed in European populations and later applied to non-European populations have had limited success. In this study, we used cross-sectional data from the Hypertension Genetic Epidemiology Network (HyperGEN, N=614 African Americans (AA)) and the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN, N=995 European Americans (EA)) to create a methylation risk score (MRS) for metabolic syndrome (MetS), demonstrating the utility of MRS across race groups. To demonstrate this, we first selected cytosine-guanine dinucleotide (CpG) sites measured on Illumina Methyl450 arrays previously reported to be significantly associated with MetS and/or component conditions (CPT1A cg00574958, PHOSPHO1 cg02650017, ABCG1 cg06500161, SREBF1 cg11024682, SOCS3 cg18181703, TXNIP cg19693031). Second, we calculated the parameter estimates for the 6 CpGs in the HyperGEN data and used the beta estimates as weights to construct an MRS in HyperGEN, which was validated in GOLDN. We performed association analyses using a logistic mixed model to test the association between the MRS and MetS, adjusting for covariates. Results showed the MRS was significantly associated with MetS in both populations. In summary, an MRS for MetS was a strong predictor for the condition across two ethnic groups, suggesting MRS may be useful to examine metabolic disease risk or related complications across ethnic groups.
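The abstract describes the MRS as built from per-CpG beta estimates used as weights. A minimal sketch of that idea, assuming the score is a plain weighted sum of methylation values: the function name, the placeholder weights, and the example methylation values below are hypothetical and are not taken from the paper, which does not report its estimates in this abstract.

```python
from typing import Mapping

# The six CpG probes named in the abstract, mapped to their annotated genes.
CPG_SITES = {
    "cg00574958": "CPT1A",
    "cg02650017": "PHOSPHO1",
    "cg06500161": "ABCG1",
    "cg11024682": "SREBF1",
    "cg18181703": "SOCS3",
    "cg19693031": "TXNIP",
}


def methylation_risk_score(
    methylation: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Weighted sum of methylation values (assumed form of the MRS).

    `methylation` maps probe ID -> methylation value for one individual;
    `weights` maps probe ID -> regression beta estimated in a training cohort.
    """
    missing = [cpg for cpg in weights if cpg not in methylation]
    if missing:
        raise KeyError(f"missing methylation values for: {missing}")
    return sum(weights[cpg] * methylation[cpg] for cpg in weights)


# Placeholder weights (all 1.0) and methylation values (all 0.5), purely
# illustrative; real weights would come from the fitted model.
placeholder_weights = {cpg: 1.0 for cpg in CPG_SITES}
example_individual = {cpg: 0.5 for cpg in CPG_SITES}
score = methylation_risk_score(example_individual, placeholder_weights)
```

In such a setup the resulting score would typically enter a regression model as a single predictor alongside covariates, which is how the abstract describes testing its association with MetS.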

Genetics

Oncology

1

Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations

Madeline Kowalski et al., Jul 2, 2019

Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are still limited. In addition to the limited inclusion of these populations in genetic studies, these populations have more complex linkage disequilibrium structure that may reduce the number of variants associated with a phenotype. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with commercial genome-wide genotyping array data. We demonstrate that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhances gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11–34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels, respectively. Impressively, even for extremely rare variants with sample minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~20,000 self-identified African descent individuals and ~23,000 self-identified Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC (p = 8.1 × 10⁻¹²) in African populations, rs11549407 with lower HGB (p = 1.59 × 10⁻¹²) and HCT (p = 1.13 × 10⁻⁹) in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel associations between rare variants and complex traits not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.
Author summary: Admixed African and Hispanic/Latino populations remain understudied in genome-wide association and fine-mapping studies of complex diseases. These populations have more complex linkage disequilibrium (LD) structure that can impair mapping of variants associated with complex diseases and their risk factors. Genotype imputation represents an approach to improve genome coverage, especially for rare or ancestry-specific variation; however, these understudied populations also have smaller relevant imputation reference panels that need to be expanded to represent their more complex LD patterns. In this study, we leveraged >100,000 phased sequences generated from the multi-ethnic NHLBI TOPMed project to impute in admixed cohorts encompassing ~20,000 individuals of African ancestry (AAs) and ~23,000 Hispanics/Latinos. We demonstrated substantially higher imputation quality for low frequency and rare variants in comparison to the state-of-the-art reference panels (1000 Genomes Project and Haplotype Reference Consortium). Association analyses of ~35 million (AAs) and ~27 million (Hispanics/Latinos) variants passing stringent post-imputation filtering with quantitative hematological traits led to the discovery of associations with two rare variants in the HBB gene; one of these variants was replicated in an independent sample, and the other is known to cause anemia in the homozygous state. By comparison, the same HBB variants would not have been genome-wide significant using other state-of-the-art reference panels due to lower imputation quality. Our findings demonstrate the power of the TOPMed whole genome sequencing data for imputation and subsequent association analysis in admixed African and Hispanic/Latino populations.

Genetics

Biology

0

Impact of rare and common genetic variants on diabetes diagnosis by hemoglobin A1c in multi-ancestry cohorts: The Trans-Omics for Precision Medicine Program.

Chloé Sarnowski et al., May 28, 2019

Hemoglobin A1c (HbA1c) is widely used to diagnose diabetes and assess glycemic control in patients with diabetes. However, nonglycemic determinants, including genetic variation, may influence how accurately HbA1c reflects underlying glycemia. Analyzing the NHLBI Trans-Omics for Precision Medicine (TOPMed) sequence data in 10,338 individuals from five studies and four ancestries (6,158 Europeans, 3,123 African-Americans, 650 Hispanics and 407 East Asians), we confirmed five regions associated with HbA1c (GCK in Europeans and African-Americans, HK1 in Europeans and Hispanics, FN3K/FN3KRP in Europeans and G6PD in African-Americans and Hispanics) and discovered a new African-ancestry specific low-frequency variant (rs1039215 in HBG2/HBE1, minor allele frequency (MAF)=0.03). The most associated G6PD variant (p.Val98Met, rs1050828-T, MAF=12% in African-Americans, MAF=2% in Hispanics) lowered HbA1c (-0.88% in hemizygous males, -0.34% in heterozygous females) and explained 23% of HbA1c variance in African-Americans and 4% in Hispanics. Additionally, we identified a rare distinct G6PD coding variant (rs76723693 - p.Leu353Pro, MAF=0.5%; -0.98% in hemizygous males, -0.46% in heterozygous females) and detected significant association with HbA1c when aggregating rare missense variants in G6PD. We observed similar magnitude and direction of effects for rs1039215 (HBG2) and rs76723693 (G6PD) in the two largest TOPMed African-American cohorts and replicated the rs76723693 association in the UK Biobank African-ancestry participants. These variants in G6PD and HBG2 were monomorphic in the European and Asian samples. African or Hispanic ancestry individuals carrying G6PD variants may be underdiagnosed for diabetes when screened with HbA1c. Thus, assessment of these variants should be considered for incorporation into precision medicine approaches for diabetes diagnosis.
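The underdiagnosis claim follows from simple arithmetic on the reported effect sizes. The sketch below is an illustration only, assuming that a variant's effect can be added back to a measured value and that 6.5% is the diagnostic cut-off; the function name, the status labels, and the example measurement are hypothetical and do not come from the paper.

```python
# Assumption: 6.5% HbA1c is the diagnostic cut-off used for this illustration.
DIAGNOSTIC_THRESHOLD = 6.5

# Effects of G6PD rs1050828-T on HbA1c reported in the abstract,
# in percentage points of HbA1c.
RS1050828_T_EFFECT = {
    "hemizygous_male": -0.88,
    "heterozygous_female": -0.34,
}


def carrier_adjusted_hba1c(measured: float, carrier_status: str) -> float:
    """Add back the reported variant effect (a simplifying assumption)."""
    return measured - RS1050828_T_EFFECT[carrier_status]


# Hypothetical example: a hemizygous male carrier with a measured HbA1c of 6.0%.
measured = 6.0
adjusted = carrier_adjusted_hba1c(measured, "hemizygous_male")
missed_by_raw_value = measured < DIAGNOSTIC_THRESHOLD <= adjusted
print(f"measured={measured:.2f}%, adjusted={adjusted:.2f}%, "
      f"below cut-off only because of the variant: {missed_by_raw_value}")
```

Under these assumptions the measured value falls below the cut-off while the adjusted value does not, which is the pattern the abstract describes as potential underdiagnosis.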

Genetics

Internal Medicine

0

Novel genetic determinants of telomere length from a multi-ethnic analysis of 75,000 whole genome sequences in TOPMed

Margaret Taub et al., Sep 4, 2019

Telomeres shorten in replicating somatic cells and with age; in human leukocytes, telomere length (TL) is associated with a host of aging-related diseases. To date, 16 genome-wide association studies (GWAS) have identified 23 loci associated with leukocyte TL, but prior studies were primarily in individuals of European and Asian ancestry and relied on laboratory assays including Southern Blot and qPCR to quantify TL. Here, we estimated TL bioinformatically, leveraging whole genome sequencing (WGS) of whole blood from n=75,176 subjects in the Trans-Omics for Precision Medicine (TOPMed) Program. We performed the largest multi-ethnic and only WGS-based genome-wide association analysis of TL to date. We identified 22 associated loci (p-value < 5 × 10⁻⁸), including 10 novel loci. Three of the novel loci map to genes involved in telomere maintenance and/or DNA damage repair: TERF2, RFWD3, and SAMHD1. Many of the 99 pathways identified in gene set enrichment analysis for the 22 loci (multiple-testing corrected false discovery rate (FDR) < 0.05) pertain to telomere biology, including the top five (FDR < 1 × 10⁻⁹). Importantly, several loci, including the recently identified TINF2 and ATM loci, showed strong ancestry-specific associations.

Genetics

Physiology

0

Novel DNA methylation sites of glucose and insulin homeostasis: an integrative cross-omics analysis

Jun Liu et al., Oct 18, 2018

Despite existing reports on differential DNA methylation in type 2 diabetes (T2D) and obesity, our understanding of the functional relevance of the phenomenon remains limited. Because obesity is the main risk factor for T2D and, according to previous studies, a driver of methylation, we aimed to explore the effect of DNA methylation in the early phases of T2D pathology while accounting for body mass index (BMI). We performed a blood-based epigenome-wide association study (EWAS) of fasting glucose and insulin among 4,808 non-diabetic European individuals and replicated the findings in an independent sample consisting of 11,750 non-diabetic subjects. We integrated blood-based in silico cross-omics databases comprising genomics, epigenomics and transcriptomics collected by the BIOS project of the Biobanking and BioMolecular resources Research Infrastructure of the Netherlands (BBMRI-NL), the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC), the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium, and the tissue-specific Genotype-Tissue Expression (GTEx) project. We identified and replicated nine novel differentially methylated sites in whole blood (P-value < 1.27 × 10⁻⁷): sites in the LETM1, RBM20, IRS2, and MAN2A2 genes and the 1q25.3 region were associated with fasting insulin; sites in the FCRL6, SLAMF1, and APOBEC3H genes and the 15q26.1 region were associated with fasting glucose. The association between the SLAMF1, APOBEC3H and 15q26.1 methylation sites and glucose emerged only when accounting for BMI. Follow-up in silico cross-omics analyses indicate that the cis-acting meQTLs near SLAMF1 and SLAMF1 expression are involved in glucose level regulation. Moreover, our data suggest that differential methylation in FCRL6 may affect glucose level and the risk of T2D by regulating FCRL6 expression in the liver. In conclusion, the present study identified nine new DNA methylation sites associated with glycemic homeostasis and, by integrating cross-omics data in silico, provided new insights into the genetic, epigenetic and transcriptomic pathways of glycemia-related loci.

Genetics

Molecular Biology

0
