ResearchHub | Open Science Community

Epigenome-wide association meta-analysis of DNA methylation with coffee and tea consumption

Irma Karabegović et al., May 14, 2021

Abstract: Coffee and tea are extensively consumed beverages worldwide which have received considerable attention regarding health. Intake of these beverages is consistently linked to, among others, reduced risk of diabetes and liver diseases; however, the mechanisms of action remain elusive. Epigenetics is suggested as a mechanism mediating the effects of dietary and lifestyle factors on disease onset. Here we report the results from epigenome-wide association studies (EWAS) on coffee and tea consumption in 15,789 participants of European and African-American ancestries from 15 cohorts. EWAS meta-analysis of coffee consumption reveals 11 CpGs surpassing the epigenome-wide significance threshold (P-value < 1.1×10⁻⁷), which are annotated to the AHRR, F2RL3, FLJ43663, HDAC4, GFI1 and PHGDH genes. Among them, cg14476101 is significantly associated with expression of PHGDH and risk of fatty liver disease. Knockdown of PHGDH expression in liver cells shows a correlation with expression levels of genes associated with circulating lipids, suggesting a role of PHGDH in hepatic-lipid metabolism. EWAS meta-analysis on tea consumption reveals no significant association; only two CpGs, annotated to the CACNA1A and PRDM16 genes, show suggestive association (P-value < 5.0×10⁻⁶). These findings indicate that coffee-associated changes in DNA methylation levels may explain the mechanism of action of coffee consumption in conferring risk of diseases.

Genetics

Pharmacology

5

Paper

A Saturated Map of Common Genetic Variants Associated with Human Height from 5.4 Million Individuals of Diverse Ancestries

Loïc Yengo et al., Jan 10, 2022

Abstract: Common SNPs are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here we show, using GWAS data from 5.4 million individuals of diverse ancestries, that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a median size of ~90 kb, covering ~21% of the genome. The density of independent associations varies across the genome and the regions of elevated density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs account for 40% of phenotypic variance in European ancestry populations but only ~10%-20% in other ancestries. Effect sizes, associated regions, and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely explained by linkage disequilibrium and allele frequency differences within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than needed to implicate causal genes and variants. Overall, this study, the largest GWAS to date, provides an unprecedented saturated map of specific genomic regions containing the vast majority of common height-associated variants.

Genetics

Biology

3

Paper

Clonal hematopoiesis is driven by aberrant activation of TCL1A

Joshua Weinstock et al., Dec 13, 2021

Abstract: A diverse set of driver genes, such as regulators of DNA methylation, RNA splicing, and chromatin remodeling, have been associated with pre-malignant clonal expansion of hematopoietic stem cells (HSCs). The factors mediating expansion of these mutant clones remain largely unknown, partially due to a paucity of large cohorts with longitudinal blood sampling. To circumvent this limitation, we developed and validated a method to infer clonal expansion rate from single timepoint data called PACER (passenger-approximated clonal expansion rate). Applying PACER to 5,071 persons with clonal hematopoiesis accurately recapitulated the known fitness effects due to different driver mutations. A genome-wide association study of PACER revealed that a common inherited polymorphism in the TCL1A promoter was associated with slower clonal expansion. Those carrying two copies of this protective allele had up to 80% reduced odds of having driver mutations in TET2, ASXL1, SF3B1, SRSF2, and JAK2, but not DNMT3A. TCL1A was not expressed in normal or DNMT3A-mutated HSCs, but the introduction of mutations in TET2 or ASXL1 by CRISPR editing led to aberrant expression of TCL1A and expansion of HSCs in vitro. These effects were abrogated in HSCs from donors carrying the protective TCL1A allele. Our results indicate that the fitness advantage of multiple common driver genes in clonal hematopoiesis is mediated through TCL1A activation. PACER is an approach that can be widely applied to uncover genetic and environmental determinants of pre-malignant clonal expansion in blood and other tissues.

Genetics

Hematology

1

Paper

A system for phenotype harmonization in the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program

Adrienne Stilp et al., Jun 20, 2020

Genotype-phenotype association studies often combine phenotype data from multiple studies to increase power. Harmonization of the data usually requires substantial effort due to heterogeneity in phenotype definitions, study design, data collection procedures, and data set organization. Here we describe a centralized system for phenotype harmonization that includes input from phenotype domain and study experts, quality control, documentation, reproducible results, and data sharing mechanisms. This system was developed for the National Heart, Lung and Blood Institute’s Trans-Omics for Precision Medicine (TOPMed) program, which is generating genomic and other omics data for >80 studies with extensive phenotype data. To date, 63 phenotypes have been harmonized across thousands of participants from up to 17 TOPMed studies per phenotype. We discuss the challenges faced in this undertaking and how they were addressed. The harmonized phenotype data and associated documentation have been submitted to National Institutes of Health data repositories for controlled access by the scientific community. We also provide materials to facilitate future harmonization efforts by the community, which include (1) the code used to generate the 63 harmonized phenotypes, enabling others to reproduce, modify or extend these harmonizations to additional studies; and (2) results of labeling thousands of phenotype variables with controlled vocabulary terms.

Genetics

Molecular Biology

0

Paper

Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations

Madeline Kowalski et al., Jul 2, 2019

Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are still limited. In addition to the limited inclusion of these populations in genetic studies, these populations have more complex linkage disequilibrium structure that may reduce the number of variants associated with a phenotype. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with commercial genome-wide genotyping array data. We demonstrate that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhances gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3 to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels, respectively. Impressively, even for extremely rare variants with sample minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. 
Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~20,000 self-identified African descent individuals and ~23,000 self-identified Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC (p = 8.1×10⁻¹²) in African populations, rs11549407 with lower HGB (p = 1.59×10⁻¹²) and HCT (p = 1.13×10⁻⁹) in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel associations between rare variants and complex traits not previously detected in similar sized genome-wide studies of under-represented African and Hispanic/Latino populations.

Author summary: Admixed African and Hispanic/Latino populations remain understudied in genome-wide association and fine-mapping studies of complex diseases. These populations have more complex linkage disequilibrium (LD) structure that can impair mapping of variants associated with complex diseases and their risk factors. Genotype imputation represents an approach to improve genome coverage, especially for rare or ancestry-specific variation; however, these understudied populations also have smaller relevant imputation reference panels that need to be expanded to represent their more complex LD patterns. In this study, we leveraged >100,000 phased sequences generated from the multi-ethnic NHLBI TOPMed project to impute in admixed cohorts encompassing ~20,000 individuals of African ancestry (AAs) and ~23,000 Hispanics/Latinos. We demonstrated substantially higher imputation quality for low frequency and rare variants in comparison to the state-of-the-art reference panels (1000 Genomes Project and Haplotype Reference Consortium). Association analyses of ~35 million (AAs) and ~27 million (Hispanics/Latinos) variants passing stringent post-imputation filtering with quantitative hematological traits led to the discovery of associations with two rare variants in the HBB gene; one of these variants was replicated in an independent sample, and the other is known to cause anemia in the homozygous state. By comparison, the same HBB variants would not have been genome-wide significant using other state-of-the-art reference panels due to lower imputation quality. Our findings demonstrate the power of the TOPMed whole genome sequencing data for imputation and subsequent association analysis in admixed African and Hispanic/Latino populations.

Genetics

Biology

0

Paper

Novel genetic determinants of telomere length from a multi-ethnic analysis of 75,000 whole genome sequences in TOPMed

Margaret Taub et al., Sep 4, 2019

Telomeres shorten in replicating somatic cells and with age; in human leukocytes, telomere length (TL) is associated with a host of aging-related diseases. To date, 16 genome-wide association studies (GWAS) have identified 23 loci associated with leukocyte TL, but prior studies were primarily in individuals of European and Asian ancestry and relied on laboratory assays including Southern blot and qPCR to quantify TL. Here, we estimated TL bioinformatically, leveraging whole genome sequencing (WGS) of whole blood from n = 75,176 subjects in the Trans-Omics for Precision Medicine (TOPMed) Program. We performed the largest multi-ethnic and only WGS-based genome-wide association analysis of TL to date. We identified 22 associated loci (p-value < 5×10⁻⁸), including 10 novel loci. Three of the novel loci map to genes involved in telomere maintenance and/or DNA damage repair: TERF2, RFWD3, and SAMHD1. Many of the 99 pathways identified in gene set enrichment analysis for the 22 loci (multiple-testing corrected false discovery rate (FDR) < 0.05) pertain to telomere biology, including the top five (FDR < 1×10⁻⁹). Importantly, several loci, including the recently identified TINF2 and ATM loci, showed strong ancestry-specific associations.

Genetics

Physiology

0

Paper

Epigenome-wide association meta-analysis of DNA methylation with coffee and tea consumption

Irma Karabegović et al., Apr 15, 2020

Coffee and tea are extensively consumed beverages worldwide. Observational studies have shown contradictory findings for the association between consumption of these beverages and different health outcomes. Epigenetics is suggested as a mechanism mediating the effects of dietary and lifestyle factors on disease onset. We conducted epigenome-wide association studies (EWAS) on coffee and tea consumption in 15,789 participants of European and African-American ancestries from 15 cohorts. EWAS meta-analysis revealed 11 CpG sites significantly associated with coffee consumption (P-value < 1.1×10⁻⁷), nine of them annotated to the genes AHRR, F2RL3, FLJ43663, HDAC4, GFI1 and PHGDH, and two CpGs suggestively associated with tea consumption (P-value < 5.0×10⁻⁶). Among these, cg14476101 was significantly associated with expression of its annotated gene PHGDH and risk of fatty liver disease. Knockdown of PHGDH expression in liver cells showed a correlation with expression levels of lipid-associated genes, suggesting a role of PHGDH in hepatic-lipid metabolism. Collectively, this study indicates that coffee consumption is associated with differential DNA methylation levels at multiple CpGs, and that coffee-associated epigenetic variations may explain the mechanism of action of coffee consumption in conferring disease risk.

Competing Interest Statement: The authors have declared no competing interest.

Genetics

Molecular Biology

0

Paper

Evaluation of the causal effect of fibrinogen on incident coronary heart disease via Mendelian randomization

Cavin Ward‐Caviness et al., Oct 19, 2018

Background: Fibrinogen is an essential hemostatic factor and cardiovascular disease risk factor. Early attempts at evaluating the causal effect of fibrinogen on coronary heart disease (CHD) and myocardial infarction (MI) using Mendelian randomization (MR) used single-variant approaches, and did not take advantage of recent genome-wide association studies (GWAS) or multi-variant, pleiotropy-robust MR methodologies. Methods and Findings: We evaluated evidence for a causal effect of fibrinogen on both CHD and MI using MR. We used both an allele score approach and pleiotropy-robust MR models. The allele score was composed of 38 fibrinogen-associated variants from recent GWAS. Initial analyses using the allele score incorporated data from 11 European-ancestry prospective cohorts to examine incident CHD and MI. We also applied two-sample MR methods with data from a prevalent CHD and MI GWAS. Results are given in terms of the hazard ratio (HR) or odds ratio (OR), depending on the study design, and associated 95% confidence interval (CI). In single-variant analyses no causal effect of fibrinogen on CHD or MI was observed. In multi-variant analyses using incident CHD cases and the allele score approach, the estimated causal effect (HR) of a 1 g/L higher fibrinogen concentration was 1.62 (CI = 1.12, 2.36). In two-sample MR analyses that accounted for pleiotropy, the causal estimate (OR) was reduced to 1.18 (CI = 0.98, 1.42) and 1.09 (CI = 0.89, 1.33) in the two most precise (smallest CI) models, out of four models evaluated. In the two-sample MR analyses for MI, there was only very weak evidence of a causal effect, in only one out of four models. Conclusions: A small causal effect of fibrinogen on CHD is observed using multi-variant MR approaches which account for pleiotropy, but not single-variant MR approaches. Taken together, results indicate that even with large sample sizes and multi-variant approaches, MR analyses still cannot exclude the null when estimating the causal effect of fibrinogen on CHD, but that any potential causal effect is likely to be much smaller than observed in epidemiological studies.

Genetics

Internal Medicine

0

Paper

Genome-wide association study provides new insights into the genetic architecture and pathogenesis of heart failure

Sonia Shah et al., Jul 10, 2019

Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report the largest GWAS meta-analysis of HF to date, comprising 47,309 cases and 930,014 controls. We identify 12 independent associations with HF at 11 genomic loci, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Expression quantitative trait analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homeostasis (BAG3), and cellular senescence (CDKN1A). Using Mendelian randomisation analysis we provide new evidence supporting previously equivocal causal roles for several HF risk factors identified in observational studies, and demonstrate CAD-independent effects for atrial fibrillation, body mass index, hypertension and triglycerides. These findings extend our knowledge of the genes and pathways underlying HF and may inform the development of new therapeutic approaches.

Genetics

Molecular Biology

0

Paper
