ResearchHub | Open Science Community

Type 2 diabetes genetic loci informed by multi-trait associations point to disease mechanisms and subtypes: A soft clustering analysis

Miriam Udler et al., Sep 21, 2018

Type 2 diabetes (T2D) is a heterogeneous disease for which (1) disease-causing pathways are incompletely understood and (2) subclassification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper, we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four separate subsets of individuals with T2D. In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization (bNMF) clustering to genome-wide association study (GWAS) results for 94 independent T2D genetic variants and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta cell function, differing from each other by high versus low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity mediated (high body mass index [BMI] and waist circumference [WC]), "lipodystrophy-like" fat distribution (low BMI, adiponectin, and high-density lipoprotein [HDL] cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease (CAD), and stroke. We evaluated the potential for clinical impact of these clusters in four studies containing individuals with T2D (Metabolic Syndrome in Men Study [METSIM], N = 487; Ashkenazi, N = 509; Partners Biobank, N = 2,065; UK Biobank [UKBB], N = 14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with approximately 30% of all individuals assigned to just one cluster top decile. Limitations of this study include that the genetic variants used in the cluster analysis were restricted to those associated with T2D in populations of European ancestry. Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports the use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.
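The top-decile assignment described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names, the per-variant weights, and the toy scores are all hypothetical, and the score is assumed to be a simple weighted sum of risk-allele dosages.

```python
def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))


def top_decile(scores):
    """Indices of the individuals whose score ranks in the top 10%."""
    n_top = max(1, len(scores) // 10)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_top])


def in_exactly_one_top_decile(scores_by_cluster):
    """Indices of individuals in the top decile of exactly one cluster."""
    counts = {}
    for scores in scores_by_cluster.values():
        for i in top_decile(scores):
            counts[i] = counts.get(i, 0) + 1
    return {i for i, c in counts.items() if c == 1}


# Hypothetical toy cohort: 20 individuals, two clusters with made-up scores.
scores_by_cluster = {
    "cluster_a": [float(i) for i in range(20)],             # top decile: 18, 19
    "cluster_b": [float((i * 7) % 20) for i in range(20)],  # top decile: 14, 17
}
unique = in_exactly_one_top_decile(scores_by_cluster)
print(sorted(unique))
```

Counting membership per individual, rather than per cluster, is what lets the sketch separate people who stand out on a single pathway from those who rank highly on several.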

Genetics

Molecular Biology

Refining the accuracy of validated target identification through coding variant fine-mapping in type 2 diabetes

Anubha Mahajan et al., May 31, 2017

Identification of coding variant associations for complex diseases offers a direct route to biological insight, but is dependent on appropriate inference concerning the causal impact of those variants on disease risk. We aggregated coding variant data for 81,412 type 2 diabetes (T2D) cases and 370,832 controls of diverse ancestry, identifying 40 distinct coding variant association signals (at 38 loci) reaching significance (p < 2.2×10⁻⁷). Of these, 16 represent novel associations mapping outside known genome-wide association study (GWAS) signals. We make two important observations. First, despite a threefold increase in sample size over previous efforts, only five of the 40 signals are driven by variants with minor allele frequency <5%, and we find no evidence for low-frequency variants with allelic odds ratio >1.29. Second, we used GWAS data from 50,160 T2D cases and 465,272 controls of European ancestry to fine-map these associated coding variants in their regional context, with and without additional weighting to account for the global enrichment of complex trait association signals in coding exons. At the 37 signals for which we attempted fine-mapping, we demonstrate convincing support (posterior probability >80% under the “annotation-weighted” model) that coding variants are causal for the association at 16 (including novel signals involving POC5 p.His36Arg, ANKH p.Arg187Gln, WSCD2 p.Thr113Ile, PLCB3 p.Ser778Leu, and PNPLA3 p.Ile148Met). However, at 13 of the 37 loci, the associated coding variants represent “false leads” and naïve analysis could have led to an erroneous inference regarding the effector transcript mediating the signal. Accurate identification of validated targets is dependent on correct specification of the contribution of coding and non-coding mediated mechanisms at associated loci.
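To make the idea of annotation-weighted fine-mapping concrete, the sketch below computes per-variant posterior probabilities from Bayes factors and prior weights. It assumes one causal variant per signal and uses made-up Bayes factors and a made-up fourfold prior weight for the coding variant; the abstract does not state the weighting actually used, so none of these numbers should be read as the authors' values.

```python
def posterior_probabilities(bayes_factors, prior_weights=None):
    """Posterior probability that each variant is causal, assuming exactly one
    causal variant in the region: PP_i = w_i * BF_i / sum_j(w_j * BF_j)."""
    if prior_weights is None:
        prior_weights = [1.0] * len(bayes_factors)  # uniform prior
    weighted = [bf * w for bf, w in zip(bayes_factors, prior_weights)]
    total = sum(weighted)
    return [x / total for x in weighted]


# Hypothetical signal with three variants; the first is a coding variant.
bayes_factors = [10.0, 30.0, 60.0]
uniform = posterior_probabilities(bayes_factors)
# Illustrative annotation weighting: coding variants get 4x prior weight.
weighted = posterior_probabilities(bayes_factors, prior_weights=[4.0, 1.0, 1.0])

print([round(p, 3) for p in uniform])
print([round(p, 3) for p in weighted])
```

In this toy signal the coding variant gains posterior probability under the weighted prior but still does not reach 80%, which is the kind of case the abstract calls a "false lead".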

Genetics

Molecular Biology

Tissue-Specific Alteration of Metabolic Pathways Influences Glycemic Regulation

Natasha Ng et al., Oct 3, 2019

Summary: Metabolic dysregulation in multiple tissues alters glucose homeostasis and influences risk for type 2 diabetes (T2D). To identify pathways and tissues influencing T2D-relevant glycemic traits (fasting glucose [FG], fasting insulin [FI], two-hour glucose [2hGlu] and glycated hemoglobin [HbA1c]), we investigated associations of exome-array variants in up to 144,060 individuals without diabetes of multiple ancestries. Single-variant analyses identified novel associations at 21 coding variants in 18 novel loci, whilst gene-based tests revealed signals at two genes, TF (HbA1c) and G6PC (FG, FI). Pathway and tissue enrichment analyses of trait-associated transcripts confirmed the importance of liver and kidney for FI and pancreatic islets for FG regulation, implicated adipose tissue in FI and the gut in 2hGlu, and suggested a role for the non-endocrine pancreas in glucose homeostasis. Functional studies demonstrated that a novel FG/FI association at the liver-enriched G6PC transcript was driven by multiple rare loss-of-function variants. The FG/HbA1c-associated, islet-specific G6PC2 transcript also contained multiple rare functional variants, including two alleles within the same codon with divergent effects on glucose levels. Our findings highlight the value of integrating genomic and functional data to maximize biological inference. Highlights: 23 novel coding variant associations (single-point and gene-based) for glycemic traits; 51 effector transcripts highlighted different pathway/tissue signatures for each trait; the exocrine pancreas and gut influence fasting and 2h glucose, respectively; multiple variants in liver-enriched G6PC and islet-specific G6PC2 influence glycemia.

Genetics

Molecular Biology

Complement component 4 genes contribute sex-specific vulnerability in diverse illnesses

Nolan Kamitaki et al., Sep 9, 2019

Many common illnesses differentially affect men and women for unknown reasons. The autoimmune diseases lupus and Sjögren’s syndrome affect nine times more women than men [1,2], whereas schizophrenia affects men more frequently and severely [3–5]. All three illnesses have their strongest common-genetic associations in the Major Histocompatibility Complex (MHC) locus, an association that in lupus and Sjögren’s syndrome has long been thought to arise from HLA alleles [6–13]. Here we show that the complement component 4 (C4) genes in the MHC locus, recently found to increase risk for schizophrenia [14], generate 7-fold variation in risk for lupus (95% CI: 5.88-8.61; p < 10⁻¹¹⁷ in total) and 16-fold variation in risk for Sjögren’s syndrome (95% CI: 8.59-30.89; p < 10⁻²³ in total), with C4A protecting more strongly than C4B in both illnesses. The same alleles that increase risk for schizophrenia greatly reduced risk for lupus and Sjögren’s syndrome. In all three illnesses, C4 alleles acted more strongly in men than in women: common combinations of C4A and C4B generated 14-fold variation in risk for lupus and 31-fold variation in risk for Sjögren’s syndrome in men (vs. 6-fold and 15-fold among women, respectively) and affected schizophrenia risk about twice as strongly in men as in women. At a protein level, both C4 and its effector (C3) were present at greater levels in men than women in cerebrospinal fluid (p < 10⁻⁵ for both C4 and C3) and plasma among adults ages 20-50 [15–17], corresponding to the ages of differential disease vulnerability. Sex differences in complement protein levels may help explain the larger effects of C4 alleles in men, women’s greater risk of lupus and Sjögren’s syndrome, and men’s greater vulnerability in schizophrenia. These results nominate the complement system as a source of sexual dimorphism in vulnerability to diverse illnesses.

Genetics

Immunology

Clustering of Type 2 Diabetes Genetic Loci by Multi-Trait Associations Identifies Disease Mechanisms and Subtypes

Miriam Udler et al., May 10, 2018

Type 2 diabetes (T2D) is a heterogeneous disease for which 1) disease-causing pathways are incompletely understood and 2) sub-classification may improve patient management. Unlike other biomarkers, germline genetic markers do not change with disease progression or treatment. In this paper we test whether a germline genetic approach informed by physiology can be used to deconstruct T2D heterogeneity. First, we aimed to categorize genetic loci into groups representing likely disease mechanistic pathways. Second, we asked whether the novel clusters of genetic loci we identified have any broad clinical consequence, as assessed in four independent cohorts of individuals with T2D. In an effort to identify mechanistic pathways driven by established T2D genetic loci, we applied Bayesian nonnegative matrix factorization clustering to genome-wide association results for 94 independent T2D genetic loci and 47 diabetes-related traits. We identified five robust clusters of T2D loci and traits, each with distinct tissue-specific enhancer enrichment based on analysis of epigenomic data from 28 cell types. Two clusters contained variant-trait associations indicative of reduced beta-cell function, differing from each other by high vs. low proinsulin levels. The three other clusters displayed features of insulin resistance: obesity-mediated (high BMI, waist circumference), "lipodystrophy-like" fat distribution (low BMI, adiponectin, HDL-cholesterol, and high triglycerides), and disrupted liver lipid metabolism (low triglycerides). Increased cluster genetic risk scores (GRSs) were associated with distinct clinical outcomes, including increased blood pressure, coronary artery disease, and stroke risk. We evaluated the potential for clinical impact of these clusters in four studies containing participants with T2D (METSIM, N=487; Ashkenazi, N=509; Partners Biobank, N=2,065; UK Biobank, N=14,813). Individuals with T2D in the top genetic risk score decile for each cluster reproducibly exhibited the predicted cluster-associated phenotypes, with ~30% of all participants assigned to just one cluster top decile. Our approach identifies salient T2D genetically anchored and physiologically informed pathways, and supports use of genetics to deconstruct T2D heterogeneity. Classification of patients by these genetic pathways may offer a step toward genetically informed T2D patient management.
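The clustering step can be illustrated with ordinary nonnegative matrix factorization. The sketch below is not the Bayesian variant named in the abstract: it swaps in plain multiplicative-update NMF with a fixed number of clusters. It also assumes that signed variant-trait z-scores are made nonnegative by splitting each trait into a "positive" and a "negative" column. The z-scores are invented toy values, and all function names are hypothetical.

```python
import random


def split_signed(z):
    """Turn signed z-scores into a nonnegative matrix: [positive part | negative part]."""
    return [[max(x, 0.0) for x in row] + [max(-x, 0.0) for x in row] for row in z]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, k, n_iter=200, seed=0, eps=1e-9):
    """Factor V (variants x traits) into W (variants x k) and H (k x traits)
    using multiplicative updates for the squared-error objective."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return w, h


# Invented z-scores: 4 variants x 2 traits, with two visibly different patterns.
z = [[3.0, -2.0], [2.5, -1.5], [-1.0, 4.0], [-0.5, 3.5]]
v = split_signed(z)
w, h = nmf(v, k=2)
```

Each row of `w` holds one variant's weight in every cluster, so a variant can load on more than one cluster. That is the sense in which the clustering is "soft".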

Genetics

Molecular Biology

Identification of type 2 diabetes loci in 433,540 East Asian individuals

Cassandra Spracklen et al., Jun 28, 2019

Meta-analyses of genome-wide association studies (GWAS) have identified >240 loci associated with type 2 diabetes (T2D); however, most loci have been identified in analyses of European-ancestry individuals. To examine T2D risk in East Asian individuals, we meta-analyzed GWAS data in 77,418 cases and 356,122 controls. In the main analysis, we identified 298 distinct association signals at 178 loci, and across T2D association models with and without consideration of body mass index and sex, we identified 56 loci newly implicated in T2D predisposition. Common variants associated with T2D in both East Asian and European populations exhibited strongly correlated effect sizes. New associations include signals in/near GDAP1, PTF1A, SIX3, ALDH2, a microRNA cluster, and genes that affect muscle and adipose differentiation. At another locus, eQTLs at two overlapping T2D signals act through two genes, NKX6-3 and ANK1, in different tissues. Association studies in diverse populations identify additional loci and elucidate disease genes, biology, and pathways.
Type 2 diabetes (T2D) is a common metabolic disease primarily caused by insufficient insulin production and/or secretion by the pancreatic β cells and insulin resistance in peripheral tissues [1]. Most genetic loci associated with T2D have been identified in populations of European (EUR) ancestry, including a recent meta-analysis of genome-wide association studies (GWAS) of nearly 900,000 individuals of European ancestry that identified >240 loci influencing the risk of T2D [2]. Differences in allele frequency between ancestries affect the power to detect associations within a population, particularly among variants rare or monomorphic in one population but more frequent in another [3], [4]. Although smaller than studies in European populations, a recent T2D meta-analysis in almost 200,000 Japanese individuals identified 28 additional loci [4].
The relative contributions of different pathways to the pathophysiology of T2D may also differ between ancestry groups. For example, in East Asian (EAS) populations, T2D prevalence is greater than in European populations among people of similar body mass index (BMI) or waist circumference [5]. We performed the largest meta-analysis of East Asian individuals to identify new genetic associations and provide insight into T2D pathogenesis.

Genetics

Molecular Biology

Analysis of protein-coding genetic variation in 60,706 humans

Olle Melander et al., Oct 30, 2015

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) sequence data for 60,706 individuals of diverse ethnicities generated as part of the Exome Aggregation Consortium (ExAC). The resulting catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We show that this catalogue can be used to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; we identify 3,230 genes with near-complete depletion of truncating variants, 72% of which have no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human knockout variants in protein-coding genes.

Genetics

Biology

Integrating Comprehensive Functional Annotations to Boost Power and Accuracy in Gene-Based Association Analysis

Corbin Quick et al., Aug 12, 2019

Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early gene-based test applications often focused on rare coding variants; a more recent wave of gene-based methods, e.g. TWAS, use eQTLs to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases power and performance identifying causal genes.

Genetics

Molecular Biology

Narrow-sense heritability estimation of complex traits using identity-by-descent information.

Luke Evans et al., Jul 17, 2017

Heritability is a fundamental parameter in genetics. Traditional estimates based on family or twin studies can be biased due to shared environmental or non-additive genetic variance. Alternatively, those based on genotyped or imputed variants typically underestimate narrow-sense heritability contributed by rare or otherwise poorly-tagged causal variants. Identical-by-descent (IBD) segments of the genome share all variants between pairs of chromosomes except new mutations that have arisen since the last common ancestor. Therefore, relating phenotypic similarity to degree of IBD sharing among classically unrelated individuals is an appealing approach to estimating the near full additive genetic variance while avoiding biases that can occur when modeling close relatives. We applied an IBD-based approach (GREML-IBD) to estimate heritability in unrelated individuals using phenotypic simulation with thousands of whole genome sequences across a range of stratification, polygenicity levels, and the minor allele frequencies of causal variants (CVs). IBD-based heritability estimates were unbiased when using unrelated individuals, even for traits with extremely rare CVs, but stratification led to strong biases in IBD-based heritability estimates with poor precision. We used data on two traits in ~120,000 people from the UK Biobank to demonstrate that, depending on the trait and possible confounding environmental effects, GREML-IBD can be applied successfully to very large genetic datasets to infer the contribution of very rare variants lost using other methods. However, we observed apparent biases in this real data that were not predicted from our simulation, suggesting that more work may be required to understand factors that influence IBD-based estimates.

Genetics

Biology

Subset-Based Analysis using Gene-Environment Interactions for Discovery of Genetic Associations across Multiple Studies or Phenotypes

Youfei Yu et al., May 23, 2018

Objectives: Classical methods for combining summary data from genome-wide association studies (GWAS) only use marginal genetic effects and power can be compromised in the presence of heterogeneity. We aim to enhance the discovery of novel associated loci in the presence of heterogeneity of genetic effects in sub-groups defined by an environmental factor. Methods: We present a p-value Assisted Subset Testing for Associations (pASTA) framework that generalizes the previously proposed association analysis based on subsets (ASSET) method by incorporating gene-environment (G-E) interactions into the testing procedure. We conduct simulation studies and provide two data examples. Results: Simulation studies show that our proposal is more powerful than methods based on marginal associations in the presence of G-E interactions and maintains comparable power even in their absence. Both data examples demonstrate that our method can increase power to detect overall genetic associations and identify novel studies/phenotypes that contribute to the association. Conclusions: Our proposed method can be a useful screening tool to identify candidate single nucleotide polymorphisms (SNPs) that are potentially associated with the trait(s) of interest for further validation. It also allows researchers to determine the most probable subset of traits that exhibit genetic associations in addition to the enhancement of power.

Genetics

Biology
