ResearchHub | Open Science Community

FinnGen provides genetic insights from a well-phenotyped isolated population

Mitja Kurki et al., Jan 18, 2023

Population isolates such as those in Finland benefit genetic research because deleterious alleles are often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%). These variants survived the founding bottleneck rather than being distributed over a large number of ultrarare variants. Although this effect is well established in Mendelian genetics, its value in common disease genetics is less explored [1,2]. FinnGen aims to study the genome and national health register data of 500,000 Finnish individuals. Given the relatively high median age of participants (63 years) and the substantial fraction of hospital-based recruitment, FinnGen is enriched for disease end points. Here we analyse data from 224,737 participants from FinnGen and study 15 diseases that have previously been investigated in large genome-wide association studies (GWASs). We also include meta-analyses of biobank data from Estonia and the United Kingdom. We identified 30 new associations, primarily low-frequency variants, enriched in the Finnish population. A GWAS of 1,932 diseases also identified 2,733 genome-wide significant associations (893 phenome-wide significant (PWS), P < 2.6 × 10⁻¹¹) at 2,496 (771 PWS) independent loci with 807 (247 PWS) end points. Among these, fine-mapping implicated 148 (73 PWS) coding variants associated with 83 (42 PWS) end points. Moreover, 91 (47 PWS) had an allele frequency of <5% in non-Finnish European individuals, of which 62 (32 PWS) were enriched by more than twofold in Finland. These findings demonstrate the power of bottlenecked populations to find entry points into the biology of common diseases through low-frequency, high impact variants.
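
The thresholds quoted in this abstract can be checked with a little arithmetic. The sketch below assumes that the phenome-wide significance level is a Bonferroni correction of the conventional genome-wide level (5 × 10⁻⁸) across the 1,932 end points; the abstract gives only the resulting value, so that derivation is an assumption, and all function names are hypothetical helpers for illustration.

```python
# Illustrative helpers only; names and the Bonferroni derivation are assumptions.
GWS_THRESHOLD = 5e-8   # conventional genome-wide significance level
N_ENDPOINTS = 1932     # number of disease end points named in the abstract


def phenome_wide_threshold(gws=GWS_THRESHOLD, n_tests=N_ENDPOINTS):
    """Bonferroni-style threshold: genome-wide level divided by the number of end points."""
    return gws / n_tests


def is_low_frequency(maf):
    """Low-frequency range as defined in the abstract: 0.1% <= MAF < 5%."""
    return 0.001 <= maf < 0.05


def fold_enrichment(af_finnish, af_reference):
    """Ratio of the Finnish allele frequency to a reference (non-Finnish European) frequency."""
    return af_finnish / af_reference


if __name__ == "__main__":
    print(f"PWS threshold: {phenome_wide_threshold():.1e}")
    # Made-up frequencies for one variant, purely for illustration.
    print(is_low_frequency(0.02), fold_enrichment(0.02, 0.005) > 2)
```

Under that assumption the computed threshold rounds to the 2.6 × 10⁻¹¹ stated in the abstract.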

Genetics

Biology

Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies

Wei Zhou et al., Aug 8, 2018

In genome-wide association studies (GWAS) for thousands of phenotypes in large biobanks, most binary traits have substantially fewer cases than controls. Both of the widely used approaches, the linear mixed model and the recently proposed logistic mixed model, perform poorly; they produce large type I error rates when used to analyze unbalanced case-control phenotypes. Here we propose a scalable and accurate generalized mixed model association test that uses the saddlepoint approximation to calibrate the distribution of score test statistics. This method, SAIGE (Scalable and Accurate Implementation of GEneralized mixed model), provides accurate P values even when case-control ratios are extremely unbalanced. SAIGE uses state-of-the-art optimization strategies to reduce computational costs; hence, it is applicable to GWAS for thousands of phenotypes by large biobanks. Through the analysis of UK Biobank data of 408,961 samples from white British participants with European ancestry for >1,400 binary phenotypes, we show that SAIGE can efficiently analyze large sample data, controlling for unbalanced case-control ratios and sample relatedness. SAIGE is a generalized mixed model association test that can efficiently analyze large data sets while controlling for unbalanced case-control ratios and sample relatedness, as shown by applying it to the UK Biobank data for >1,400 binary phenotypes.
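
To illustrate why a saddlepoint approximation matters for unbalanced traits, here is a minimal sketch that is not SAIGE itself. It ignores the mixed model and relatedness entirely and assumes independent Bernoulli outcomes with known null means `mu`, a score statistic S = Σ gᵢ(Yᵢ − μᵢ), and the Barndorff-Nielsen tail formula. The function names, the bisection bounds and the toy data are all assumptions made for illustration.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def cgf(t, g, mu):
    """Cumulant generating function K(t) of S = sum g_i * (Y_i - mu_i)."""
    return sum(math.log(1.0 - m + m * math.exp(gi * t)) - t * gi * m
               for gi, m in zip(g, mu))


def cgf_d1(t, g, mu):
    """First derivative K'(t)."""
    total = 0.0
    for gi, m in zip(g, mu):
        e = m * math.exp(gi * t)
        total += gi * e / (1.0 - m + e) - gi * m
    return total


def cgf_d2(t, g, mu):
    """Second derivative K''(t); K''(0) is the null variance of S."""
    total = 0.0
    for gi, m in zip(g, mu):
        e = m * math.exp(gi * t)
        total += gi * gi * (1.0 - m) * e / (1.0 - m + e) ** 2
    return total


def spa_upper_tail(s, g, mu, lo=-20.0, hi=20.0, iters=200):
    """Approximate the upper-tail probability of S at s by the saddlepoint method."""
    # K'(t) is increasing in t, so bisection finds the saddlepoint K'(t) = s.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cgf_d1(mid, g, mu) < s:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    if abs(t) < 1e-8:
        # Near the mean the formula is numerically unstable; use the normal tail.
        return normal_sf(s / math.sqrt(cgf_d2(0.0, g, mu)))
    w = math.copysign(math.sqrt(2.0 * (t * s - cgf(t, g, mu))), t)
    v = t * math.sqrt(cgf_d2(t, g, mu))
    return normal_sf(w + math.log(v / w) / w)


# Toy data: 200 carriers, each with a 1% null probability of being a case.
g = [1.0] * 200
mu = [0.01] * 200
s = 6.0  # observed score: 8 cases among carriers against 2 expected
p_spa = spa_upper_tail(s, g, mu)
p_norm = normal_sf(s / math.sqrt(cgf_d2(0.0, g, mu)))
print(f"saddlepoint: {p_spa:.2e}, normal approximation: {p_norm:.2e}")
```

With a rare outcome the null distribution of the score is strongly right-skewed, so in this toy setting the normal approximation gives a much smaller P value than the saddlepoint calculation, which is the kind of miscalibration the abstract describes.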

Genetics

Rheumatology

Biobank-driven genomic discovery yields new insight into atrial fibrillation biology

Jonas Nielsen et al., Jul 26, 2018

To identify genetic variation underlying atrial fibrillation, the most common cardiac arrhythmia, we performed a genome-wide association study of >1,000,000 people, including 60,620 atrial fibrillation cases and 970,216 controls. We identified 142 independent risk variants at 111 loci and prioritized 151 functional candidate genes likely to be involved in atrial fibrillation. Many of the identified risk variants fall near genes where more deleterious mutations have been reported to cause serious heart defects in humans (GATA4, MYH6, NKX2-5, PITX2, TBX5) [1], or near genes important for striated muscle function and integrity (for example, CFL2, MYH7, PKP2, RBM20, SGCG, SSPN). Pathway and functional enrichment analyses also suggested that many of the putative atrial fibrillation genes act via cardiac structural remodeling, potentially in the form of an ‘atrial cardiomyopathy’ [2], either during fetal heart development or as a response to stress in the adult heart. Large-scale association analyses identify 142 independent risk variants for atrial fibrillation. Pathway and functional enrichment analyses suggest that many of the putative risk genes act via cardiac structural remodeling.

Genetics

Molecular Biology

A genome-wide association study with 1,126,563 individuals identifies new risk loci for Alzheimer’s disease

Douglas Wightman et al., Sep 1, 2021

Late-onset Alzheimer’s disease is a prevalent age-related polygenic disease that accounts for 50–70% of dementia cases. Currently, only a fraction of the genetic variants underlying Alzheimer’s disease have been identified. Here we show that increased sample sizes allowed identification of seven previously unidentified genetic loci contributing to Alzheimer’s disease. This study highlights microglia, immune cells and protein catabolism as relevant to late-onset Alzheimer’s disease, while identifying and prioritizing previously unidentified genes of potential interest. We anticipate that these results can be included in larger meta-analyses of Alzheimer’s disease to identify further genetic variants that contribute to Alzheimer’s pathology. A genome-wide association study performed in 1,126,563 individuals identifies seven new loci associated with Alzheimer’s disease and implicates microglia and immune cells in late-onset disease.

Genetics

Immunology

Comparative genetic architectures of schizophrenia in East Asian and European populations

Max Lam et al., Nov 18, 2019

Schizophrenia is a debilitating psychiatric disorder with approximately 1% lifetime risk globally. Large-scale schizophrenia genetic studies have reported primarily on European ancestry samples, potentially missing important biological insights. Here, we report the largest study to date of East Asian participants (22,778 schizophrenia cases and 35,362 controls), identifying 21 genome-wide-significant associations in 19 genetic loci. Common genetic variants that confer risk for schizophrenia have highly similar effects between East Asian and European ancestries (genetic correlation = 0.98 ± 0.03), indicating that the genetic basis of schizophrenia and its biology are broadly shared across populations. A fixed-effect meta-analysis including individuals from East Asian and European ancestries identified 208 significant associations in 176 genetic loci (53 novel). Trans-ancestry fine-mapping reduced the sets of candidate causal variants in 44 loci. Polygenic risk scores had reduced performance when transferred across ancestries, highlighting the importance of including sufficient samples of major ancestral groups to ensure their generalizability across populations.
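
For readers unfamiliar with fixed-effect meta-analysis, the following is a minimal sketch of the standard inverse-variance-weighted estimator for a single variant. The abstract does not state which weighting scheme the study used, so inverse-variance weighting is an assumption here, and the function name and per-cohort numbers are made up for illustration.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-cohort effect sizes with inverse-variance weights.

    Returns the pooled effect, its standard error, the z score and a
    two-sided P value from the normal distribution.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, z, p


# Made-up log odds ratios for one variant in two ancestry groups.
beta, se, z, p = fixed_effect_meta(betas=[0.06, 0.05], ses=[0.015, 0.010])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

The pooled standard error is smaller than either input standard error, which is why combining ancestries can reveal associations that neither cohort reaches alone, provided effects are shared across populations as the genetic correlation in the abstract suggests.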

Genetics

Biology

Deciphering osteoarthritis genetics across 826,690 individuals from 9 populations

Cindy Boer et al., Aug 26, 2021

Osteoarthritis affects over 300 million people worldwide. Here, we conduct a genome-wide association study meta-analysis across 826,690 individuals (177,517 with osteoarthritis) and identify 100 independently associated risk variants across 11 osteoarthritis phenotypes, 52 of which have not been associated with the disease before. We report thumb and spine osteoarthritis risk variants and identify differences in genetic effects between weight-bearing and non-weight-bearing joints. We identify sex-specific and early age-at-onset osteoarthritis risk loci. We integrate functional genomics data from primary patient tissues (including articular cartilage, subchondral bone, and osteophytic cartilage) and identify high-confidence effector genes. We provide evidence for genetic correlation with phenotypes related to pain, the main disease symptom, and identify likely causal genes linked to neuronal processes. Our results provide insights into key molecular players in disease processes and highlight attractive drug targets to accelerate translation.

Genetics

Pharmacology

Systematic evaluation of coding variation identifies a candidate causal variant in TM6SF2 influencing total cholesterol and myocardial infarction risk

Oddgeir Holmen et al., Mar 16, 2014

Genetics

Molecular Biology

Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease

Wei Zhou et al., Oct 1, 2022

Genetics

Molecular Biology

A second update on mapping the human genetic architecture of COVID-19

Masahiro Kanai et al., Sep 6, 2023

Investigating the role of host genetic factors in COVID-19 severity and susceptibility can inform our understanding of the underlying biological mechanisms that influence adverse outcomes and drug development [1,2]. Here we present a second updated genome-wide association study (GWAS) on COVID-19 severity and infection susceptibility to SARS-CoV-2 from the COVID-19 Host Genetic Initiative (data release 7). We performed a meta-analysis of up to 219,692 cases and over 3 million controls, identifying 51 distinct genome-wide significant loci, adding 28 loci since the previous data release [2]. The increased number of candidate genes at the identified loci helped to map three major biological pathways involved in susceptibility and severity: viral entry, airway defense in mucus, and type I interferon.

Genetics

Immunology

An efficient and accurate frailty model approach for genome-wide survival association analysis controlling for population structure and relatedness in large-scale biobanks

Rounak Dey et al., Nov 1, 2020

With decades of electronic health records linked to genetic data, large biobanks provide unprecedented opportunities for systematically understanding the genetics of the natural history of complex diseases. Genome-wide survival association analysis can identify genetic variants associated with ages of onset, disease progression and lifespan. We developed an efficient and accurate frailty (random effects) model approach for genome-wide survival association analysis of censored time-to-event (TTE) phenotypes in large biobanks by accounting for both population structure and relatedness. Our method utilizes state-of-the-art optimization strategies to reduce the computational cost. The saddlepoint approximation is used to allow for analysis of heavily censored phenotypes (>90%) and low-frequency variants (down to minor allele count 20). We demonstrated the performance of our method through extensive simulation studies and analysis of five TTE phenotypes, including lifespan, with heavy censoring rates (90.9% to 99.8%) on ~400,000 UK Biobank participants with white British ancestry and ~180,000 samples in FinnGen, respectively. We further performed genome-wide association analysis for 871 TTE phenotypes in UK Biobank and presented the genome-wide scale phenome-wide association (PheWAS) results with the PheWeb browser.

Genetics

Molecular Biology
