ResearchHub | Open Science Community

Within-sibship GWAS improve estimates of direct genetic effects

Laurence Howe et al., Mar 7, 2021

Estimates from genome-wide association studies (GWAS) represent a combination of the effect of inherited genetic variation (direct effects), demography (population stratification, assortative mating) and genetic nurture from relatives (indirect genetic effects). GWAS using family-based designs can control for demography and indirect genetic effects, but large-scale family datasets have been lacking. We combined data on 159,701 siblings from 17 cohorts to generate population (between-family) and within-sibship (within-family) estimates of genome-wide genetic associations for 25 phenotypes. We demonstrate that existing GWAS associations for height, educational attainment, smoking, depressive symptoms, age at first birth and cognitive ability overestimate direct effects. We show that estimates of SNP-heritability, genetic correlations and Mendelian randomization involving these phenotypes substantially differ when calculated using within-sibship estimates. For example, genetic correlations between educational attainment and height largely disappear. In contrast, analyses of most clinical phenotypes (e.g. LDL-cholesterol) were generally consistent between population and within-sibship models. We also report compelling evidence of polygenic adaptation on taller human height using within-sibship data. Large-scale family datasets provide new opportunities to quantify direct effects of genetic variation on human traits and diseases.
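The contrast this abstract draws between population and within-sibship estimates can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the toy genotypes and phenotypes are invented, the function names are hypothetical, and centring on the sibship mean is one common way to obtain a within-family slope, not necessarily the model used in the study. The toy phenotype is built as a direct effect of 1.0 per allele plus a family-level shift that happens to track the family's average genotype.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Least-squares slope of y on x (model with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def within_sibship_slope(family, genotype, phenotype):
    """Slope after centring genotype and phenotype on each sibship's mean.

    Centring removes anything shared by all siblings in a family, so only
    genotype differences between siblings contribute to the estimate.
    """
    groups = defaultdict(list)
    for fam, g, p in zip(family, genotype, phenotype):
        groups[fam].append((g, p))

    g_dev, p_dev = [], []
    for members in groups.values():
        mean_g = sum(g for g, _ in members) / len(members)
        mean_p = sum(p for _, p in members) / len(members)
        for g, p in members:
            g_dev.append(g - mean_g)
            p_dev.append(p - mean_p)

    return sum(a * b for a, b in zip(g_dev, p_dev)) / sum(a * a for a in g_dev)


# Invented toy data: three sibling pairs.
# phenotype = 1.0 * genotype + family shift (0 for F1, 2 for F2, 1 for F3)
family = ["F1", "F1", "F2", "F2", "F3", "F3"]
genotype = [0, 1, 1, 2, 0, 2]
phenotype = [0, 1, 3, 4, 1, 3]

population_slope = ols_slope(genotype, phenotype)
sibship_slope = within_sibship_slope(family, genotype, phenotype)

print(population_slope)  # 1.5: inflated by the family-level shift
print(sibship_slope)     # 1.0: matches the direct effect built into the toy data
```

In this toy setting the pooled estimate absorbs the family-level shift, while the sibship-centred estimate recovers the per-allele effect that was built into the data.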

Genetics

Demography

A Saturated Map of Common Genetic Variants Associated with Human Height from 5.4 Million Individuals of Diverse Ancestries

Loïc Yengo et al., Jan 10, 2022

Common SNPs are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here we show, using GWAS data from 5.4 million individuals of diverse ancestries, that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a median size of ~90 kb, covering ~21% of the genome. The density of independent associations varies across the genome and the regions of elevated density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs account for 40% of phenotypic variance in European ancestry populations but only ~10%-20% in other ancestries. Effect sizes, associated regions, and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely explained by linkage disequilibrium and allele frequency differences within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than needed to implicate causal genes and variants. Overall, this study, the largest GWAS to date, provides an unprecedented saturated map of specific genomic regions containing the vast majority of common height-associated variants.

Genetics

Biology

Biological and clinical insights from genetics of insomnia symptoms

Jacqueline Lane et al., Feb 2, 2018

Insomnia is a common disorder linked with adverse long-term medical and psychiatric outcomes, but underlying pathophysiological processes and causal relationships with disease are poorly understood. Here we identify 57 loci for self-reported insomnia symptoms in the UK Biobank (n=453,379) and confirm their impact on self-reported insomnia symptoms in the HUNT study (n=14,923 cases, 47,610 controls), physician diagnosed insomnia in Partners Biobank (n=2,217 cases, 14,240 controls), and accelerometer-derived measures of sleep efficiency and sleep duration in the UK Biobank (n=83,726). Our results suggest enrichment of genes involved in ubiquitin-mediated proteolysis, phototransduction and muscle development pathways and of genes expressed in multiple brain regions, skeletal muscle and adrenal gland. Evidence of shared genetic factors is found between frequent insomnia symptoms and restless legs syndrome, aging, cardio-metabolic, behavioral, psychiatric and reproductive traits. Evidence is found for a possible causal link between insomnia symptoms and coronary heart disease, depressive symptoms and subjective well-being. One Sentence Summary: We identify 57 genomic regions associated with insomnia, pointing to the involvement of phototransduction and ubiquitination and potential causal links to CAD and depression.

Genetics

Internal Medicine

Rare coding variants in 35 genes associate with circulating lipid levels – a multi-ancestry analysis of 170,000 exomes

George Hindy et al., Dec 23, 2020

Large-scale gene sequencing studies for complex traits have the potential to identify causal genes with therapeutic implications. We performed gene-based association testing of blood lipid levels with rare (minor allele frequency < 1%) predicted damaging coding variation using sequence data from >170,000 individuals from multiple ancestries: 97,493 European, 30,025 South Asian, 16,507 African, 16,440 Hispanic/Latino, 10,420 East Asian, and 1,182 Samoan. We identified 35 genes associated with circulating lipid levels. Ten of these (ALB, SRSF2, JAK2, CREB3L3, TMEM136, VARS, NR1H3, PLA2G12A, PPARG and STAB1) have not been implicated for lipid levels using rare coding variation in population-based samples. We prioritize 32 genes identified in array-based genome-wide association study (GWAS) loci based on gene-based associations, of which three (EVI5, SH2B3 and PLIN1) had no prior evidence of rare coding variant associations. Most of the associated genes showed evidence of association in multiple ancestries. Also, we observed an enrichment of gene-based associations for low-density lipoprotein cholesterol drug target genes, and for genes closest to GWAS index single nucleotide polymorphisms (SNPs). Our results demonstrate that gene-based associations can be beneficial for drug target development and provide evidence that the gene closest to the array-based GWAS index SNP is often the functional gene for blood lipid levels.

Genetics

Surgery

Identification of ACE2 modifiers by CRISPR screening

Emily Sherman et al., Jun 10, 2021

SARS-CoV-2 infection is initiated by binding of the viral spike protein to its receptor, ACE2, on the surface of host cells. ACE2 expression is heterogeneous both in vivo and in immortalized cell lines, but the molecular pathways that govern ACE2 expression remain unclear. We now report high-throughput CRISPR screens for functional modifiers of ACE2 surface abundance. We identified 35 genes whose disruption was associated with a change in the surface abundance of ACE2 in HuH7 cells. Enriched among these ACE2 regulators were established transcription factors, epigenetic regulators, and functional networks. We further characterized individual cell lines with disruption of SMAD4, EP300, PIAS1, or BAMBI and found these genes to regulate ACE2 at the mRNA level and to influence cellular susceptibility to SARS-CoV-2 infection. Collectively, our findings clarify the host factors involved in SARS-CoV-2 entry and suggest potential targets for therapeutic development.

Genetics

Molecular Biology

A framework for detecting noncoding rare variant associations of large-scale whole-genome sequencing studies

Zilin Li et al., Nov 8, 2021

Large-scale whole-genome sequencing studies have enabled analysis of the associations of noncoding rare variants (RVs) with complex human traits. Variant set analysis is a powerful approach to study RV association, and a key component of it is constructing RV sets for analysis. However, existing methods have limited ability to define analysis units in the noncoding genome. Furthermore, there is a lack of robust pipelines for comprehensive and scalable noncoding RV association analysis. Here we propose a computationally efficient noncoding RV association-detection framework that uses STAAR (variant-set test for association using annotation information) to group noncoding variants in gene-centric analysis based on functional categories. We also propose SCANG (scan the genome)-STAAR, which uses dynamic window sizes and incorporates multiple functional annotations, in a non-gene-centric analysis. We furthermore develop STAARpipeline to perform flexible noncoding RV association analysis, including gene-centric analysis as well as fixed-window-based and dynamic-window-based non-gene-centric analysis. We apply STAARpipeline to identify noncoding RV sets associated with four quantitative lipid traits in 21,015 discovery samples from the Trans-Omics for Precision Medicine (TOPMed) program and replicate several noncoding RV associations in an additional 9,123 TOPMed samples.

Genetics

Molecular Biology

A multi-layer functional genomic analysis to understand noncoding genetic variation in lipids

Shweta Ramdas et al., Dec 8, 2021

A major challenge of genome-wide association studies (GWAS) is to translate phenotypic associations into biological insights. Here, we integrate a large GWAS on blood lipids involving 1.6 million individuals from five ancestries with a wide array of functional genomic datasets to discover regulatory mechanisms underlying lipid associations. We first prioritize lipid-associated genes with expression quantitative trait locus (eQTL) colocalizations, and then add chromatin interaction data to narrow the search for functional genes. Polygenic enrichment analysis across 697 annotations from a host of tissues and cell types confirms the central role of the liver in lipid levels, and highlights the selective enrichment of adipose-specific chromatin marks in high-density lipoprotein cholesterol and triglycerides. Overlapping transcription factor (TF) binding sites with lipid-associated loci identifies TFs relevant in lipid biology. In addition, we present an integrative framework to prioritize causal variants at GWAS loci, producing a comprehensive list of candidate causal genes and variants with multiple layers of functional evidence. Two prioritized genes, CREBRF and RRBP1, show convergent evidence across functional datasets supporting their roles in lipid biology.

Genetics

Molecular Biology

Sex-specific and pleiotropic effects underlying kidney function identified from GWAS meta-analysis

Sarah Graham et al., Sep 19, 2018

Chronic kidney disease (CKD) is a growing health burden currently affecting 10-15% of adults worldwide. Estimated glomerular filtration rate (eGFR) as a marker of kidney function is commonly used to diagnose CKD. Previous genome-wide association study (GWAS) meta-analyses of CKD and eGFR or related phenotypes have identified a number of variants associated with kidney function, but these only explain a fraction of the variability in kidney phenotypes attributed to genetic components. To extend these studies, we analyzed data from the Nord-Trøndelag Health Study (HUNT), which is more densely imputed than previous studies, and performed a GWAS meta-analysis of eGFR with publicly available summary statistics, more than doubling the sample size of previous meta-analyses. We identified 147 loci (53 novel loci) associated with eGFR, including genes involved in transcriptional regulation, kidney development, cellular signaling, metabolism, and solute transport. Moreover, genes at these loci show enriched expression in urogenital tissues and highlight gene sets known to play a role in kidney function. In addition, sex-stratified analysis identified three regions (prioritized genes: PPM1J, MCL1, and SLC47A1) with more significant effects in women than men. Using genetic risk scores constructed from these eGFR meta-analysis results, we show that associated variants are generally predictive of CKD but improve detection only modestly compared with other known clinical risk factors. Collectively, these results yield additional insight into the genetic factors underlying kidney function and progression to CKD.

Genetics

Internal Medicine

Within-family studies for Mendelian randomization: avoiding dynastic, assortative mating, and population stratification biases

Ben Brumpton et al., Apr 9, 2019

Mendelian randomization (MR) is a widely used method for causal inference using genetic data. Mendelian randomization studies of unrelated individuals may be susceptible to bias from family structure, for example, through dynastic effects, which occur when parental genotypes directly affect offspring phenotypes. Here we describe methods for within-family Mendelian randomization and through simulations show that family-based methods can overcome bias due to dynastic effects. We illustrate these issues empirically using data from 61,008 siblings from the UK Biobank and Nord-Trøndelag Health Study. Both within-family and population-based Mendelian randomization analyses reproduced established effects of lower BMI reducing risk of diabetes and high blood pressure. However, while MR estimates from population-based samples of unrelated individuals suggested that taller height and lower BMI increase educational attainment, these effects largely disappeared in within-family MR analyses. We found differences between population-based and within-family based estimates, indicating the importance of controlling for family effects and population structure in Mendelian randomization studies.

Genetics

Demography

Genome-scale CRISPR screening for modifiers of cellular LDL uptake

Brian Emmer et al., Jul 1, 2020

Hypercholesterolemia is a causal and modifiable risk factor for atherosclerotic cardiovascular disease. A critical pathway regulating cholesterol homeostasis involves the receptor-mediated endocytosis of low-density lipoproteins into hepatocytes, mediated by the LDL receptor. We applied genome-scale CRISPR screening to query the genetic determinants of cellular LDL uptake in HuH7 cells cultured under either lipoprotein-rich or lipoprotein-starved conditions. Candidate LDL uptake regulators were validated through the synthesis and secondary screening of a customized library of gRNA at greater depth of coverage. This secondary screen yielded significantly improved performance relative to the primary genome-wide screen, with better discrimination of internal positive controls, no identification of negative controls, and improved concordance between screen hits at both the gene and gRNA level. We then applied our customized gRNA library to orthogonal screens that tested for the specificity of each candidate regulator for LDL versus transferrin endocytosis, the presence or absence of genetic epistasis with LDLR deletion, the impact of each perturbation on LDLR expression and trafficking, and the generalizability of LDL uptake modifiers across multiple cell types. These findings identified several previously unrecognized genes with putative roles in LDL uptake and suggest mechanisms for their functional interaction with LDLR.

Genetics

Immunology
