ResearchHub | Open Science Community

Identification of 23 new prostate cancer susceptibility loci using the iCOGS custom genotyping array

Rosalind Eeles et al., Mar 27, 2013

Rosalind Eeles and colleagues report meta-analysis of genome-wide association studies for prostate cancer and genotyping on the custom iCOGS array in 25,074 cases and 24,272 controls from 32 studies available in the PRACTICAL Consortium. They identify 23 new prostate cancer susceptibility loci, 20 of which are associated with both aggressive and non-aggressive disease. Prostate cancer is the most frequently diagnosed cancer in males in developed countries. To identify common prostate cancer susceptibility alleles, we genotyped 211,155 SNPs on a custom Illumina array (iCOGS) in blood DNA from 25,074 prostate cancer cases and 24,272 controls from the international PRACTICAL Consortium. Twenty-three new prostate cancer susceptibility loci were identified at genome-wide significance (P < 5 × 10⁻⁸). More than 70 prostate cancer susceptibility loci, explaining ~30% of the familial risk for this disease, have now been identified. On the basis of combined risks conferred by the new and previously known risk loci, the top 1% of the risk distribution has a 4.7-fold higher risk than the average of the population being profiled. These results will facilitate population risk stratification for clinical studies.

Genetics

Oncology

Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies

Gleb Kichaev et al., Oct 30, 2014

Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulated data. We validate our findings using a large-scale meta-analysis of four blood lipid traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data.
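The "90% confidence set" in this abstract can be illustrated with a minimal sketch. It is not the authors' framework: it assumes per-variant posterior probabilities are already available and that a locus has a single causal variant, which is one of the simplifying assumptions the abstract says the full model relaxes. The function name and the variant IDs are hypothetical.

```python
def credible_set(posteriors, level=0.9):
    """Return the smallest top-ranked set of variants whose normalized
    posterior probabilities sum to at least `level`.

    posteriors: dict mapping variant ID -> posterior probability of causality
    (assumed here to describe a single-causal-variant locus).
    """
    total = sum(posteriors.values())
    # Rank variants from most to least likely causal.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= level:
            break
    return chosen


# Hypothetical locus with four variants.
example = {"rsA": 0.60, "rsB": 0.25, "rsC": 0.10, "rsD": 0.05}
print(credible_set(example, level=0.9))
```

Better-calibrated posteriors, for example ones informed by functional annotations, concentrate probability on fewer variants, which is why the set sizes reported in the abstract shrink.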

Genetics

Artificial Intelligence

Breast Cancer Risk From Modifiable and Nonmodifiable Risk Factors Among White Women in the United States

Paige Maas et al., May 26, 2016

Genome-wide association analysis of venous thromboembolism identifies new risk loci and genetic overlap with arterial vascular disease

Derek Klarin et al., Nov 1, 2019

Genetics

Internal Medicine

Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores

Bjarni Vilhjálmsson et al., Mar 2, 2015

Polygenic risk scores have shown great promise in predicting complex disease risk, and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves LD-pruning markers and applying a P-value threshold to association statistics, but this discards information and may reduce predictive accuracy. We introduce a new method, LDpred, which infers the posterior mean causal effect size of each marker using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the pruning/thresholding approach, particularly at large sample sizes. Accordingly, prediction R² increased from 20.1% to 25.3% in a large schizophrenia data set and from 9.8% to 12.0% in a large multiple sclerosis data set. A similar relative improvement in accuracy was observed for three additional large disease data sets and when predicting in non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
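The pruning/thresholding baseline that the abstract compares against can be sketched roughly as follows. This is an illustration of the baseline only, not of LDpred itself. It assumes P-value-informed pruning (often called clumping), and the cutoffs, marker IDs, effect sizes and r² values are all hypothetical.

```python
def prune_and_threshold(markers, r2, p_max=0.05, r2_max=0.2):
    """Select markers for a pruning + thresholding risk score.

    markers: dict mapping marker ID -> (effect size, P-value)
    r2:      dict mapping frozenset({id_a, id_b}) -> pairwise LD r^2
             (pairs that are absent are treated as unlinked)
    """
    kept = []
    # Visit markers from most to least significant.
    for marker in sorted(markers, key=lambda m: markers[m][1]):
        if markers[marker][1] > p_max:
            break  # every remaining marker fails the P-value threshold
        # Keep the marker only if it is not in strong LD with one already kept.
        if all(r2.get(frozenset((marker, k)), 0.0) <= r2_max for k in kept):
            kept.append(marker)
    return kept


def risk_score(dosages, markers, kept):
    """Sum effect size times allele dosage (0, 1 or 2) over the kept markers."""
    return sum(markers[m][0] * dosages[m] for m in kept)


# Hypothetical summary statistics and LD.
markers = {"rs1": (0.30, 1e-9), "rs2": (0.25, 1e-6),
           "rs3": (0.10, 0.01), "rs4": (0.05, 0.40)}
r2 = {frozenset(("rs1", "rs2")): 0.8}
kept = prune_and_threshold(markers, r2)
print(kept, risk_score({"rs1": 2, "rs2": 1, "rs3": 1, "rs4": 0}, markers, kept))
```

In this toy example rs2 is dropped because of its LD with rs1, and rs4 is dropped by the threshold. That is the kind of discarded information the abstract refers to; LDpred instead reweights all markers using a prior and a reference-panel LD matrix.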

Genetics

Artificial Intelligence

Ancestry-specific maps of GRCh38 linkage disequilibrium blocks for human genome research

James MacDonald et al., Mar 7, 2022

A map of approximately independent linkage disequilibrium (LD) blocks has many uses in statistical genetics. Current publicly available LD block maps are based on sparse recombination maps and are only available for GRCh37 (hg19) and prior genome assemblies. We generated LD blocks in GRCh38 coordinates for African (AFR), East Asian (EAS), European (EUR) and South Asian (SAS) ancestry populations. These new maps consist of 1,143 (EAS) to 1,604 (AFR) independent LD blocks across the 22 autosomal chromosomes and can be accessed at https://github.com/jmacdon/LDblocks_GRCh38.

Genetics

Molecular Biology

StocSum: stochastic summary statistics for whole genome sequencing studies

Nannan Wang et al., Apr 6, 2023

Genomic summary statistics, usually defined as single-variant test results from genome-wide association studies, have been widely used to advance the genetics field in a wide range of applications. Applications that involve multiple genetic variants also require their correlations or linkage disequilibrium (LD) information, often obtained from an external reference panel. In practice, it is usually difficult to find suitable external reference panels that represent the LD structure for underrepresented and admixed populations, or rare genetic variants from whole genome sequencing (WGS) studies, limiting the scope of applications for genomic summary statistics. Here we introduce StocSum, a novel reference-panel-free statistical framework for generating, managing, and analyzing stochastic summary statistics using random vectors. We develop various downstream applications using StocSum including single-variant tests, conditional association tests, gene-environment interaction tests, variant set tests, as well as meta-analysis and LD score regression tools. We demonstrate the accuracy and computational efficiency of StocSum using two cohorts from the Trans-Omics for Precision Medicine Program. StocSum will facilitate sharing and utilization of genomic summary statistics from WGS studies, especially for underrepresented and admixed populations.

Genetics

Molecular Biology

A comprehensive survey of genetic variation in 20,691 subjects from four large cohorts

Sara Lindström et al., Oct 25, 2016

The Nurses' Health Study (NHS), Nurses' Health Study II (NHSII), Health Professionals Follow-up Study (HPFS) and the Physicians' Health Study (PHS) have collected detailed longitudinal data on multiple exposures and traits for approximately 310,000 study participants over the last 35 years. Over 160,000 study participants across the cohorts have donated a DNA sample and to date, 20,691 subjects have been genotyped as part of genome-wide association studies (GWAS) of twelve primary outcomes. However, these studies utilized six different GWAS arrays, making it difficult to conduct analyses of secondary phenotypes or share controls across studies. To allow for secondary analyses of these data, we have created three new datasets merged by platform family and performed imputation using a common reference panel, the 1000 Genomes Phase I release. Here, we describe the methodology behind the data merging and imputation and present imputation quality statistics and association results from two GWAS of secondary phenotypes (body mass index (BMI) and venous thromboembolism (VTE)). We observed the strongest BMI association for the FTO SNP rs55872725 (β = 0.45, p = 3.48 × 10⁻²²), and using a significance level of p = 0.05, we replicated 19 out of 32 known BMI SNPs. For VTE, we observed the strongest association for the rs2040445 SNP (OR = 2.17, 95% CI: 1.79-2.63, p = 2.70 × 10⁻¹⁵), located downstream of F5, and also observed significant associations for the known ABO and F11 regions. This pooled resource can be used to maximize power in GWAS of phenotypes collected across the cohorts and for studying gene-environment interactions as well as rare phenotypes and genotypes.
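As a worked check on the VTE result, the reported odds ratio and confidence interval can be converted back to an approximate Wald z-score and two-sided P-value. This is only a sketch: it assumes the interval is symmetric on the log-odds scale and was built with the usual normal critical value, and the function name is hypothetical. Because the published numbers are rounded, the recovered P-value should agree with the reported one only to within rounding, on the order of 10⁻¹⁵.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z-score and two-sided P-value from an
    odds ratio and its 95% confidence interval (normal approximation)."""
    log_or = math.log(odds_ratio)
    # The CI spans log(OR) +/- z_crit * SE on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values quoted in the abstract for rs2040445.
se, z, p = wald_from_ci(2.17, 1.79, 2.63)
print(f"SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
```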

Genetics

Epidemiology

NON-ADDITIVE EFFECTS OF COMMON GENETIC VARIANTS HAVE A NEGLIGENT CONTRIBUTION TO CANCER HERITABILITY

Austin Suger et al., Jul 17, 2024

Background: The contribution of dominance effects to cancer heritability is unknown. We leveraged existing genome-wide association data for seven cancers to estimate the contribution of dominance effects to the heritability of individual cancer types. Methods: We estimated the proportion of phenotypic variation due to dominance genetic effects using genome-wide association data for seven cancers (breast, colorectal, lung, melanoma, non-melanoma skin, ovarian, and prostate) in a total of 166,772 cases and 284,824 controls. Results: We observed no evidence of a meaningful contribution of dominance effects to cancer heritability. In contrast, additive effects ranged between 0.11 and 0.34. Conclusions: In line with studies of other human traits, dominance effects of common genetic variants play a minimal role in cancer etiology. Impact: These results support the assumption of an additive inheritance model when conducting cancer association studies with common genetic variants.

Genetics

Biology

Partitioning heritability by functional category using GWAS summary statistics

Hilary Finucane et al., Jan 23, 2015

Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here, we analyze a broad set of functional elements, including cell-type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits spanning a total of 1.3 million phenotype measurements. To enable this analysis, we introduce a new method for partitioning heritability from GWAS summary statistics while controlling for linked markers. This new method is computationally tractable at very large sample sizes, and leverages genome-wide information. Our results include a large enrichment of heritability in conserved regions across many traits; a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers; and many cell-type-specific enrichments including significant enrichment of central nervous system cell types in body mass index, age at menarche, educational attainment, and smoking behavior. These results demonstrate that GWAS can aid in understanding the biological basis of disease and provide direction for functional follow-up.

Genetics

Molecular Biology
