ResearchHub | Open Science Community

Robust relationship inference in genome-wide association studies

Ani Manichaikul et al., Oct 5, 2010

Motivation: Genome-wide association studies (GWASs) have been widely used to map loci contributing to variation in complex traits and risk of diseases in humans. Accurate specification of familial relationships is crucial for family-based GWAS, as well as in population-based GWAS with unknown (or unrecognized) family structure. The family structure in a GWAS should be routinely investigated using the SNP data prior to the analysis of population structure or phenotype. Existing algorithms for relationship inference have a major weakness of estimating allele frequencies at each SNP from the entire sample, under a strong assumption of homogeneous population structure. This assumption is often untenable.

Results: Here, we present a rapid algorithm for relationship inference using high-throughput genotype data typical of GWAS that allows the presence of unknown population substructure. The relationship of any pair of individuals can be precisely inferred by robust estimation of their kinship coefficient, independent of sample composition or population structure (sample invariance). We present simulation experiments to demonstrate that the algorithm has sufficient power to provide reliable inference on millions of unrelated pairs and thousands of relative pairs (up to 3rd-degree relationships). Application of our robust algorithm to HapMap and GWAS datasets demonstrates that it performs properly even under extreme population stratification, while algorithms assuming a homogeneous population give systematically biased results. Our extremely efficient implementation performs relationship inference on millions of pairs of individuals in a matter of minutes, dozens of times faster than the most efficient existing algorithm known to us.

Availability: Our robust relationship inference algorithm is implemented in a freely available software package, KING, available for download at http://people.virginia.edu/~wc9c/KING.

Contact: wmchen@virginia.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
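The abstract does not state the estimator itself. As a rough illustration of pairwise kinship estimation that avoids sample-wide allele frequencies, the sketch below assumes the form usually cited as the KING-robust estimator, which uses only counts of heterozygous and opposite-homozygous genotypes in the two individuals. The function names, the 0/1/2 genotype coding, and the degree cut-offs (powers of two around the expected kinship values) are assumptions made for illustration; this is not the KING implementation.

```python
def king_robust_kinship(gi, gj):
    """Estimate the kinship coefficient for one pair of individuals.

    gi, gj: equal-length lists of genotypes coded 0/1/2 (copies of one
    allele), with None for missing calls. Assumed form of the estimator:
        (N_AaAa - 2*N_AAaa) / (2*m) + 1/2 - (N_Aa_i + N_Aa_j) / (4*m)
    where m = min(N_Aa_i, N_Aa_j). No allele frequencies are needed.
    """
    het_i = het_j = het_both = opposite_hom = 0
    for a, b in zip(gi, gj):
        if a is None or b is None:
            continue  # use only SNPs genotyped in both individuals
        if a == 1:
            het_i += 1
        if b == 1:
            het_j += 1
        if a == 1 and b == 1:
            het_both += 1
        if {a, b} == {0, 2}:
            opposite_hom += 1  # AA in one individual, aa in the other
    m = min(het_i, het_j)
    if m == 0:
        raise ValueError("no heterozygous genotypes; estimate undefined")
    return (het_both - 2 * opposite_hom) / (2 * m) + 0.5 - (het_i + het_j) / (4 * m)


def classify_degree(phi):
    """Map a kinship estimate to a relationship class (assumed cut-offs)."""
    if phi > 0.354:
        return "duplicate/MZ twin"
    if phi > 0.177:
        return "1st-degree"
    if phi > 0.0884:
        return "2nd-degree"
    if phi > 0.0442:
        return "3rd-degree"
    return "unrelated"


if __name__ == "__main__":
    # Hypothetical toy genotypes; a real analysis uses many thousands of SNPs.
    person_a = [0, 1, 2, 1, 0, 1, 2, 1]
    person_b = [0, 1, 2, 1, 1, 1, 2, 0]
    phi = king_robust_kinship(person_a, person_b)
    print(phi, classify_degree(phi))
```

Because every quantity is computed from the two individuals alone, adding or removing other samples does not change the estimate for a given pair, which is the sample-invariance property the abstract describes.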

Genetics

Artificial Intelligence

Genome-Wide Association Scan Shows Genetic Variants in the FTO Gene Are Associated with Obesity-Related Traits

Angelo Tremblay et al., Jul 13, 2007

The obesity epidemic is responsible for a substantial economic burden in developed countries and is a major risk factor for type 2 diabetes and cardiovascular disease. The disease is the result not only of several environmental risk factors, but also of genetic predisposition. To take advantage of recent advances in gene-mapping technology, we executed a genome-wide association scan to identify genetic variants associated with obesity-related quantitative traits in the genetically isolated population of Sardinia. Initial analysis suggested that several SNPs in the FTO and PFKP genes were associated with increased BMI, hip circumference, and weight. Within the FTO gene, rs9930506 showed the strongest association with BMI (p = 8.6 × 10⁻⁷), hip circumference (p = 3.4 × 10⁻⁸), and weight (p = 9.1 × 10⁻⁷). In Sardinia, homozygotes for the rare “G” allele of this SNP (minor allele frequency = 0.46) were 1.3 BMI units heavier than homozygotes for the common “A” allele. Within the PFKP gene, rs6602024 showed very strong association with BMI (p = 4.9 × 10⁻⁶). Homozygotes for the rare “A” allele of this SNP (minor allele frequency = 0.12) were 1.8 BMI units heavier than homozygotes for the common “G” allele. To replicate our findings, we genotyped these two SNPs in the GenNet study. In European Americans (N = 1,496) and in Hispanic Americans (N = 839), we replicated significant association between rs9930506 in the FTO gene and BMI (p-value for meta-analysis of European American and Hispanic American follow-up samples, p = 0.001), weight (p = 0.001), and hip circumference (p = 0.0005). We did not replicate association between rs6602024 and obesity-related traits in the GenNet sample, although we found that in European Americans, Hispanic Americans, and African Americans, homozygotes for the rare “A” allele were, on average, 1.0–3.0 BMI units heavier than homozygotes for the more common “G” allele.
In summary, we have completed a whole genome–association scan for three obesity-related quantitative traits and report that common genetic variants in the FTO gene are associated with substantial changes in BMI, hip circumference, and body weight. These changes could have a significant impact on the risk of obesity-related morbidity in the general population.

Genetics

Endocrinology

Variants in MTNR1B influence fasting glucose levels

Inga Prokopenko et al., Dec 7, 2008

Gonçalo Abecasis and colleagues report associations with fasting plasma glucose levels in a collection of ten genome-wide association scans from the MAGIC consortium. They find variants in the gene encoding melatonin receptor 1B that are associated with fasting glucose levels and, in a meta-analysis of 13 case-control studies, also show association with increased risk of type 2 diabetes.

To identify previously unknown genetic loci associated with fasting glucose concentrations, we examined the leading association signals in ten genome-wide association scans involving a total of 36,610 individuals of European descent. Variants in the gene encoding melatonin receptor 1B (MTNR1B) were consistently associated with fasting glucose across all ten studies. The strongest signal was observed at rs10830963, where each G allele (frequency 0.30 in HapMap CEU) was associated with an increase of 0.07 (95% CI = 0.06–0.08) mmol/l in fasting glucose levels (P = 3.2 × 10⁻⁵⁰) and reduced beta-cell function as measured by homeostasis model assessment (HOMA-B, P = 1.1 × 10⁻¹⁵). The same allele was associated with an increased risk of type 2 diabetes (odds ratio = 1.09 (1.05–1.12), per G allele P = 3.3 × 10⁻⁷) in a meta-analysis of 13 case-control studies totaling 18,236 cases and 64,453 controls. Our analyses also confirm previous associations of fasting glucose with variants at the G6PC2 (rs560887, P = 1.1 × 10⁻⁵⁷) and GCK (rs4607517, P = 1.0 × 10⁻²⁵) loci.

Genetics

Internal Medicine

Fine mapping of type 1 diabetes susceptibility loci and evidence for colocalization of causal variants with lymphoid gene enhancers

Suna Önengüt-Gümüşcü et al., Mar 9, 2015

Stephen Rich and colleagues report the discovery and fine mapping of type 1 diabetes susceptibility loci using the Immunochip. They also perform comparative analyses with 15 other immune disorders and find evidence of colocalization of causal variants with lymphoid gene enhancers.

Genetic studies of type 1 diabetes (T1D) have identified 50 susceptibility regions [1,2], finding major pathways contributing to risk [3], with some loci shared across immune disorders [4,5,6]. To make genetic comparisons across autoimmune disorders as informative as possible, a dense genotyping array, the Immunochip, was developed, from which we identified four new T1D-associated regions (P < 5 × 10⁻⁸). A comparative analysis with 15 immune diseases showed that T1D is more similar genetically to other autoantibody-positive diseases, significantly most similar to juvenile idiopathic arthritis and significantly least similar to ulcerative colitis, and provided support for three additional new T1D risk loci. Using a Bayesian approach, we defined credible sets for the T1D-associated SNPs. The associated SNPs localized to enhancer sequences active in thymus, T and B cells, and CD34+ stem cells. Enhancer-promoter interactions can now be analyzed in these cell types to identify which particular genes and regulatory sequences are causal.
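The abstract says only that credible sets were defined with a Bayesian approach. One common way to do this, sketched below purely as an assumption, is to give every SNP in a region the same prior, convert per-SNP Bayes factors into posterior probabilities by normalizing them, and keep the highest-ranked SNPs until their cumulative posterior reaches a chosen coverage. The function name, the SNP labels, the Bayes factor values, and the 0.95 coverage are hypothetical and not taken from the paper.

```python
def credible_set(bayes_factors, coverage=0.95):
    """Return (SNPs in the credible set, posterior probability per SNP).

    bayes_factors: dict mapping SNP id -> Bayes factor for association.
    Assumes exactly one causal variant in the region and equal priors,
    so each posterior is the SNP's Bayes factor divided by the total.
    """
    total = sum(bayes_factors.values())
    posteriors = {snp: bf / total for snp, bf in bayes_factors.items()}
    # Rank SNPs from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, prob in ranked:
        chosen.append(snp)
        cumulative += prob
        if cumulative >= coverage:
            break  # smallest ranked set that reaches the target coverage
    return chosen, posteriors


if __name__ == "__main__":
    # Hypothetical Bayes factors for three SNPs in one region.
    region = {"snpA": 90.0, "snpB": 9.0, "snpC": 1.0}
    snps, post = credible_set(region, coverage=0.95)
    print(snps, post)
```

A small credible set points to a short list of candidate causal variants, which is what makes the overlap with enhancer annotations described in the abstract interpretable.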

Genetics

Immunology

Genome-wide association study shows BCL11A associated with persistent fetal hemoglobin and amelioration of the phenotype of β-thalassemia

Manuela Uda et al., Feb 2, 2008

β-Thalassemia and sickle cell disease both display a great deal of phenotypic heterogeneity, despite being generally thought of as simple Mendelian diseases. The reasons for this are not well understood, although the level of fetal hemoglobin (HbF) is one well characterized ameliorating factor in both of these conditions. To better understand the genetic basis of this heterogeneity, we carried out genome-wide scans with 362,129 common SNPs on 4,305 Sardinians to look for genetic linkage and association with HbF levels, as well as other red blood cell-related traits. Among major variants affecting HbF levels, SNP rs11886868 in the BCL11A gene was strongly associated with this trait (P < 10⁻³⁵). The C allele frequency was significantly higher in Sardinian individuals with elevated HbF levels, detected by screening for β-thalassemia, and patients with attenuated forms of β-thalassemia vs. those with thalassemia major. We also show that the same BCL11A variant is strongly associated with HbF levels in a large cohort of sickle cell patients. These results indicate that BCL11A variants, by modulating HbF levels, act as an important ameliorating factor of the β-thalassemia phenotype, and it is likely they could help ameliorate other hemoglobin disorders. We expect our findings will help to characterize the molecular mechanisms of fetal globin regulation and could eventually contribute to the development of new therapeutic approaches for β-thalassemia and sickle cell anemia.

Genetics

Immunology

Imputing Amino Acid Polymorphisms in Human Leukocyte Antigens

Xiaoming Jia et al., Jun 6, 2013

DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.

Genetics

Immunology

Heritability of Cardiovascular and Personality Traits in 6,148 Sardinians

Giuseppe Pilia et al., Aug 23, 2006

In family studies, phenotypic similarities between relatives yield information on the overall contribution of genes to trait variation. Large samples are important for these family studies, especially when comparing heritability between subgroups such as young and old, or males and females. We recruited a cohort of 6,148 participants, aged 14–102 y, from four clustered towns in Sardinia. The cohort includes 34,469 relative pairs. To extract genetic information, we implemented software for variance components heritability analysis, designed to handle large pedigrees, analyze multiple traits simultaneously, and model heterogeneity. Here, we report heritability analyses for 98 quantitative traits, focusing on facets of personality and cardiovascular function. We also summarize results of bivariate analyses for all pairs of traits and of heterogeneity analyses for each trait. We found a significant genetic component for every trait. On average, genetic effects explained 40% of the variance for 38 blood tests, 51% for five anthropometric measures, 25% for 20 measures of cardiovascular function, and 19% for 35 personality traits. Four traits showed significant evidence for an X-linked component. Bivariate analyses suggested overlapping genetic determinants for many traits, including multiple personality facets and several traits related to the metabolic syndrome; but we found no evidence for shared genetic determinants that might underlie the reported association of some personality traits and cardiovascular risk factors. Models allowing for heterogeneity suggested that, in this cohort, the genetic variance was typically larger in females and in younger individuals, but interesting exceptions were observed. For example, narrow heritability of blood pressure was approximately 26% in individuals more than 42 y old, but only approximately 8% in younger individuals. 
Despite the heterogeneity in effect sizes, the same loci appear to contribute to variance in young and old, and in males and females. In summary, we find significant evidence for heritability of many medically important traits, including cardiovascular function and personality. Evidence for heterogeneity by age and sex suggests that models allowing for these differences will be important in mapping quantitative traits.

Genetics

Internal Medicine

Family-Based Association Tests for Genomewide Association Scans

Wei-Min Chen et al., Oct 19, 2007

Genetics

Molecular Biology

Dense genotyping of immune-related disease regions identifies 14 new susceptibility loci for juvenile idiopathic arthritis

Anne Hinks et al., Apr 21, 2013

Anne Hinks and colleagues identify 14 new susceptibility loci for juvenile idiopathic arthritis through targeted analyses of genomic regions implicated in immune function. Their study implicates several pathways, including IL-2 signaling, in the pathogenesis of this common childhood autoimmune disease.

We used the Immunochip array to analyze 2,816 individuals with juvenile idiopathic arthritis (JIA), comprising the most common subtypes (oligoarticular and rheumatoid factor–negative polyarticular JIA), and 13,056 controls. We confirmed association of 3 known JIA risk loci (the human leukocyte antigen (HLA) region, PTPN22 and PTPN2) and identified 14 loci reaching genome-wide significance (P < 5 × 10⁻⁸) for the first time. Eleven additional new regions showed suggestive evidence of association with JIA (P < 1 × 10⁻⁶). Dense mapping of loci along with bioinformatics analysis refined the associations to one gene in each of eight regions, highlighting crucial pathways, including the interleukin (IL)-2 pathway, in JIA disease pathogenesis. The entire Immunochip content, the HLA region and the top 27 loci (P < 1 × 10⁻⁶) explain an estimated 18, 13 and 6% of the risk of JIA, respectively. In summary, this is the largest collection of JIA cases investigated so far and provides new insight into the genetic basis of this childhood autoimmune disease.

Genetics

Immunology

Avoiding dynastic, assortative mating, and population stratification biases in Mendelian randomization through within-family analyses

Ben Brumpton et al., Jul 14, 2020

Estimates from Mendelian randomization studies of unrelated individuals can be biased due to uncontrolled confounding from familial effects. Here we describe methods for within-family Mendelian randomization analyses and use simulation studies to show that family-based analyses can reduce such biases. We illustrate empirically how familial effects can affect estimates using data from 61,008 siblings from the Nord-Trøndelag Health Study and UK Biobank, and replicate our findings using 222,368 siblings from 23andMe. Both Mendelian randomization estimates using unrelated individuals and within-family methods reproduced established effects of lower BMI reducing risk of diabetes and high blood pressure. However, while Mendelian randomization estimates from samples of unrelated individuals suggested that taller height and lower BMI increase educational attainment, these effects were strongly attenuated in within-family Mendelian randomization analyses. Our findings indicate the necessity of controlling for population structure and familial effects in Mendelian randomization studies.
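The abstract does not say which within-family estimator is used. A minimal sketch of the general idea, assuming a family fixed-effects approach, is to subtract each family's mean from genotype, exposure, and outcome before computing a Wald ratio, so that anything shared by siblings (such as a family-level confounder) cancels out. The function names and the toy data, in which the family effect is set equal to the family's mean genotype, are hypothetical and chosen only to make the bias visible.

```python
from collections import defaultdict


def slope(x, y):
    """Least-squares slope of y on x (intercept included)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_family(values, families):
    """Subtract each family's mean, leaving only within-family deviations."""
    totals, counts = defaultdict(float), defaultdict(int)
    for value, fam in zip(values, families):
        totals[fam] += value
        counts[fam] += 1
    return [value - totals[fam] / counts[fam] for value, fam in zip(values, families)]


def wald_ratio(genotype, exposure, outcome):
    """Single-instrument MR estimate: (G -> outcome) / (G -> exposure)."""
    return slope(genotype, outcome) / slope(genotype, exposure)


def within_family_wald_ratio(genotype, exposure, outcome, families):
    """Wald ratio computed on family-demeaned data (family fixed effects)."""
    g = demean_by_family(genotype, families)
    x = demean_by_family(exposure, families)
    y = demean_by_family(outcome, families)
    return wald_ratio(g, x, y)


# Hypothetical toy data: four sibling pairs, true causal effect 0.3.
TRUE_EFFECT = 0.3
families = [1, 1, 2, 2, 3, 3, 4, 4]
genotype = [0, 1, 1, 2, 0, 2, 1, 1]
# Assumed family-level confounder, equal to the family's mean genotype.
family_effect = [0.5, 0.5, 1.5, 1.5, 1.0, 1.0, 1.0, 1.0]
exposure = [0.5 * g + u for g, u in zip(genotype, family_effect)]
outcome = [TRUE_EFFECT * x + u for x, u in zip(exposure, family_effect)]

if __name__ == "__main__":
    print("unrelated-style estimate:", wald_ratio(genotype, exposure, outcome))
    print("within-family estimate:",
          within_family_wald_ratio(genotype, exposure, outcome, families))
```

In this constructed example the population-level estimate is inflated because genotype is correlated with the family effect, while the within-family estimate recovers the assumed effect. Real data add noise, so the within-family estimate is unbiased under these assumptions but less precise.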

Genetics

Internal Medicine
