ResearchHub | Open Science Community

A High-Coverage Genome Sequence from an Archaic Denisovan Individual

Matthias Meyer et al., Sep 1, 2012

Ancient Genomics. The Denisovans were archaic humans closely related to Neandertals, whose populations overlapped with the ancestors of modern-day humans. Using a single-stranded library preparation method, Meyer et al. (p. 222, published online 30 August) provide a detailed analysis of a high-quality Denisovan genome. The genomic sequence provides evidence for very low rates of heterozygosity in the Denisovan individual, probably not because of recent inbreeding, but instead because of a small population size. The genome sequence also illuminates the relationships between humans and archaics, including Neandertals, and establishes a catalog of genetic changes within the human lineage.

Genetics

Paleontology

Genes mirror geography within Europe

John Novembre et al., Aug 31, 2008

The power of the latest massively parallel synthetic DNA sequencing technologies is demonstrated in two major collaborations that shed light on the nature of genomic variation with ethnicity. The first describes the genomic characterization of an individual from the Yoruba ethnic group of west Africa. The second reports a personal genome of a Han Chinese, the group comprising 30% of the world's population. These new resources can now be used in conjunction with the Venter, Watson and NIH reference sequences. A separate study looked at genetic ethnicity on the continental scale, based on data from 1,387 individuals from more than 30 European countries. Overall there was little genetic variation between countries, but the differences that do exist correspond closely to the geographic map. Statistical analysis of the genome data places 50% of the individuals within 310 km of their reported origin. As well as its relevance for testing genetic ancestry, this work has implications for evaluating genome-wide association studies that link genes with diseases.

Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations (refs. 1-5). Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing (ref. 6); an individual's DNA can be used to infer their geographic origin with surprising accuracy, often to within a few hundred kilometres.
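
As a rough illustration of how a two-dimensional summary of genetic variation can be computed, the Python sketch below runs a small principal component analysis on a toy genotype matrix. Everything in it is an assumption made for illustration: the function names, the toy genotypes, the 0/1/2 allele-count coding, the per-SNP scaling, and the use of power iteration. It is not the pipeline used in the study.

```python
import math
import random


def standardize(genotypes):
    """Center each SNP at 2p and scale by sqrt(2p(1-p)); drop monomorphic SNPs.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2).
    """
    n = len(genotypes)
    columns = []
    for col in zip(*genotypes):
        p = sum(col) / (2 * n)  # sample allele frequency
        if 0 < p < 1:
            sd = math.sqrt(2 * p * (1 - p))
            columns.append([(g - 2 * p) / sd for g in col])
    return [list(row) for row in zip(*columns)]


def top_components(genotypes, k=2, iters=500, seed=0):
    """Return k unit-length vectors giving each individual's position on PC1..PCk."""
    x = standardize(genotypes)
    n, m = len(x), len(x[0])
    # Individual-by-individual covariance matrix (n x n).
    cov = [[sum(a * b for a, b in zip(x[i], x[j])) / m for j in range(n)]
           for i in range(n)]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(iters):  # power iteration towards the leading eigenvector
            w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0:
                break
            v = [c / norm for c in w]
        eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(n))
                         for i in range(n))
        components.append(v)
        # Deflate so the next pass finds the next component.
        for i in range(n):
            for j in range(n):
                cov[i][j] -= eigenvalue * v[i] * v[j]
    return components


# Hypothetical toy data: individuals 0-2 and 3-5 come from two made-up groups.
toy = [
    [0, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 2],
    [2, 2, 1, 0],
    [2, 1, 2, 1],
    [2, 2, 2, 2],
]
pc1, pc2 = top_components(toy, k=2)
for individual, (a, b) in enumerate(zip(pc1, pc2)):
    print(individual, round(a, 3), round(b, 3))
```

In this toy case the sign of the first component separates the two made-up groups; the sign itself is arbitrary, as with any eigenvector.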

Genetics

Demography

Genome-Wide Survey of SNP Variation Uncovers the Genetic Structure of Cattle Breeds

Richard Gibbs et al., Apr 23, 2009

A survey of genetic diversity of cattle suggests two domestication events in Asia and selection by husbandry.

Genetics

Ecology

The Genetic Ancestry of African Americans, Latinos, and European Americans across the United States

Katarzyna Bryc et al., Dec 20, 2014

Over the past 500 years, North America has been the site of ongoing mixing of Native Americans, European settlers, and Africans (brought largely by the trans-Atlantic slave trade), shaping the early history of what became the United States. We studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and show that the legacy of these historical interactions is visible in the genetic ancestry of present-day Americans. We document pervasive mixed ancestry and asymmetrical male and female ancestry contributions in all groups studied. We show that regional ancestry differences reflect historical events, such as early Spanish colonization, waves of immigration from many regions of Europe, and forced relocation of Native Americans within the US. This study sheds light on the fine-scale differences in ancestry within and across the United States and informs our understanding of the relationship between racial and ethnic identities and genetic ancestry.

Genetics

History

Genome-wide patterns of population structure and admixture in West Africans and African Americans

Katarzyna Bryc et al., Dec 22, 2009

Quantifying patterns of population structure in Africans and African Americans illuminates the history of human populations and is critical for undertaking medical genomic studies on a global scale. To obtain a fine-scale genome-wide perspective of ancestry, we analyze Affymetrix GeneChip 500K genotype data from African Americans (n = 365) and individuals with ancestry from West Africa (n = 203 from 12 populations) and Europe (n = 400 from 42 countries). We find that population structure within the West African sample reflects primarily language and secondarily geographical distance, echoing the Bantu expansion. Among African Americans, analysis of genomic admixture by a principal component-based approach indicates that the median proportion of European ancestry is 18.5% (25th–75th percentiles: 11.6–27.7%), with very large variation among individuals. In the African-American sample as a whole, few autosomal regions showed exceptionally high or low mean African ancestry, but the X chromosome showed elevated levels of African ancestry, consistent with a sex-biased pattern of gene flow with an excess of European male and African female ancestry. We also find that genomic profiles of individual African Americans afford personalized ancestry reconstructions differentiating ancient vs. recent European and African ancestry. Finally, patterns of genetic similarity among inferred African segments of African-American genomes and genomes of contemporary African populations included in this study suggest African ancestry is most similar to non-Bantu Niger-Kordofanian-speaking populations, consistent with historical documents of the African Diaspora and trans-Atlantic slave trade.

Genetics

Philosophy

Genome-wide patterns of population structure and admixture among Hispanic/Latino populations

Katarzyna Bryc et al., May 5, 2010

Hispanic/Latino populations possess a complex genetic structure that reflects recent admixture among and potentially ancient substructure within Native American, European, and West African source populations. Here, we quantify genome-wide patterns of SNP and haplotype variation among 100 individuals with ancestry from Ecuador, Colombia, Puerto Rico, and the Dominican Republic genotyped on the Illumina 610-Quad arrays and 112 Mexicans genotyped on the Affymetrix 500K platform. Intersecting these data with previously collected high-density SNP data from 4,305 individuals, we use principal component analysis and the clustering methods FRAPPE and STRUCTURE to investigate genome-wide patterns of African, European, and Native American population structure within and among Hispanic/Latino populations. Comparing autosomal, X and Y chromosome, and mtDNA variation, we find evidence of a significant sex bias in admixture proportions consistent with disproportionate contribution of European male and Native American female ancestry to present-day populations. We also find that patterns of linkage disequilibrium (LD) in admixed Hispanic/Latino populations are largely affected by the admixture dynamics of the populations, with faster decay of LD in populations of higher African ancestry. Finally, using the locus-specific ancestry inference method LAMP, we reconstruct fine-scale chromosomal patterns of admixture. We document moderate power to differentiate among potential subcontinental source populations within the Native American, European, and African segments of the admixed Hispanic/Latino genomes. Our results suggest future genome-wide association scans in Hispanic/Latino populations may require correction for local genomic ancestry at a subcontinental scale when associating differences in the genome with disease risk, progression, and drug efficacy, as well as for admixture mapping.

Genetics

Molecular Biology

The Population Reference Sample, POPRES: A Resource for Population, Disease, and Pharmacological Genetics Research

Matthew Nelson et al., Sep 1, 2008

Technological and scientific advances, stemming in large part from the Human Genome and HapMap projects, have made large-scale, genome-wide investigations feasible and cost effective. These advances have the potential to dramatically impact drug discovery and development by identifying genetic factors that contribute to variation in disease risk as well as drug pharmacokinetics, treatment efficacy, and adverse drug reactions. In spite of the technological advancements, successful application in biomedical research would be limited without access to suitable sample collections. To facilitate exploratory genetics research, we have assembled a DNA resource from a large number of subjects participating in multiple studies throughout the world. This growing resource was initially genotyped with a commercially available genome-wide 500,000 single-nucleotide polymorphism panel. This project includes nearly 6,000 subjects of African-American, East Asian, South Asian, Mexican, and European origin. Seven informative axes of variation identified via principal-component analysis (PCA) of these data confirm the overall integrity of the data and highlight important features of the genetic structure of diverse populations. The potential value of such extensively genotyped collections is illustrated by selection of genetically matched population controls in a genome-wide analysis of abacavir-associated hypersensitivity reaction. We find that matching based on country of origin, identity-by-state distance, and multidimensional PCA do similarly well to control the type I error rate. The genotype and demographic data from this reference sample are freely available through the NCBI database of Genotypes and Phenotypes (dbGaP).
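
To make the idea of matching controls by identity-by-state (IBS) distance more concrete, here is a minimal Python sketch. The abstract does not define the distance or the matching procedure, so both are assumptions: the distance used here is the mean per-SNP difference in allele counts divided by two, the matching is a simple greedy nearest-neighbour pass, and all names and genotypes are hypothetical.

```python
def ibs_distance(g1, g2):
    """Mean allele-sharing distance between two genotype vectors.

    Genotypes are allele counts (0, 1 or 2); None marks a missing call.
    Returns 0.0 for identical genotypes and 1.0 for opposite homozygotes
    at every compared SNP.
    """
    diffs = [abs(a - b) / 2 for a, b in zip(g1, g2)
             if a is not None and b is not None]
    if not diffs:
        raise ValueError("no SNPs with calls in both individuals")
    return sum(diffs) / len(diffs)


def match_controls(cases, controls, per_case=1):
    """Greedily assign each case its nearest unused controls by IBS distance.

    cases, controls: dicts mapping sample ID -> genotype vector.
    Returns a dict mapping case ID -> list of chosen control IDs.
    """
    available = dict(controls)
    matches = {}
    for case_id, genotype in cases.items():
        ranked = sorted(available,
                        key=lambda c: ibs_distance(genotype, available[c]))
        chosen = ranked[:per_case]
        matches[case_id] = chosen
        for control_id in chosen:  # each control is used at most once
            del available[control_id]
    return matches


# Hypothetical toy samples.
cases = {"case1": [0, 0, 1, 2]}
controls = {"ctrl1": [2, 2, 1, 0], "ctrl2": [0, 0, 1, 1]}
print(match_controls(cases, controls))
```

A greedy pass depends on the order in which cases are visited; an optimal assignment would need a matching algorithm, which is left out to keep the sketch short.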

Genetics

Molecular Biology

Fast and robust identity-by-descent inference with the templated positional Burrows-Wheeler transform

William Freyman et al., Sep 15, 2020

Estimating the genomic location and length of identical-by-descent (IBD) segments among individuals is a crucial step in many genetic analyses. However, the exponential growth in the size of biobank and direct-to-consumer (DTC) genetic data sets makes accurate IBD inference a significant computational challenge. Here we present the templated positional Burrows-Wheeler transform (TPBWT) to make fast IBD estimates robust to genotype and phasing errors. Using haplotype data simulated over pedigrees with realistic genotyping and phasing errors, we show that the TPBWT outperforms other state-of-the-art IBD inference algorithms in terms of speed and accuracy. For each phase-aware method, we explore the false positive and false negative rates of inferring IBD by segment length and characterize the types of error commonly found. Our results highlight the fragility of most phased IBD inference methods; the accuracy of IBD estimates can be highly sensitive to the quality of haplotype phasing. Additionally, we compare the performance of the TPBWT against a widely used phase-free IBD inference approach that is robust to phasing errors. We introduce both in-sample and out-of-sample TPBWT-based IBD inference algorithms and demonstrate their computational efficiency on massive-scale datasets with millions of samples. Furthermore, we describe the binary file format for TPBWT-compressed haplotypes that results in fast and efficient out-of-sample IBD computes against very large cohort panels. Finally, we demonstrate the utility of the TPBWT in a brief empirical analysis exploring geographic patterns of haplotype sharing within Mexico. Hierarchical clustering of IBD shared across regions within Mexico reveals geographically structured haplotype sharing and a strong signal of isolation by distance. Our software implementation of the TPBWT is freely available for non-commercial use in the code repository https://github.com/23andMe/phasedibd.
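
For readers unfamiliar with the underlying data structure, the sketch below shows the plain positional Burrows-Wheeler transform (PBWT) as described by Durbin (2014), not the templated variant presented in this paper. It builds the positional prefix and divergence arrays for a small panel of binary haplotypes and uses them to list pairs that match exactly over at least a given number of trailing sites. It has no tolerance for genotype or phasing errors, which is the gap the TPBWT is said to address, and it does not use the phasedibd API; the function names and toy haplotypes are hypothetical.

```python
from itertools import combinations


def pbwt_final_arrays(haps):
    """Return (order, div) after sweeping all sites of a binary haplotype panel.

    order: haplotype indices sorted by their reversed prefixes.
    div[i]: first site of the longest match, ending at the last site,
            between order[i] and order[i - 1] (equals n_sites if none).
    """
    n_haps, n_sites = len(haps), len(haps[0])
    order = list(range(n_haps))
    div = [0] * n_haps
    for k in range(n_sites):
        zeros, ones, div_zeros, div_ones = [], [], [], []
        p = q = k + 1  # match starts for the next 0-allele / 1-allele entry
        for i in range(n_haps):
            idx = order[i]
            p = max(p, div[i])
            q = max(q, div[i])
            if haps[idx][k] == 0:
                zeros.append(idx)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(idx)
                div_ones.append(q)
                q = 0
        order = zeros + ones
        div = div_zeros + div_ones
    return order, div


def pairs_matching_at_end(haps, min_len):
    """Pairs of haplotypes identical over at least min_len trailing sites."""
    n_sites = len(haps[0])
    order, div = pbwt_final_arrays(haps)
    pairs = set()
    block = [order[0]]
    for i in range(1, len(order)):
        if div[i] > n_sites - min_len:
            # Match with the previous haplotype is too short: close the block.
            pairs.update(tuple(sorted(p)) for p in combinations(block, 2))
            block = []
        block.append(order[i])
    pairs.update(tuple(sorted(p)) for p in combinations(block, 2))
    return pairs


# Hypothetical toy panel: haplotypes 0 and 3 are identical, and haplotype 1
# differs from them only at the first site.
panel = [
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
]
print(sorted(pairs_matching_at_end(panel, min_len=3)))
```

Because haplotypes that share a long suffix end up adjacent in the sorted order, long matches can be read off from runs of small divergence values without comparing every pair directly.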

Genetics

Artificial Intelligence

The genetic ancestry of African, Latino, and European Americans across the United States.

Katarzyna Bryc et al., Sep 18, 2014

Over the past 500 years, North America has been the site of ongoing mixing of Native Americans, European settlers, and Africans brought largely by the Trans-Atlantic slave trade, shaping the early history of what became the United States. We studied the genetic ancestry of 5,269 self-described African Americans, 8,663 Latinos, and 148,789 European Americans who are 23andMe customers and show that the legacy of these historical interactions is visible in the genetic ancestry of present-day Americans. We document pervasive mixed ancestry and asymmetrical male and female ancestry contributions in all groups studied. We show that regional ancestry differences reflect historical events, such as early Spanish colonization, waves of immigration from many regions of Europe, and forced relocation of Native Americans within the US. This study sheds light on the fine-scale differences in ancestry within and across the United States, and informs our understanding of the relationship between racial and ethnic identities and genetic ancestry.

Genetics

History

Genetic neurodevelopmental clustering and dyslexia

Austeja Ciulkinyte et al., Jul 15, 2024

Dyslexia is a learning difficulty with neurodevelopmental origins, manifesting as reduced accuracy and speed in reading and spelling. It is substantially heritable and frequently co-occurs with other neurodevelopmental conditions, particularly attention deficit-hyperactivity disorder (ADHD). Here, we investigate the genetic structure underlying dyslexia and a range of psychiatric traits using results from genome-wide association studies of dyslexia, ADHD, autism, anorexia nervosa, anxiety, bipolar disorder, major depressive disorder, obsessive compulsive disorder, schizophrenia, and Tourette syndrome. Genomic Structural Equation Modelling (GenomicSEM) showed heightened support for a model consisting of five correlated latent genomic factors described as: F1) compulsive disorders (including obsessive-compulsive disorder, anorexia nervosa, Tourette syndrome), F2) psychotic disorders (including bipolar disorder, schizophrenia), F3) internalising disorders (including anxiety disorder, major depressive disorder), F4) neurodevelopmental traits (including autism, ADHD), and F5) attention and learning difficulties (including ADHD, dyslexia). ADHD loaded more strongly on the attention and learning difficulties latent factor (F5) than on the neurodevelopmental traits latent factor (F4). The attention and learning difficulties latent factor (F5) was positively correlated with internalising disorders (.40), neurodevelopmental traits (.25) and psychotic disorders (.17) latent factors, and negatively correlated with the compulsive disorders (–.16) latent factor. These factor correlations are mirrored in genetic correlations observed between the attention and learning difficulties latent factor and other cognitive, psychological and wellbeing traits. We further investigated genetic variants underlying both dyslexia and ADHD, which implicated 49 loci (40 not previously found in GWAS of the individual traits) mapping to 174 genes (121 not found in GWAS of individual traits) as potential pleiotropic variants. Our study confirms the increased genetic relation between dyslexia and ADHD versus other psychiatric traits and uncovers novel pleiotropic variants affecting both traits. In future, analyses including additional co-occurring traits such as dyscalculia and dyspraxia will allow a clearer definition of the attention and learning difficulties latent factor, yielding further insights into factor structure and pleiotropic effects.

Genetics

Law
