Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel

Authors
Olivier Delaneau,Jonathan Marchini
Gil McVean,Peter Donnelly,Gerton Lunter,Simon Myers,Anjali Hinch,Zamin Iqbal,Iain Mathieson,Andy Rimmer,Dionysia Xifara,Angeliki Kerasidou,Claire Churchhouse,David Altshuler,Stacey Gabriel,Eric Lander,Namrata Gupta,Mark Daly,Mark DePristo,Eric Banks,Gaurav Bhatia,Mauricio Carneiro,Guillermo Angel,Giulio Genovese,Robert Handsaker,Chris Hart,Steven McCarroll,James Nemesh,Ryan Poplin,S. Schaffner,Khalid Shakir,Pardis Sabeti,Sharon Grossman,Shervin Tabrizi,Ridhi Tariya,Heng Li,David Reich,Richard Durbin,Matthew Hurles,Senduran Balasubramaniam,John Burton,Petr Danecek,Thomas Keane,Anja Kolb-Kokocinski,Shane McCarthy,James Stalker,Michael Quail,Qasim Ayub,Yuan Chen,Alison Coffey,Vincenza Colonna,Ni Huang,Luke Jostins,Aylwyn Scally,Klaudia Walter,Yali Xue,Goo Jun,Ben Blackburne,Sarah Lindsay,Zemin Ning,Adam Frankish,Jennifer Harrow,Chris Tyler‐Smith,Gonalo Abecasis,Hyun Kang,Paul Anderson,Tom Blackwell,Fabio Busonero,Christian Fuchsberger,Andrea Maschio,Eleonora Porcu,Carlo Sidore,Adrian Tan,Mary Trost,David Bentley,Russell Grocock,Sean Humphray,Terena James,Zoya Kingsbury,Markus Bauer,R. 
Cheetham,Tony Cox,Michael Eberle,Lisa Murray,Richard Shaw,Aravinda Chakravarti,Andrew Clark,Alon Keinan,Juan Rodriguez‐Flores,Francisco Vega,Jeremiah Degenhardt,Evan Eichler,Paul Flicek,Laura Clarke,Rasko Leinonen,Richard Smith,Xiangqun Zheng-Bradley,Kathryn Beal,Fiona Cunningham,Javier Herrero,William McLaren,Graham Ritchie,Jonathan Barker,Gavin Kelman,Eugene Kulesha,Rajesh Radhakrishnan,Asier Roa,Dmitriy Smirnov,Ian Streeter,Iliana Toneva,Richard Gibbs,Huyen Dinh,Christie Kovar,Charles Lee,Lora Lewis,Donna Muzny,Jeff Reid,Min Wang,Fuli Yu,Matthew Bainbridge,Danny Challis,Uday Evani,James Lu,Uma Nagaswamy,Aniko Sabo,Yi Wang,Jin Yu,Gerald Fowler,Walker Hale,Dipak Kalra,Eric Green,Bartha Knoppers,Jan Korbel,Tobias Rausch,Adrian Sttz,Lauren Griffin,Chih-Heng Hsieh,Ryan Mills,Marcin Grotthuss,Chengsheng Zhang,Xinghua Shi,Hans Lehrach,Ralf Sudbrak,Vyacheslav Amstislavskiy,Matthias Lienhard,Florian Mertes,Marc Sultan,Bernd Timmermann,Marie‐Laure Yaspo,Sudbrak Sudbrak,Ralf Herwig,Elaine Mardis,Richard Wilson,Lucinda Fulton,Robert Fulton,George Weinstock,Asif Chinwalla,Jun Li,David Dooling,Daniel Koboldt,Michael McLellan,John Wallis,Michael Wendl,Qunyuan Zhang,Gábor Marth,Erik Garrison,Deniz Kural,Wan-Ping Lee,Wen Leong,Alistair Ward,Jiantao Wu,Mengyao Zhang,Deborah Nickerson,Can Alkan,Fereydoun Hormozdiari,Arthur Ko,Peter Sudmant,Jeanette Schmidt,Christopher Davies,Jeremy Gollub,Teresa Webster,Brant Wong,Yiping Zhan,Stephen Sherry,Chunlin Xiao,Deanna Church,Victor Ananiev,Zinaida Belaia,Dimitriy Beloslyudtsev,Nathan Bouk,Chao Chen,Robert Cohen,Charles Cook,John Garner,Timothy Hefferon,Mikhail Kimelman,Chunlei Liu,John Lopez,Peter Meric,Yuri Ostapchuk,Lon Phan,Sergiy Ponomarov,Valérie Schneider,Eugene Shekhtman,Karl Sirotkin,Douglas Slotta,Hua Zhang,Wei Wang,Xiaodong Fang,Xiaosen Guo,Min Jian,Hui Jiang,Xin Jin,Guoqing Li,Jingxiang Li,Yingrui Li,Xiao Liu,Yao Lu,Xuedi Ma,Shuaishuai Tai,Tang Meifang,Bo Wang,Guangbiao Wang,Honglong Wu,Renhua Wu,Ye Yin,Wenwei Zhang,Jiao 
Zhao,Meiru Zhao,Xiaole Zheng,Lachlan Coin,Lin Fang,Qibin Li,Zhenyu Li,Haoxiang Lin,Binghang Liu,Ruibang Luo,Haojing Shao,Bingqiang Wang,Yinlong Xie,Chen Ye,Chang Yu,Hancheng Zheng,Hongmei Zhu,Hongyu Cai,Hongzhi Cao,Yeyang Su,Zhongming Tian,Huanming Yang,Ling Yang,Jiayong Zhu,Zhiming Cai,Marcus Albrecht,Tatiana Borodina,Adam Auton,Seungtai Yoon,Jayon Lihm,Vladimir Makarov,Han‐Jun Jin,Wook Kim,Ki Kim,Srikanth Gottipati,D. Jones,D.N. Cooper,Edward Ball,Peter Stenson,Bret Barnes,Scott Kahn,Kai Ye,Mark Batzer,Miriam Konkel,Jerilyn Walker,Daniel MacArthur,Monkol Lek,Mark Shriver,Carlos Bustamante,Simon Gravel,Eimear Kenny,Jeffrey Kidd,Phil Lacroute,Brian Maples,Andrés Moreno‐Estrada,Fouad Zakharia,Brenna Henn,Karla Sandoval,Jake Byrnes,Eran Halperin,Yael Baran,David Craig,Alexis Christoforides,Tyler Izatt,Ahmet Kurdoglu,Shripad Sinari,Nils Homer,Kevin Squire,Jonathan Sebat,Vineet Bafna,Kenny Ye,Esteban Burchard,Ryan Hernandez,Christopher Gignoux,David Haussler,Sol Katzman,Wm. Kent,Bryan Howie,Andrés Ruiz‐Linares,Emmanouil Dermitzakis,Tuuli Lappalainen,Scott Devine,Xinyue Liu,Ankit Maroo,Luke Tallon,Jeffrey Rosenfeld,Leslie Michelson,Andrea Angius,Francesco Cucca,Serena Sanna,Abigail Bigham,Chris Jones,Fred Reinier,Yun Li,Robert Lyons,David Schlessinger,Philip Awadalla,Alan Hodgkinson,Tarás Oleksyk,Juan Martínez‐Cruzado,Yun‐Xin Fu,Xiaoming Liu,Momiao Xiong,Lynn Jorde,David Witherspoon,Jinchuan Xing,Brian Browning,Iman Hajirasouliha,Ken Chen,Cornelis Albers,Mark Gerstein,Alexej Abyzov,Jieming Chen,Yao Fu,Lukas Habegger,Arif Harmanci,Xinmeng Mu,Cristina Sisu,Suganthi Balasubramanian,Ekta Khurana,Declan Clarke,Jacob Michaelson,Chris O’Sullivan,Kathleen Barnes,Neda Gharani,Lorraine Toji,Norman Gerry,Jane Kaye,Alastair Kent,Rasika Mathias,Pilar Ossorio,Michael Parker,Charles Rotimi,Charmaine Royal,Sarah Tishkoff,Marc Vía,Walter Bodmer,Gabriel Bedoya,Gao Yang,Chu You,Andrés García‐Montero,Alberto Órfão,Julie Dutil,Adam Felsenfeld,Jean McEwen,Nicholas Clemm,Mark Guyer,Jane 
Peterson,Audrey Duncanson,Michael Dunn,Leena Peltonenz,David Green
Published
Jun 13, 2014

Abstract

A major use of the 1000 Genomes Project (1000 GP) data is genotype imputation in genome-wide association studies (GWAS). Here we develop a method to estimate haplotypes from low-coverage sequencing data that can take advantage of single-nucleotide polymorphism (SNP) microarray genotypes on the same samples. First, the SNP array data are phased to build a backbone (or 'scaffold') of haplotypes across each chromosome. We then phase the sequence data 'onto' this haplotype scaffold. This approach can take advantage of relatedness between sequenced and non-sequenced samples to improve accuracy. We use this method to create a new 1000 GP haplotype reference set for use by the human genetics community. Using a set of validation genotypes at SNPs and bi-allelic indels, we show that these haplotypes have lower genotype discordance and improved imputation performance into downstream GWAS samples, especially at low-frequency variants.
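
The abstract reports results in terms of genotype discordance against validation genotypes. As a minimal sketch of that kind of metric, the Python below computes the fraction of compared genotypes that disagree. It is not the authors' evaluation code: the function name `genotype_discordance`, the 0/1/2 alternate-allele-count coding, the use of `None` for missing calls, and the toy data are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def genotype_discordance(
    called: Sequence[Optional[int]],
    truth: Sequence[Optional[int]],
) -> float:
    """Fraction of compared genotypes where the call differs from validation.

    Genotypes are coded as alternate-allele counts (0, 1, 2). This coding is
    an illustrative assumption. Pairs where either value is None (missing)
    are excluded from the comparison.
    """
    if len(called) != len(truth):
        raise ValueError("called and truth must have the same length")

    compared = 0
    mismatched = 0
    for c, t in zip(called, truth):
        if c is None or t is None:
            continue  # skip sites without a genotype in both sets
        compared += 1
        if c != t:
            mismatched += 1

    if compared == 0:
        raise ValueError("no overlapping non-missing genotypes")
    return mismatched / compared


# Toy data (hypothetical): one sample's genotypes at eight variant sites.
called = [0, 1, 2, 1, 0, None, 1, 0]
truth = [0, 1, 1, 1, 0, 2, 0, 0]
rate = genotype_discordance(called, truth)
print(f"discordance = {rate:.4f}")
```

In this toy example seven sites are compared and two disagree. A real evaluation would typically stratify the same calculation by allele frequency, since the abstract highlights low-frequency variants.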
