Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores

Authors
Bjarni Vilhjálmsson,Jian Yang
Hilary Finucane,Alexander Gusev,Sara Lindström,Stephan Ripke,Giulio Genovese,Po Loh,Gaurav Bhatia,Ron Do,Tristan Hayeck,Hong Won,Benjamin Neale,Aiden Corvin,James Walters,Kai Farh,Peter Holmans,Phil Lee,Brendan Bulik-Sullivan,David Collier,Hailiang Huang,Tune Pers,Ingrid Agartz,Esben Agerbo,Margot Albus,Madeline Alexander,Farooq Amin,Silviu‐Alin Bacanu,Martin Begemann,Richard Belliveau,Judit Bene,Sarah Bergen,Elizabeth Bevilacqua,Tim Bigdeli,Donald Black,Richard Bruggeman,Nancy Buccola,Randy Buckner,William Byerley,Wiepke Cahn,Guiqing Cai,Dominique Campion,Rita Cantor,Vaughan Carr,Noa Carrera,Stanley Catts,Kimberly Chambert,Raymond Chan,Ronald Chen,Eric Chen,Wei Cheng,Eric Cheung,Siow Chong,C. Cloninger,David Cohen,Nadine Cohen,Paul Cormican,Nick Craddock,James Crowley,David Curtis,Michael Davidson,Kenneth Davis,Franziska Degenhardt,Jurgen Favero,Lynn DeLisi,Ditte Demontis,Dimitris Dikeos,Timothy Dinan,Srdjan Djurovic,Gary Donohoe,Elodie Drapeau,Jubao Duan,Frank Dudbridge,Naser Durmishi,Peter Eichhammer,Johan Eriksson,Valentina Escott‐Price,Laurent Essioux,Ayman Fanous,Martilias Farrell,Josef Frank,Lude Franke,Robert Freedman,Nelson Freimer,Marion Friedl,Joseph Friedman,Menachem Fromer,Lyudmila Georgieva,Elliot Gershon,Ina Giegling,Paola Giusti-Rodríguez,Stephanie Godard,Jacqueline Goldstein,V. Golimbet,Srihari Gopal,Jacob Gratten,Jakob Grove,Lieuwe Haan,Christian Hammer,Marian Hamshere,Jingmei Li,Po‐Ru Loh,Hong‐Hee Won,Sekar Kathiresan,Michele Pato,Carlos Pato,Rulla Tamimi,Eli Stahl,Noah Zaitlen,Bogdan Paşaniuc,Gillian Belbin,Eimear Kenny,Mikkel Schierup,Philip Jager,Nikolaos Patsopoulos,Steven McCarroll,Aarno Palotie,Shaun Purcell,Daniel Chasman,Michael Goddard,Peter Visscher,Peter Kraft,Hon‐Cheong So,Alkes Price,Kai‐How Farh,Brendan Bulik‐Sullivan,Elvira Bramon,M. Ikram,Carrie Bearden,Jurgen Del-Favero
+128 authors
,Robert McCarley
Published
Oct 1, 2015

Abstract

Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
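
The contrast drawn in the abstract, between pruning followed by thresholding and a posterior mean that models LD, can be illustrated with a toy sketch. It is not the authors' implementation. It assumes the special case of a Gaussian (infinitesimal) prior, for which the posterior mean has the closed form (M / (N h²) I + D)⁻¹ β̃, where β̃ are the marginal effect estimates, D is the LD (correlation) matrix, M the number of markers, N the training sample size, and h² the heritability. The abstract does not state this formula, and the full method's prior is more general. All function names, thresholds, and numbers below are hypothetical and chosen only for illustration.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is left untouched.
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in reversed(range(n)):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def posterior_mean_infinitesimal(marginal_betas, ld_matrix, n_samples, heritability):
    """Posterior mean effects under an assumed Gaussian prior.

    Solves (M / (N * h2) * I + D) beta = marginal_betas, with M taken to be
    the number of markers passed in (a simplification for this toy example).
    """
    m = len(marginal_betas)
    shrink = m / (n_samples * heritability)
    system = [
        [ld_matrix[i][j] + (shrink if i == j else 0.0) for j in range(m)]
        for i in range(m)
    ]
    return solve_linear_system(system, marginal_betas)


def prune_and_threshold(marginal_betas, p_values, ld_matrix, r2_max=0.2, p_max=0.05):
    """Simplified pruning + thresholding: keep the most significant markers
    that pass p_max and are not in LD (r^2 > r2_max) with a kept marker."""
    kept = []
    for i in sorted(range(len(p_values)), key=lambda idx: p_values[idx]):
        if p_values[i] > p_max:
            continue
        if all(ld_matrix[i][j] ** 2 <= r2_max for j in kept):
            kept.append(i)
    return [b if i in kept else 0.0 for i, b in enumerate(marginal_betas)]


# Made-up example: two markers with correlation 0.9; only the first is causal,
# so the second picks up a marginal effect of roughly 0.9 * 0.1 through LD.
ld = [[1.0, 0.9], [0.9, 1.0]]
betas = [0.10, 0.09]
p_vals = [1e-8, 1e-6]

print(prune_and_threshold(betas, p_vals, ld))
print(posterior_mean_infinitesimal(betas, ld, n_samples=10_000, heritability=0.01))
```

In this toy case, pruning discards the second marker outright, while the LD-aware posterior mean keeps both markers and assigns most of the signal to the first. With many markers and a realistic reference panel, the LD matrix would be handled in local windows rather than solved as one dense system.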
