Three genetic loci for lung cancer risk have been identified by genome-wide association studies (GWAS), but inherited susceptibility to specific histologic types of lung cancer is not well established. We conducted a GWAS of lung cancer and its major histologic types, genotyping 515,922 single-nucleotide polymorphisms (SNPs) in 5739 lung cancer cases and 5848 controls from one population-based case-control study and three cohort studies. Results were combined with summary data from ten additional studies, for a total of 13,300 cases and 19,666 controls of European descent. Four studies also provided histology data for replication, resulting in 3333 adenocarcinomas (AD), 2589 squamous cell carcinomas (SQ), and 1418 small cell carcinomas (SC). In analyses by histology, rs2736100 (TERT), on chromosome 5p15.33, was associated with risk of adenocarcinoma (odds ratio [OR] = 1.23, 95% confidence interval [CI] = 1.13–1.33, p = 3.02 × 10⁻⁷), but not with other histologic types (OR = 1.01, p = 0.84 and OR = 1.00, p = 0.93 for SQ and SC, respectively). This finding was confirmed in each replication study and overall meta-analysis (OR = 1.24, 95% CI = 1.17–1.31, p = 3.74 × 10⁻¹⁴ for AD; OR = 0.99, p = 0.69 and OR = 0.97, p = 0.48 for SQ and SC, respectively). Other previously reported association signals on 15q25 and 6p21 were also refined, but no additional loci reached genome-wide significance. In conclusion, a lung cancer GWAS identified a distinct hereditary contribution to adenocarcinoma.

Recently, three genome-wide association studies (GWAS) of lung cancer and subsequent pooled GWAS analyses identified inherited susceptibility variants on chromosome 15q25,1Hung R.J. McKay J.D. Gaborieau V. Boffetta P. Hashibe M. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. Rudnai P. et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008; 452: 633-637, 2Amos C.I. Wu X. Broderick P. Gorlov I.P. Gu J. Eisen T. Dong Q. Zhang Q. Gu X. Vijayakrishnan J. et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet. 2008; 40: 616-622, 3Thorgeirsson T.E. Geller F. Sulem P. Rafnar T. Wiste A. Magnusson K.P. Manolescu A. Thorleifsson G. Stefansson H. Ingason A.
et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008; 452: 638-642 5p15,4McKay J.D. Hung R.J. Gaborieau V. Boffetta P. Chabrier A. Byrnes G. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. et al. Lung cancer susceptibility locus at 5p15.33. Nat. Genet. 2008; 40: 1404-1406, 5Wang Y. Broderick P. Webb E. Wu X. Vijayakrishnan J. Matakidou A. Qureshi M. Dong Q. Gu X. Chen W.V. et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat. Genet. 2008; 40: 1407-1409, 6Rafnar T. Sulem P. Stacey S.N. Geller F. Gudmundsson J. Sigurdsson A. Jakobsdottir M. Helgadottir H. Thorlacius S. Aben K.K. et al. Sequence variants at the TERT-CLPTM1L locus associate with many cancer types. Nat. Genet. 2009; 41: 221-227 and 6p21.5Wang Y. Broderick P. Webb E. Wu X. Vijayakrishnan J. Matakidou A. Qureshi M. Dong Q. Gu X. Chen W.V. et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat. Genet. 2008; 40: 1407-1409

Lung cancer is classified into two main histologic groups: small cell lung cancer (SC) and non-small cell lung cancer; the latter includes adenocarcinoma (AD) and squamous cell carcinoma (SQ), along with rarer subtypes. Worldwide, adenocarcinoma is the most frequently identified histologic type, and the relative proportion of lung cancer due to this histology has steadily risen. Demographic, etiologic, clinical, and molecular characteristics of the lung cancer subtypes have been reported.7Gabrielson E. Worldwide trends in lung cancer pathology. Respirology. 2006; 11: 533-538 Although family history of lung cancer has been associated with histologic subtypes,8Gao Y. Goldstein A.M. Consonni D. Pesatori A.C. Wacholder S. Tucker M.A. Caporaso N.E. Goldin L. Landi M.T. Family history of cancer and nonmalignant lung diseases as risk factors for lung cancer. Int. J. Cancer. 2009; 125: 146-152, 9Li X. Hemminki K. Inherited predisposition to early onset lung cancer according to histological type. Int. J. Cancer. 2004; 112: 451-457, 10Ambrosone C.B. Rao U. Michalek A.M. Cummings K.M. Mettlin C.J. Lung cancer histologic types and family history of cancer. Analysis of histologic subtypes of 872 patients with primary lung cancer. Cancer. 1993; 72: 1192-1198, 11Sellers T.A. Elston R.C. Atwood L.D. Rothschild H. Lung cancer histologic type and family history of cancer. Cancer. 1992; 69: 86-91 the inherited susceptibility factors that affect specific histologies are unknown.

We conducted a GWAS in 5739 lung cancer cases and 5848 controls (National Cancer Institute [NCI] GWAS) to search for overall susceptibility variants and variants associated with specific histologic types and smoking status. We also conducted a meta-analysis of the NCI GWAS with summary data from ten additional studies, for a total of 13,300 primary lung cancer cases and 19,666 controls, all of European descent. Four of the ten studies provided information on histology for replication analyses; 3333 AD, 2589 SQ, and 1418 SC cases were analyzed overall.

The 11,587 subjects in the NCI GWAS were drawn from one population-based case-control study and three cohort studies (Table 1); specifically: the Environment and Genetics in Lung Cancer Etiology (EAGLE),12Landi M.T. Consonni D. Rotunno M. Bergen A.W. Goldstein A.M. Lubin J.H. Goldin L. Alavanja M. Morgan G. Subar A.F. et al. Environment And Genetics in Lung cancer Etiology (EAGLE) study: an integrative population-based case-control study of lung cancer. BMC Public Health.
2008; 8: e203 a population-based case-control study including 2100 primary lung cancer cases and 2120 healthy controls enrolled in Italy between 2002 and 2005; the Alpha-Tocopherol, Beta-Carotene Cancer Prevention Study (ATBC),13The ATBC Cancer Prevention Study Group. The alpha-tocopherol, beta-carotene lung cancer prevention study: design, methods, participant characteristics, and compliance. Ann. Epidemiol. 1994; 4: 1-10 a randomized primary prevention trial including 29,133 male smokers enrolled in Finland between 1985 and 1993; the Prostate, Lung, Colon, Ovary Screening Trial (PLCO),14Hayes R.B. Sigurdson A. Moore L. Peters U. Huang W.Y. Pinsky P. Reding D. Gelmann E.P. Rothman N. Pfeiffer R.M. et al. Methods for etiologic and early marker investigations in the PLCO trial. Mutat. Res. 2005; 592: 147-154 a randomized trial including 150,000 individuals enrolled in ten U.S. study centers between 1992 and 2001; and the Cancer Prevention Study II Nutrition Cohort (CPS-II),15Calle E.E. Rodriguez C. Jacobs E.J. Almon M.L. Chao A. McCullough M.L. Feigelson H.S. Thun M.J. The American Cancer Society Cancer Prevention Study II Nutrition Cohort: rationale, study design, and baseline characteristics. Cancer. 2002; 94: 2490-2501 including over 183,000 subjects enrolled by the American Cancer Society between 1992 and 2001 across all U.S. states. Analyses stratified by histology in the NCI GWAS included 1730 AD cases, 1400 SQ cases, 678 SC cases, and groups of other histological types or of mixed histologies. These studies were approved by the individual institutional review boards of each location, and each subject gave his or her informed consent for participation.

Table 1. Studies Included in the Genome-wide Association Analysis of Lung Cancer

Study | No. of Cases | No. of Controls | Location | Study Design | Illumina HumanHap Chips
NCI GWAS
EAGLE (a) | 1920 | 1979 | Italy | Population-based case-control | 550K, 610QUAD
ATBC (b) | 1732 | 1271 | Finland | Cohort | 550K, 610QUAD
PLCO (c) | 1390 | 1924 | 10 US Centers | Cohort–Cancer Prevention Trial | 317K+240S, 550K, 610QUAD
CPS-II (d) | 697 | 674 | All US States | Cohort | 550K, 610QUAD, 1M
TOTAL | 5739 | 5848 | | |
Meta-Analysis
UK | 1978 | 1438 | UK | Hospital-based cases, birth cohort controls | 550K
Central Europe | 1837 | 2432 | Romania, Hungary, Slovakia, Poland, Russia, Czech Rep. | Multicenter hospital-based case-control | 317K, 370Duo
Texas | 1154 | 1137 | Texas, USA | Hospital-based case-control | 317K
DeCODE Genetics | 719 | 6030 | Iceland | Population-based case-control | 317K, 370Duo
HGF Germany (e) | 506 | 480 | Germany | Population-based case-control (<50 years) | 550K
CARET (f) | 397 | 393 | 6 US Centers | Cancer Prevention Trial | 370Duo
HUNT2/Tromso (g) | 394 | 382 | Norway | Population-based case-control | 370Duo
Canada | 332 | 505 | Greater Toronto area | Hospital-based case-control | 317K
France | 135 | 146 | Paris and Caen areas | Hospital-based case-control | 370Duo
Estonia | 109 | 874 | Estonia | Hospital-based case-control | 317K, 370Duo
TOTAL | 7561 | 13818 | | |
Grand Total | 13300 | 19666 | | |

(a) Environment and Genetics in Lung Cancer Etiology study.
(b) Alpha-Tocopherol, Beta-Carotene Cancer Prevention study.
(c) Prostate, Lung, Colon, Ovary screening trial.
(d) Cancer Prevention Study II nutrition cohort.
(e) Helmholtz-Gemeinschaft Deutscher Forschungszentren Lung Cancer GWAS.
(f) Carotene and Retinol Efficacy Trial cohort.
(g) North Trondelag Health Study 2 / Tromsø IV.
The meta-analysis included all of the NCI GWAS data plus summary data from ten additional studies contributing 7561 cases and 13,818 controls (Table 1): (1) the UK study from the Institute for Cancer Research,5Wang Y. Broderick P. Webb E. Wu X. Vijayakrishnan J. Matakidou A. Qureshi M. Dong Q. Gu X. Chen W.V. et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat. Genet. 2008; 40: 1407-1409 including lung cancer cases from the Genetic Lung Cancer Predisposition Study established in 1999 and controls from the 1958 birth cohort;16Power C. Elliott J. Cohort profile: 1958 British birth cohort (National Child Development Study). Int. J. Epidemiol. 2006; 35: 34-41 (2) the International Agency for Research on Cancer (IARC) study in central Europe,1Hung R.J. McKay J.D. Gaborieau V. Boffetta P. Hashibe M. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. Rudnai P. et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008; 452: 633-637 a hospital-based case-control study conducted in the Czech Republic, Hungary, Poland, Romania, Russia, and Slovakia between 1998 and 2002; (3) the Texas case-control study,2Amos C.I. Wu X. Broderick P. Gorlov I.P. Gu J. Eisen T. Dong Q. Zhang Q. Gu X. Vijayakrishnan J. et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet. 2008; 40: 616-622 including cases newly diagnosed at the University of Texas M.D. Anderson Cancer Center since 1991 and controls from the Kelsey-Seybold clinics (the GWAS included only smokers and cases with non-small cell lung cancer); (4) the population-based case-control study from deCODE Genetics in Iceland,3Thorgeirsson T.E. Geller F. Sulem P. Rafnar T. Wiste A. Magnusson K.P.
Manolescu A. Thorleifsson G. Stefansson H. Ingason A. et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008; 452: 638-642 including all Icelandic subjects originally recruited for different genetic studies between 1996 and 2007 at deCODE Genetics and lung cancer cases recruited from the Icelandic Cancer Registry since 1998; (5) the Helmholtz-Gemeinschaft Deutscher Forschungszentren (HGF) lung cancer GWA study,17Sauter W. Rosenberger A. Beckmann L. Kropp S. Mittelstrass K. Timofeeva M. Wolke G. Steinwachs A. Scheiner D. Meese E. et al. Matrix metalloproteinase 1 (MMP1) is associated with early-onset lung cancer. Cancer Epidemiol. Biomarkers Prev. 2008; 17: 1127-1135 including lung cancer cases diagnosed at ≤ 50 years from the LUng Cancer in the Young (LUCY) study, a multicenter study within 31 German hospitals, and the Heidelberg lung cancer study, a hospital-based case-control study conducted by the German Cancer Research Center (DKFZ) (controls were selected from the Cooperative Health Research in the Region of Augsburg [KORA]); (6) the Carotene and Retinol Efficacy Trial (CARET) cohort,18Omenn G.S. Goodman G. Thornquist M. Grizzle J. Rosenstock L. Barnhart S. Balmes J. Cherniack M.G. Cullen M.R. Glass A. et al. The beta-carotene and retinol efficacy trial (CARET) for chemoprevention of lung cancer in high risk populations: smokers and asbestos-exposed workers. Cancer Res. 1994; 54: 2038s-2043s including smokers with a smoking history of at least 20 pack-years enrolled in six U.S. centers between 1983 and 1994; (7) the HUNT2/Tromso study, including lung cancer cases and controls from the North Trondelag Health Study (HUNT 2),19Holmen J.M.K. Kruger O. Langhammer A. Lingaas Holmen T. Bratberg G.H. The Nord-Trøndelag Health Study 1995-97 (HUNT 2): Objectives, contents, methods and participation. Norweg. J.
Epidemiol. 2003; 13: 19-32 a population-based study conducted between 1995 and 1997 in North Trondelag County, and the Tromsø IV population-based study conducted in Tromsø County between 1994 and 1995; (8) the lung cancer study from Canada,1Hung R.J. McKay J.D. Gaborieau V. Boffetta P. Hashibe M. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. Rudnai P. et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008; 452: 633-637 including lung cancer cases recruited at the University of Toronto and the Samuel Lunenfeld Research Institute between 1997 and 2002 and GWAS controls randomly selected from family medicine clinics; (9) the lung cancer study from France,20Feyler A. Voho A. Bouchardy C. Kuokkanen K. Dayer P. Hirvonen A. Benhamou S. Point: myeloperoxidase -463G→A polymorphism and lung cancer risk. Cancer Epidemiol. Biomarkers Prev. 2002; 11: 1550-1554 a hospital-based case-control study including smoking cases and controls recruited between 1988 and 1992 in ten French hospitals; and (10) the lung cancer study from Estonia, a hospital-based case-control study including lung cancer cases enrolled between 2002 and 2006 in Estonian hospitals and controls randomly selected from the Estonian Genome Project population-based cohort.21Nelis M. Esko T. Magi R. Zimprich F. Zimprich A. Toncheva D. Karachanak S. Pischakova T. Balascak I. Peltonen L. et al. Genetic structure of Europeans: a view from the North-East. PLoS ONE. 2009; 4: e5472

Three studies (the Texas,2Amos C.I. Wu X. Broderick P. Gorlov I.P. Gu J. Eisen T. Dong Q. Zhang Q. Gu X. Vijayakrishnan J. et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet. 2008; 40: 616-622 deCODE,3Thorgeirsson T.E. Geller F. Sulem P.
Rafnar T. Wiste A. Magnusson K.P. Manolescu A. Thorleifsson G. Stefansson H. Ingason A. et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008; 452: 638-642 and HGF German17Sauter W. Rosenberger A. Beckmann L. Kropp S. Mittelstrass K. Timofeeva M. Wolke G. Steinwachs A. Scheiner D. Meese E. et al. Matrix metalloproteinase 1 (MMP1) is associated with early-onset lung cancer. Cancer Epidemiol. Biomarkers Prev. 2008; 17: 1127-1135 studies) also contributed summary data from genome-wide scans stratified by histology, including 1138 AD, 578 SQ, and 210 SC cases. The UK study5Wang Y. Broderick P. Webb E. Wu X. Vijayakrishnan J. Matakidou A. Qureshi M. Dong Q. Gu X. Chen W.V. et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat. Genet. 2008; 40: 1407-1409 contributed data on the top single nucleotide polymorphisms (SNPs) of chromosome 5p15.33 by histology. These four studies contributed 1603 AD, 1189 SQ, and 740 SC cases to the meta-analysis by histology for this locus.

In both the NCI GWAS and the studies in the meta-analysis, the lung cancer diagnosis was based on clinical criteria and confirmed by pathology reports from surgery, biopsy, or cytology samples in approximately 95% of cases and on clinical history and imaging for the remaining 5%. Tumor histology was coded according to the International Classification of Diseases for Oncology. In analyses stratified by histology, only adenocarcinoma, squamous cell carcinoma, and small cell carcinoma cases were included. All mixed subtypes or other histologies were excluded. Overall, between 10% and 50% of all diagnoses from the NCI GWAS were centrally reviewed by expert lung pathologists from NCI.
The NCI GWAS scan was conducted at two institutions: the Center for Inherited Disease Research (CIDR), which genotyped all EAGLE and 1675 PLCO subjects, and the Core Genotyping Facility (CGF), NCI, which genotyped ATBC, CPS-II, and the remaining PLCO subjects. Controls from the Cancer Genetic Markers of Susceptibility (CGEMS) prostate cancer scan22Yeager M. Orr N. Hayes R.B. Jacobs K.B. Kraft P. Wacholder S. Minichiello M.J. Fearnhead P. Yu K. Chatterjee N. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat. Genet. 2007; 39: 645-649 were also included.

EAGLE samples and 1675 PLCO samples were genotyped at CIDR, as part of the Gene Environment Association Studies Initiative (GENEVA) funded through the National Human Genome Research Institute, with the use of Illumina HumanHap550v3_B BeadChips (Illumina, San Diego, CA, USA). Data were released for 5620 of 5727 (98%) samples, including 32 blind duplicates (concordance was 99.993%); these were genotyped with 124 HapMap controls (66 CEU; 58 YRI). Allele cluster definitions per SNP were determined with the use of the Illumina BeadStudio Genotyping Module version 3.1.14 and the combined intensity data from 95% of the samples. The resulting cluster definitions were used on all samples. Genotypes were not called if the quality threshold (Gencall score) was below 0.15. Genotypes were released by CIDR for 560,505 (99.83% of attempted) SNPs. Genotypes were not released for SNPs not called by BeadStudio or for those with call rates less than 85%, more than one HapMap replicate error, more than a 3% (autosomal) or 5% (X chromosome) difference in call rate between genders, or more than 0.5% male AB frequency for the X chromosome. The mean non-Y chromosome SNP call rate and mean sample call rate were each 99.8% for the CIDR data set.
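The SNP release criteria described above can be illustrated with a short Python sketch. This is not the CIDR pipeline: the SnpQc record, its field names, and the release_snp function are hypothetical, and only the thresholds (85% call rate, more than one HapMap replicate error, a 3% or 5% call-rate difference between genders, and 0.5% male AB frequency on the X chromosome) are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP QC summary. All field names are hypothetical."""
    snp_id: str
    called: bool                  # False if the calling software produced no genotypes
    call_rate: float              # fraction of samples with a called genotype
    hapmap_replicate_errors: int  # discordances among HapMap replicate samples
    call_rate_male: float
    call_rate_female: float
    chromosome: str               # "1".."22" or "X"
    male_ab_freq: float = 0.0     # heterozygous (AB) call frequency in males, X only


def release_snp(s: SnpQc) -> bool:
    """Return True if the SNP passes every release filter quoted in the text."""
    if not s.called:
        return False
    if s.call_rate < 0.85:
        return False
    if s.hapmap_replicate_errors > 1:
        return False
    # Allowed call-rate difference between genders: 3% autosomal, 5% on X.
    max_sex_diff = 0.05 if s.chromosome == "X" else 0.03
    if abs(s.call_rate_male - s.call_rate_female) > max_sex_diff:
        return False
    # Males carry one X, so heterozygous calls there indicate genotyping error.
    if s.chromosome == "X" and s.male_ab_freq > 0.005:
        return False
    return True
```

Treating each criterion as an independent early exit keeps the thresholds visible in one place; a production pipeline would more likely record which filter failed rather than return a single flag.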
Similar procedures were followed at CGF for the ATBC, CPS-II, and PLCO cohorts with the use of three Illumina platforms: the HumanHap550K, the HumanHap610, and the HumanHap 1 Million chips. All genotyped samples passed quality control metrics at CGF.

After removal of assays and loci with low completion rates, genotypes for each sample that appeared in duplicate were merged to form consensus genotypes for each subject. There were 12,111 study subjects available for subsequent analysis. Table S1, available online, shows the distribution of subjects by study and phenotype after application of quality control (QC) metrics. Figure S1 shows the cluster plot for the most notable SNP, rs2736100. A total of 221 pairs of samples were identified with a genotype concordance rate >70%. Among them, 189 pairs were expected duplicates and had genotype concordance rates >99.9%. There were 12 unexpected duplicates (across or within studies) with concordance rates >99.97%. We evaluated the pairwise concordance on the basis of the entire set and observed 40 pairs of subjects with genotype concordance >60%. Exclusions are listed in Table S2.

Deviations from Hardy-Weinberg proportions (HWP) were assessed in controls. Expected and observed p values were calculated with the use of the uniform distribution for all loci and the exact test, respectively. Autosomal SNPs with minor allele frequencies (MAFs) >5% and completion rates >95% were included. Deviation from HWP was minimal, and only loci with extremely low p values (p < 10⁻⁷) for each QC group were excluded from further analyses (Table S3). A quantile-quantile (Q-Q) plot of the p values per study is shown in Figure S2.

To assess population structure, we estimated imputed continental ancestry by using the STRUCTURE program,23Pritchard J.K. Stephens M. Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155: 945-959 with a set of 12,898 autosomal SNPs with low local background linkage disequilibrium (LD) (pairwise r² < 0.004 measured in the population of European ancestry for any pair of SNPs less than 500 kb apart)24Yu K. Wang Z. Li Q. Wacholder S. Hunter D.J. Hoover R.N. Chanock S. Thomas G. Population substructure and control selection in genome-wide association studies. PLoS ONE. 2008; 3: e2551 (Figure S3). Genotypes from the three HapMap populations (Build 22 for HapMap II with MAF > 5%)25The International HapMap Project. Nature. 2003; 426: 789-796 were used as reference populations. The number of inferred clusters (“K” parameter) was set to 3 for CEU, YRI, and JPT+CHB samples representing populations of European, African, and Asian origin, respectively. Eighteen subjects were detected as having less than 80% European ancestry and were excluded.

Principal component analysis (PCA) for each study group (excluding subjects with less than 80% European ancestry, unexpected duplicates, and potential relative pairs) was performed with the same informative 12,898 SNPs with the use of the EIGENSTRAT program26Price A.L. Patterson N.J. Plenge R.M. Weinblatt M.E. Shadick N.A. Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006; 38: 904-909 (Figures S4A–S4D). After adjustment for significant principal components (PCs) in each study, comparison of observed and expected distributions showed no evidence for large-scale inflation of the association test statistics (inflation factor λ = 1.03, 1.03, 1.01, and 1.01 in EAGLE, PLCO, CPS-II, and ATBC, respectively), excluding the possibility of significant hidden population substructure. Q-Q plots for each NCI study are shown in Figures S5A–S5D.
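The inflation factor λ reported above is conventionally computed as the ratio of the median observed 1-df chi-square statistic to its expected median under the null hypothesis (about 0.455). The following minimal Python sketch assumes that definition; the text does not state how λ was computed here, and the function name and its input (a list of per-SNP Z scores) are illustrative.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(z_scores):
    """Return lambda: median observed 1-df chi-square over the null median."""
    chi2 = [z * z for z in z_scores]
    return statistics.median(chi2) / CHI2_1DF_MEDIAN


# Usage: evenly spaced standard-normal quantiles stand in for null test
# statistics, so the estimate should be close to 1 (no inflation).
null_dist = statistics.NormalDist()
n = 10001
z_null = [null_dist.inv_cdf((i + 0.5) / n) for i in range(n)]
lambda_null = genomic_inflation(z_null)
```

Values near 1, such as those reported for the four NCI studies, indicate that the bulk of the test statistics follows the null distribution; values well above 1 would suggest residual stratification or other systematic bias.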
After excluding 183 subjects for the reasons described above (summarized in Table S2) and 337 subjects with incomplete phenotype data, we report analyses on 515,922 SNPs in 5739 lung cancer cases and 5848 controls (NCI GWAS, Table 1). Comparable QC procedures were conducted at each institution that provided summary results for the meta-analysis.1Hung R.J. McKay J.D. Gaborieau V. Boffetta P. Hashibe M. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. Rudnai P. et al. A susceptibility locus for lung cancer maps to nicotinic acetylcholine receptor subunit genes on 15q25. Nature. 2008; 452: 633-637, 2Amos C.I. Wu X. Broderick P. Gorlov I.P. Gu J. Eisen T. Dong Q. Zhang Q. Gu X. Vijayakrishnan J. et al. Genome-wide association scan of tag SNPs identifies a susceptibility locus for lung cancer at 15q25.1. Nat. Genet. 2008; 40: 616-622, 3Thorgeirsson T.E. Geller F. Sulem P. Rafnar T. Wiste A. Magnusson K.P. Manolescu A. Thorleifsson G. Stefansson H. Ingason A. et al. A variant associated with nicotine dependence, lung cancer and peripheral arterial disease. Nature. 2008; 452: 638-642, 4McKay J.D. Hung R.J. Gaborieau V. Boffetta P. Chabrier A. Byrnes G. Zaridze D. Mukeria A. Szeszenia-Dabrowska N. Lissowska J. et al. Lung cancer susceptibility locus at 5p15.33. Nat. Genet. 2008; 40: 1404-1406, 5Wang Y. Broderick P. Webb E. Wu X. Vijayakrishnan J. Matakidou A. Qureshi M. Dong Q. Gu X. Chen W.V. et al. Common 5p15.33 and 6p21.33 variants influence lung cancer risk. Nat. Genet.
2008; 40: 1407-1409

For the genome-wide analysis of the NCI GWAS, we used unconditional logistic regression to derive a per-allele odds ratio (OR) and an associated 1-degree-of-freedom (df) association test adjusted for age in five-year intervals (defined as age at diagnosis or interview for the case-control study and as baseline age for cohort studies), gender, study (EAGLE, PLCO, ATBC, CPS-II), and four PCs for population stratification within studies (see description of PC analysis below). In additional analyses, we adjusted for smoking status (current, former, never), cigarettes smoked per day (≤ 10, 11–20, 21–30, 31–40, 41+), smoking duration in 10-year intervals, and number of years since quitting (1–5, 6–10, 11–20, 21–30, 30+) for former smokers (subjects who quit smoking at least 6 months before participating in the study). The analyses with single and multiple SNPs stratified by histology, smoking status, and decade of birth were conducted with the use of the same models. Tests for interaction between a SNP (coded as a continuous variable) and smoking status or birth decade (coded with the use of dummy variables) were performed with multiple-df Wald tests.

For the meta-analysis with other studies, we obtained per-allele ORs and standard errors from each study. Because only summary data were available, we conducted the meta-analysis in two separate groups: “Set 1 SNPs” included a core of 279,698 SNPs that were available across all studies; and “Set 2 SNPs” included 197,647 SNPs that were available only for a subset of the studies that used the HumanHap500 or denser genomic platforms or provided summary data on imputed SNPs. We obtained meta-analysis estimates of per-allele ORs and associated p values by using the weighted Z-score method under a fixed effect model.27Higgins J.P. Thompson S.G. Quantifying heterogeneity in a meta-analysis. Stat. Med.
2002; 21: 1539-1558 Tests for heterogeneity by study were performed with the use of the QE statistics, assuming a random effect model. For testing of heterogeneity across histologic subtypes, we reported the smallest p values obtained from pairwise case-case analyses between the subtypes after adjustment for multiple testing with the use of the Bonferroni correction. All odds ratios were reported with respect to the minor allele in the pooled set of controls from all studies that contributed to the meta-analysis.

For adjustment of population stratification, we used the same set of 12,898 autosomal informative SNPs24Yu K. Wang Z. Li Q. Wacholder S. Hunter D.J. Hoover R.N. Chanock S. Thomas G. Population substructure and control selection in genome-wide association studies. PLoS ONE. 2008; 3: e2551 used for QC. We conducted PCA in each of the four study groups (EAGLE, PLCO, ATBC, and CPS-II) separately.27Higgins J.P. Thompson S.G. Quantifying heterogeneity in a meta-analysis. Stat. Med. 2002; 21: 1539-1558 For each study group, we identi