Genome-wide association studies (GWAS) have identified more than 200 mostly new common low-penetrance susceptibility loci for cancers. The predicted risk associated with each locus is generally modest (with a per-allele odds ratio typically less than 2) and so, presumably, are the functional effects of the individual genetic variants conferring disease susceptibility. Perhaps the greatest challenge in the ‘post-GWAS’ era is to understand the functional consequences of these loci. Biological insights can then be translated into clinical benefits, including reliable biomarkers and effective strategies for screening and disease prevention. The purpose of this article is to propose principles for the initial functional characterization of cancer risk loci, with a focus on non-coding variants, and to define ‘post-GWAS’ functional characterization.
By December 2010, there were 1,212 published GWAS¹ reporting significant (P < 5 × 10⁻⁸) associations for 210 traits (Table 1), and the Catalog of Published GWAS states that by March 2011, 812 publications reported 3,977 SNP associations¹. This is likely a small fraction of the common low-penetrance susceptibility loci that will eventually be identified. Despite these successes in identifying risk loci, the causal variant and/or the molecular basis of risk etiology has been determined for only a small fraction of these associations²⁻⁴. Plausible candidate genes can be proposed on the basis of proximity to risk loci, but few have so far been defined in a more systematic manner (Supplementary Table 1).
Table 1. The genomic context in which a variant is found can be used as preliminary functional analysis.
Increased investment in post-GWAS functional characterization of risk loci⁵ has now been advocated across diseases, and specifically for cardiovascular disease and diabetes⁶. For cancer biology, the complex interplay between genetics and the environment in many cancers poses a particularly exciting challenge for post-GWAS research.
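To make the notions of a modest per-allele odds ratio and the genome-wide significance threshold concrete, the following is a minimal sketch in Python. It assumes a simple allelic (multiplicative) model, a Woolf-type confidence interval and a Wald test; the allele counts are hypothetical and chosen only for illustration, and the function name is our own. Published GWAS typically use regression models adjusted for covariates rather than this raw 2 × 2 calculation.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # significance threshold quoted in the text


def per_allele_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Allelic odds ratio, Woolf 95% confidence interval and Wald P value.

    Arguments are allele counts (two alleles per individual), not
    genotype counts. This is an illustrative helper, not a standard API.
    """
    odds_ratio = (case_risk * control_other) / (case_other * control_risk)
    log_or = math.log(odds_ratio)
    # Standard error of log(OR) from the four cell counts (Woolf method)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / control_risk + 1 / control_other
    )
    z_95 = 1.959964  # two-sided 95% normal quantile
    ci = (math.exp(log_or - z_95 * se_log_or),
          math.exp(log_or + z_95 * se_log_or))
    # Two-sided P value for the Wald z statistic under a standard normal
    z_stat = log_or / se_log_or
    p_value = math.erfc(abs(z_stat) / math.sqrt(2))
    return odds_ratio, ci, p_value


# Hypothetical counts: 5,000 cases and 5,000 controls (10,000 alleles each)
odds_ratio, ci, p_value = per_allele_odds_ratio(
    case_risk=2400, case_other=7600, control_risk=2000, control_other=8000
)
is_genome_wide_significant = p_value < GENOME_WIDE_THRESHOLD
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), "
      f"P = {p_value:.1e}, significant = {is_genome_wide_significant}")
```

With these hypothetical counts the odds ratio is about 1.26, well below 2, yet the association passes the genome-wide threshold because of the large sample size. This illustrates why a statistically robust locus can still have a small functional effect.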
Here we suggest a systematic strategy for understanding how cancer-associated variants exert their effects. We mostly refer to SNPs throughout the paper, but we recognize that other types of common genetic (for example, copy number variants) or epigenetic variation may influence risk. Our understanding of the way in which a risk variant initiates disease pathogenesis progresses from statistical association between genetic variation and trait or disease variation to functionality and causality. The functional consequences of variants in protein-coding regions causing most monogenic disorders are more readily interpreted because we know the genetic code. For non-Mendelian or multifactorial traits, most of the common DNA variants have so far mapped to non-protein–coding regions², where our understanding of functional consequences and causality is more rudimentary. Our hypothesis is that the trait-associated alleles exert their effects by influencing transcriptional output (such as transcript levels and splicing) through multiple mechanisms. We emphasize appropriate assays and models to test the functional effects of both SNPs and genes mapping to cancer predisposition loci. Although much of what is written is applicable to alleles discovered for any trait, the section on modeling gene effects will emphasize measuring cancer-related phenotypes. At some loci, multiple, independently associated risk alleles rather than single risk alleles may be functionally responsible for the occurrence of disease. 
Genotyping susceptibility loci (and their correlated variants) in multiple populations with different linkage disequilibrium (LD) structures may substantially reduce the number of potentially causative variants, because the same causal variant may segregate in multiple populations, as shown for the FGFR2 locus in breast cancer⁷. For most loci, however, there will remain a set of potentially causative variants that cannot be separated statistically using case-control genotype data. A susceptibility locus should be resequenced to ascertain all genetic variation and to identify candidate functional or causal variants and candidate causal genes. Ideally, identification of a causal SNP would be the next step toward revealing the molecular mechanisms of risk modification. In practice, however, it is unclear what the criteria for causality should be, particularly in non-protein–coding regions. Thus, although we propose a framework set of analyses (Box 1), we acknowledge that the techniques and methods will continue to evolve with the field.
Box 1. Strategies to progress from tag SNP to mechanism
• Target resequencing efforts using linkage disequilibrium (LD) structure.
• Use other populations to refine LD regions (for example, populations of African ancestry, which have shorter LD and more heterogeneity).
• Determine expression levels of nearby genes as a function of genotype at each locus (expression quantitative trait locus (eQTL) analysis).
• Characterize gene regulatory regions by multiple empirical techniques, bearing in mind that these regions are tissue and context specific.
• Combine regulatory regions with risk loci using coordinates from multiple reference genomes to capture all variation within the shorter regulatory regions that correlates with the tag SNP at each locus.
• Multiple experimental manipulations in model systems are needed to progressively implicate transcription units (genes) in mechanisms relevant to the associated loci:
  ◦ Knockouts of regulatory regions in animal models (difficult and possibly limited by functional redundancy, although new targeting methods in rat are promising), followed by genome-wide expression analysis.
  ◦ Use chromatin association methods (3C, ChIA-PET) on regulatory regions to determine the identity of target genes (compare with eQTL data).
  ◦ Targeted gene perturbations in somatic cell models.
• Explore fully genome-wide eQTL and miRNA quantitative variation correlation in relevant tissues and cells.
• Explore epigenetic mechanisms in the context of genome-wide genetic polymorphism.
• Employ cell models and tissue reconstructions to evaluate mechanisms using gene perturbations and polymorphic variants. The human cancer cell xenograft has re-emerged as a minimal in vivo validation of these models.
• Above all, resist the temptation to treat any partial functional evidence as sufficient. Published claims of functional relevance should be fully evaluated using the steps detailed above.
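The eQTL step in Box 1, determining expression of a nearby gene as a function of genotype, can be sketched as a simple linear regression of expression on risk-allele dosage. The example below is a minimal illustration in Python under stated assumptions: an additive model with dosage coded 0, 1 or 2, hypothetical normalized expression values and a helper function named by us. A real analysis would also adjust for covariates such as ancestry, batch and tissue composition, and would correct for multiple testing.

```python
from statistics import mean


def eqtl_regression(dosages, expression):
    """Ordinary least-squares fit of expression on risk-allele dosage.

    dosages: risk-allele counts per individual (0, 1 or 2), additive model.
    expression: normalized expression of one candidate gene per individual.
    Returns (slope, intercept, r_squared). Illustrative helper only.
    """
    if len(dosages) != len(expression) or len(dosages) < 3:
        raise ValueError("need matched samples from at least 3 individuals")
    g_bar, e_bar = mean(dosages), mean(expression)
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("variant is monomorphic in this sample")
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    syy = sum((e - e_bar) ** 2 for e in expression)
    slope = sxy / sxx                  # expression change per risk allele
    intercept = e_bar - slope * g_bar  # expected expression at dosage 0
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical data: nine individuals, three per genotype class
dosages = [0, 0, 0, 1, 1, 1, 2, 2, 2]
expression = [5.0, 5.2, 4.8, 5.6, 5.4, 5.5, 6.1, 5.9, 6.0]

slope, intercept, r_squared = eqtl_regression(dosages, expression)
print(f"slope per allele = {slope:.2f}, intercept = {intercept:.2f}, "
      f"r^2 = {r_squared:.2f}")
```

A nonzero slope in a relevant tissue would nominate the gene as a candidate target of the risk locus. In keeping with the final point of Box 1, such a correlation is partial evidence only and should be compared with chromatin association and perturbation data.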