Abstract
Nicotiana tabacum is a model organism in plant molecular and pathogenic research and has significant potential for producing biofuels and active pharmaceutical compounds in synthetic biology. Because tobacco has a large allotetraploid genome, its genomic features, its genetic diversity, and the genetic regulation of many of its complex traits remain unknown. In this study, we present a nearly complete chromosome-scale assembly of N. tabacum and provide evidence that homoeologous exchange between subgenomes and epigenetic remodelling are likely mechanisms of genome stabilization and subgenome coordination following polyploidization. Using GenBank-scale sequencing and phenotyping data from 5196 lines, we found that geography at the continental scale, rather than types assigned on the basis of curing crop practices, was the most important correlate of genetic structure. Using 178 marker–trait associations detected in a genome-wide association analysis, we built a reference genotype-to-phenotype map for 39 morphological, developmental, and disease-resistance traits. A novel gene, auxin response factor 9 (Arf9), whose knockout was associated with wider leaves, was fine-mapped to a single nucleotide polymorphism (SNP). This point mutation changes the encoded amino acid at position 203 from Ala to Pro (Ala203 to Pro203), likely preventing homodimer formation during DNA binding. Our analysis also revealed signatures of positive and polygenic selection for multiple traits during selective breeding. Overall, this study demonstrates the power of leveraging GenBank genomics to gain insights into the genomic features, genetic diversity, and regulation of complex traits in N. tabacum, laying a foundation for future research on plant functional genomics, crop breeding, and the production of biopharmaceuticals and biofuels.