Abstract

The production of population-level trees from the genomic data of individuals is a fundamental task in population genetics. Typically, these trees are produced using methods such as hierarchical clustering, neighbor joining, or maximum likelihood. However, such methods are non-parametric: they require all data to be present at the time of tree formation, and the addition of new data points necessitates regenerating the entire tree, a potentially expensive process. They also do not integrate easily into larger workflows. In this study, we address these problems by introducing parametric deep learning methods for tree formation from genotype data. Our models create continuous representations of population trees in hyperbolic space, which has previously proven highly effective for embedding hierarchically structured data. We present two architectures, a multi-layer perceptron (MLP) and a variational autoencoder (VAE), and analyze their performance using a variety of metrics along with comparisons to established tree-building methods. Both models produce embedding spaces that reflect human evolutionary history. In addition, we demonstrate the generalizability of these models by verifying that new samples are added to an existing tree in a semantically meaningful manner. Finally, we use Dasgupta's cost to compare the quality of the trees generated by our models to those produced by established methods. Although the benchmark methods are fit directly on the evaluation data, our models outperform some of them and achieve highly comparable performance overall.

Author summary

Tree production is a vital task in population genetics, but current approaches suffer from several common shortcomings. Most notably, they lack the ability to add new data points after tree generation, and they are often difficult to use in larger pipelines. By leveraging cutting-edge advances pairing deep learning with hyperbolic geometry, we develop multiple models designed to rectify these issues. Through experiments on a dataset of humans from globally widespread ancestries, we demonstrate that our models generalize to new data, and we show strong empirical performance with respect to currently used methods. In addition, we show that the data representations produced by our models are semantically meaningful and reflect known facts about human evolutionary history. Finally, we discuss the additional benefits our models could provide, including improved visualization, greater privacy preservation, and improved integration with downstream machine learning tasks. In conclusion, we present models that are accurate, flexible, and generalizable, with the potential to facilitate a variety of further applications.
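As background for two quantities named above, the sketch below illustrates (i) the geodesic distance in the Poincaré ball, a standard model of hyperbolic space used for embedding hierarchical data, and (ii) Dasgupta's cost of a hierarchical tree. This is a minimal illustration, not the authors' implementation: the function names, the toy genotype matrix, and the use of NumPy/SciPy with an average-linkage benchmark tree are all assumptions made for demonstration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist, squareform


def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincare ball:
    d(u, v) = arcosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2))).
    """
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    num = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * num / max(denom, eps))


def dasgupta_cost(similarity, Z):
    """Dasgupta's cost of a binary merge tree (SciPy linkage matrix Z):
    the sum over leaf pairs (i, j) of similarity[i, j] times the number
    of leaves under the lowest common ancestor of i and j.
    """
    n = similarity.shape[0]
    clusters = {i: {i} for i in range(n)}  # cluster id -> set of leaf indices
    cost, next_id = 0.0, n
    for a, b, _, _ in Z:
        left, right = clusters.pop(int(a)), clusters.pop(int(b))
        merged = left | right
        # This merge is the lowest common ancestor of every pair that is
        # split across the two children being joined.
        for i in left:
            for j in right:
                cost += similarity[i, j] * len(merged)
        clusters[next_id] = merged
        next_id += 1
    return cost


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    genotypes = rng.integers(0, 3, size=(8, 50)).astype(float)  # toy 0/1/2 genotype matrix

    # Hyperbolic distance between two embedded samples (coordinates are made up).
    print(poincare_distance([0.1, 0.2], [0.7, -0.5]))

    # Dasgupta's cost of an average-linkage benchmark tree built on the toy
    # genotypes, using a simple similarity transform of pairwise distances.
    dist = squareform(pdist(genotypes))
    sim = np.exp(-dist / dist.mean())
    Z = linkage(pdist(genotypes), method="average")
    print(dasgupta_cost(sim, Z))
```

Because lower Dasgupta's cost corresponds to a tree that keeps similar samples under small subtrees, the same routine could in principle score any tree decoded from an embedding against a benchmark tree fit directly on the data, which is the spirit of the comparison described in the abstract.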