Recent advances in AI-based protein structure modeling have yielded remarkable progress in predicting protein structures. Because structures are constrained by their biological function, their geometry tends to evolve more slowly than the underlying amino acid sequences. This property could in principle be used to reconstruct phylogenetic trees over longer evolutionary timescales than sequence-based approaches allow, but until now a reliable structure-based tree-building method has been elusive. Here, we demonstrate that structure-based phylogenies can outperform sequence-based ones not only for distantly related proteins but also, remarkably, for more closely related ones. This is achieved by inferring trees from protein structures using a local structural alphabet, an approach robust to conformational changes that confound traditional structural distance measures. As an illustration, we used structures to decipher the evolutionary diversification of a particularly challenging family: the fast-evolving RRNPPA quorum-sensing receptors, which enable Gram-positive bacteria, plasmids and bacteriophages to communicate and coordinate key behaviors such as sporulation, virulence, antibiotic resistance, conjugation and the phage lysis/lysogeny decision. The advent of high-accuracy structural phylogenetics enables a myriad of applications across biology, such as uncovering deeper evolutionary relationships, elucidating unknown protein functions, or refining the design of bioengineered molecules.