Abstract

Accurate annotation of coding regions in RNAs is essential for understanding gene translation. We developed a deep neural network to directly predict and analyze translation initiation and termination sites from RNA sequences. Trained on human transcripts, our model learned hidden rules of translation control and achieved near-perfect prediction of translation sites across the entire transcriptome. Our model revealed a surprising role of codon usage in regulating translation termination, which was experimentally validated. We also identified thousands of new open reading frames in mRNAs or annotated lncRNAs, some of which were confirmed experimentally. Remarkably, the model trained on human mRNAs achieved high prediction accuracy in all eukaryotes and good prediction in polycistronic transcripts from prokaryotes or RNA viruses, suggesting a high degree of conservation in translation control. Collectively, this study presents a general and efficient deep learning model for RNA translation, providing new insights into the complexity of translation regulation.