ResearchHub | Open Science Community

Highly accurate protein structure prediction with AlphaFold

John Jumper et al.Jul 15, 2021

Abstract Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous experimental effort 1–4 , the structures of around 100,000 unique proteins have been determined 5 , but this represents a small fraction of the billions of known protein sequences 6,7 . Structural coverage is bottlenecked by the months to years of painstaking effort required to determine a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence—the structure prediction component of the ‘protein folding problem’ 8 —has been an important open research problem for more than 50 years 9 . Despite recent progress 10–14 , existing methods fall far short of atomic accuracy, especially when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with atomic accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Critical Assessment of protein Structure Prediction (CASP14) 15 , demonstrating accuracy competitive with experimental structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates physical and biological knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.

Genetics

Artificial Intelligence

10

Paper

Save

nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation

Fabian Isensee et al.Dec 7, 2020

Artificial Intelligence

Biophysics

0

Paper

Artificial Intelligence

4,344

0

Save

2

Highly accurate protein structure prediction for the human proteome

Kathryn Tunyasuvunakool et al.Jul 22, 2021

Abstract Protein structures can provide invaluable information, both for reasoning about biological processes and for enabling interventions such as structure-based drug development or targeted mutagenesis. After decades of effort, 17% of the total residues in human protein sequences are covered by an experimentally determined structure 1 . Here we markedly expand the structural coverage of the proteome by applying the state-of-the-art machine learning method, AlphaFold 2 , at a scale that covers almost the entire human proteome (98.5% of human proteins). The resulting dataset covers 58% of residues with a confident prediction, of which a subset (36% of all residues) have very high confidence. We introduce several metrics developed by building on the AlphaFold model and use them to interpret the dataset, identifying strong multi-domain predictions as well as regions that are likely to be disordered. Finally, we provide some case studies to illustrate how high-quality predictions could be used to generate biological hypotheses. We are making our predictions freely available to the community and anticipate that routine large-scale and high-accuracy structure prediction will become an important tool that will allow new questions to be addressed from a structural perspective.

Artificial Intelligence

Biochemistry

2

Paper

Artificial Intelligence

2,378

0

Save

0

The Liver Tumor Segmentation Benchmark (LiTS)

Patrick Bilic et al.Nov 17, 2022

In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances with various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that not a single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dices scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via https://competitions.codalab.org/competitions/17094.

Artificial Intelligence

Radiology, Nuclear Medicine And Imaging

0

Paper

Artificial Intelligence

551

0

Save

0

Applying and improving AlphaFold at CASP14

John Jumper et al.Oct 4, 2021

We describe the operation and improvement of AlphaFold, the system that was entered by the team AlphaFold2 to the "human" category in the 14th Critical Assessment of Protein Structure Prediction (CASP14). The AlphaFold system entered in CASP14 is entirely different to the one entered in CASP13. It used a novel end-to-end deep neural network trained to produce protein structures from amino acid sequence, multiple sequence alignments, and homologous proteins. In the assessors' ranking by summed z scores (>2.0), AlphaFold scored 244.0 compared to 90.8 by the next best group. The predictions made by AlphaFold had a median domain GDT_TS of 92.4; this is the first time that this level of average accuracy has been achieved during CASP, especially on the more difficult Free Modeling targets, and represents a significant improvement in the state of the art in protein structure prediction. We reported how AlphaFold was run as a human team during CASP14 and improved such that it now achieves an equivalent level of performance without intervention, opening the door to highly accurate large-scale structure prediction.

Artificial Intelligence

Biochemistry

0

Paper

Artificial Intelligence

300

0

Save

0

Classification of Cancer at Prostate MRI: Deep Learning versus Clinical PI-RADS Assessment

Patrick Schelb et al.Oct 8, 2019

Background Men suspected of having clinically significant prostate cancer (sPC) increasingly undergo prostate MRI. The potential of deep learning to provide diagnostic support for human interpretation requires further evaluation. Purpose To compare the performance of clinical assessment to a deep learning system optimized for segmentation trained with T2-weighted and diffusion MRI in the task of detection and segmentation of lesions suspicious for sPC. Materials and Methods In this retrospective study, T2-weighted and diffusion prostate MRI sequences from consecutive men examined with a single 3.0-T MRI system between 2015 and 2016 were manually segmented. Ground truth was provided by combined targeted and extended systematic MRI-transrectal US fusion biopsy, with sPC defined as International Society of Urological Pathology Gleason grade group greater than or equal to 2. By using split-sample validation, U-Net was internally validated on the training set (80% of the data) through cross validation and subsequently externally validated on the test set (20% of the data). U-Net-derived sPC probability maps were calibrated by matching sextant-based cross-validation performance to clinical performance of Prostate Imaging Reporting and Data System (PI-RADS). Performance of PI-RADS and U-Net were compared by using sensitivities, specificities, predictive values, and Dice coefficient. Results A total of 312 men (median age, 64 years; interquartile range [IQR], 58-71 years) were evaluated. The training set consisted of 250 men (median age, 64 years; IQR, 58-71 years) and the test set of 62 men (median age, 64 years; IQR, 60-69 years). In the test set, PI-RADS cutoffs greater than or equal to 3 versus cutoffs greater than or equal to 4 on a per-patient basis had sensitivity of 96% (25 of 26) versus 88% (23 of 26) at specificity of 22% (eight of 36) versus 50% (18 of 36). U-Net at probability thresholds of greater than or equal to 0.22 versus greater than or equal to 0.33 had sensitivity of 96% (25 of 26) versus 92% (24 of 26) (both P > .99) with specificity of 31% (11 of 36) versus 47% (17 of 36) (both P > .99), not statistically different from PI-RADS. Dice coefficients were 0.89 for prostate and 0.35 for MRI lesion segmentation. In the test set, coincidence of PI-RADS greater than or equal to 4 with U-Net lesions improved the positive predictive value from 48% (28 of 58) to 67% (24 of 36) for U-Net probability thresholds greater than or equal to 0.33 (P = .01), while the negative predictive value remained unchanged (83% [25 of 30] vs 83% [43 of 52]; P > .99). Conclusion U-Net trained with T2-weighted and diffusion MRI achieves similar performance to clinical Prostate Imaging Reporting and Data System assessment. © RSNA, 2019 Online supplemental material is available for this article. See also the editorial by Padhani and Turkbey in this issue.

Artificial Intelligence

Internal Medicine

0

Paper

Artificial Intelligence

258

0

Save