
Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2


Abstract

The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly 5-fold enriched in conditionally folded IDRs over IDRs in general, and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.

Significance Statement

AlphaFold2 and other machine learning-based methods can accurately predict the structures of most proteins. However, nearly two-thirds of human proteins contain segments that are highly flexible and do not autonomously fold, otherwise known as intrinsically disordered regions (IDRs). In general, IDRs interconvert rapidly between a large number of different conformations, posing a significant problem for protein structure prediction methods that define one or a small number of stable conformations. Here, we found that AlphaFold2 can readily identify structures for a subset of IDRs that fold under certain conditions (conditional folding). We leverage AlphaFold2’s predictions of conditionally folded IDRs to quantify the extent of conditional folding across the tree of life, and to rationalize disease-causing mutations in IDRs.

Classifications: Biological Sciences; Biophysics and Computational Biology
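
The abstract reports precision at a fixed 10% false positive rate. As a minimal sketch of how such a figure can be read off a set of scored regions, the Python below scans score thresholds from strictest to most permissive and keeps the last one whose false positive rate stays within the limit. This is an illustration only, not the authors' pipeline: the function name `precision_at_fpr`, the use of a single per-region confidence score, and the toy scores and labels are all assumptions introduced here. Note also that precision depends on the ratio of positives to negatives in the benchmark set.

```python
from typing import Optional, Sequence, Tuple


def precision_at_fpr(
    scores: Sequence[float],
    labels: Sequence[int],
    max_fpr: float = 0.10,
) -> Optional[Tuple[float, float, float, float]]:
    """Return (threshold, precision, tpr, fpr) at the most permissive
    threshold whose false positive rate does not exceed max_fpr.

    scores: hypothetical per-region confidence scores (higher = more confident).
    labels: 1 for regions known to conditionally fold, 0 otherwise.
    Returns None if even the strictest threshold exceeds max_fpr.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    best = None
    # Lowering the threshold can only add predictions, so the false positive
    # rate never decreases and we can stop at the first violation.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fpr = fp / n_neg
        if fpr > max_fpr:
            break
        best = (threshold, tp / (tp + fp), tp / n_pos, fpr)
    return best


# Toy data, invented for illustration only (not taken from the paper).
toy_scores = [95, 92, 88, 85, 80, 76, 70, 65, 60, 55, 50, 45, 40, 35, 30]
toy_labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

print(precision_at_fpr(toy_scores, toy_labels, max_fpr=0.10))
```

The scan recomputes counts at every threshold for clarity rather than speed; for large datasets a single sorted pass with running counts would be the more usual choice.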
