Abstract
Creating a protein structure language has attracted increasing attention as a way to unify the modalities of protein sequence and structure. While recent works such as FoldToken1&2&3 have made great progress in this direction, the relationship between the languages created by different models at different scales remains unclear. Moreover, models at multiple scales (different code space sizes, e.g., 2^5, 2^6, ..., 2^12) need to be trained separately, leading to redundant effort. We raise the question: could a single model create multiscale fold languages? In this paper, we propose FoldToken4 to learn the consistency and hierarchy of multiscale fold languages. By introducing multiscale code adapters and token mixing techniques, FoldToken4 can generate multiscale languages from a single model and discover the hierarchical token-mapping relationships across scales. To the best of our knowledge, FoldToken4 is the first effort to learn multiscale token consistency and hierarchy in vector quantization (VQ) research, and it is even more novel in protein structure language learning.
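To make the idea concrete, below is a minimal sketch of one plausible reading of a shared multiscale vector quantizer: a single codebook of size 2^12 whose first 2^k entries serve as the code space at scale k, with a lightweight per-scale adapter applied before the nearest-neighbor lookup. The class and method names (MultiScaleVQ, quantize, map_to_scale) are illustrative assumptions, not the FoldToken4 API; the abstract does not specify the adapter or mixing mechanism.

```python
# Hypothetical sketch; not the FoldToken4 implementation.
import torch
import torch.nn as nn

class MultiScaleVQ(nn.Module):
    """One codebook serving code spaces of size 2^5 ... 2^12.

    Coarser scales reuse the first 2^k entries of the finest codebook,
    and a small per-scale adapter projects encoder features before the
    nearest-neighbor lookup (one reading of "multiscale code adapters").
    """

    def __init__(self, dim=128, min_bits=5, max_bits=12):
        super().__init__()
        self.codebook = nn.Embedding(2 ** max_bits, dim)  # shared across scales
        # One lightweight adapter per scale (2^5 ... 2^12).
        self.adapters = nn.ModuleDict({
            str(b): nn.Linear(dim, dim) for b in range(min_bits, max_bits + 1)
        })

    def quantize(self, z, bits):
        """Map encoder features z: (N, dim) to token ids at the given scale."""
        z = self.adapters[str(bits)](z)
        codes = self.codebook.weight[: 2 ** bits]   # prefix sub-codebook
        ids = torch.cdist(z, codes).argmin(dim=-1)  # nearest-neighbor lookup
        z_q = codes[ids]
        # Straight-through estimator so gradients reach the encoder.
        z_q = z + (z_q - z).detach()
        return ids, z_q

    def map_to_scale(self, ids, src_bits, dst_bits):
        """Hierarchical token mapping: re-quantize fine codes at a coarser scale."""
        fine = self.codebook.weight[: 2 ** src_bits][ids]
        coarse = self.codebook.weight[: 2 ** dst_bits]
        return torch.cdist(fine, coarse).argmin(dim=-1)

# Usage: tokenize the same structure embedding at two scales.
vq = MultiScaleVQ()
z = torch.randn(64, 128)                       # per-residue structure features
ids_fine, _ = vq.quantize(z, bits=12)          # 4096-token language
ids_coarse = vq.map_to_scale(ids_fine, 12, 6)  # mapped to a 64-token language
```

Because every scale indexes into the same shared codebook, the mapping from a fine token to its coarse counterpart is deterministic, which is one way the hierarchical token relationships across scales could be discovered rather than learned separately per model.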