Computational methods from bioinformatics and cheminformatics are now widely used in molecular property prediction, advancing activities such as drug discovery. Combined with expert manual annotation of molecular features, machine learning approaches have achieved satisfactory prediction accuracy in most molecular property prediction tasks. Recently, graph neural networks (GNNs), in which the structure of a chemical molecule is represented as a graph, have gained popularity in cheminformatics and have made substantial progress in molecular property prediction. However, GNN models require large numbers of training samples, and the diverse structural information in molecules may be under-utilized when a model is trained with a traditional random sampling strategy, leading to redundancy and inefficiency. As in human learning, the training of molecular graph learning models can benefit from an easy-to-difficult curriculum. In this study, we propose a curriculum learning approach for graph-based molecular property prediction, called CurrMG. A data-aware integrated difficulty measurer is proposed to distinguish easy molecules from complex ones. Without any model redesign or external data, our training strategy improves model efficiency and accuracy in numerous molecular property prediction tasks and shows potential for low-data drug discovery.
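The easy-to-difficult idea can be illustrated with a minimal sketch. This is not the CurrMG implementation: the abstract does not specify how the integrated difficulty measurer is computed, so the `difficulty` function below is a hypothetical stand-in that scores a molecule by the length of its SMILES string, and the linear `pacing` schedule, the `curriculum_batches` helper, and all parameter names are likewise assumptions made for illustration.

```python
import math
import random


def difficulty(smiles: str) -> float:
    """Hypothetical difficulty proxy: longer SMILES strings count as harder.

    This is an assumption for illustration, not the paper's measurer.
    """
    return float(len(smiles))


def pacing(step: int, total_steps: int, start_frac: float = 0.25) -> float:
    """Assumed linear pacing: fraction of the sorted data available at `step`."""
    if total_steps <= 1:
        return 1.0
    frac = start_frac + (1.0 - start_frac) * step / (total_steps - 1)
    return min(1.0, frac)


def curriculum_batches(smiles_list, total_steps, batch_size, seed=0):
    """Yield one batch per step, sampled from a growing easy-to-hard pool."""
    rng = random.Random(seed)
    # Sort once from easiest to hardest according to the proxy score.
    ordered = sorted(smiles_list, key=difficulty)
    for step in range(total_steps):
        # Expose only the easiest fraction of the data at this step.
        n_avail = math.ceil(pacing(step, total_steps) * len(ordered))
        n_avail = min(len(ordered), max(batch_size, n_avail))
        pool = ordered[:n_avail]
        yield rng.sample(pool, min(batch_size, len(pool)))


if __name__ == "__main__":
    molecules = [
        "C", "CC", "CCO", "CC(C)O", "c1ccccc1",
        "CC(=O)Nc1ccccc1", "CC(=O)Oc1ccccc1C(=O)O",
        "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",
    ]
    for step, batch in enumerate(curriculum_batches(molecules, 4, 2)):
        print(step, batch)
```

Early steps draw only from the lowest-scoring molecules, and the pool widens until the full training set is available at the final step. A random-sampling baseline corresponds to setting `start_frac` to 1.0, which makes every molecule available from the first step.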