Abstract In natural products research, chemodiverse extracts coming from multiple organisms are explored for novel bioactive molecules, sometimes over extended periods. Samples are usually analyzed by liquid chromatography coupled with fragmentation mass spectrometry to acquire informative mass spectral ensembles. Such data is then exploited to establish relationships among analytes or samples (e.g. via molecular networking) and annotate metabolites. However, the comparison of samples profiled in different batches is challenging with current metabolomics methods. Indeed, the experimental variation - changes in chromatographical or mass spectrometric conditions - often hinders the direct comparison of the profiled samples. Here we introduce MEMO - M S2 Bas E d Sa M ple Vect O rization - a method allowing to cluster large amounts of chemodiverse samples based on their LC-MS/MS profiles in a retention time agnostic manner. This method is particularly suited for heterogeneous and chemodiverse sample sets. MEMO demonstrated similar clustering performance as state-of-the-art metrics taking into account fragmentation spectra. More importantly, such performance was achieved without the requirement of a prior feature alignment step and in a significantly shorter computational time. MEMO thus allows the comparison of vast ensembles of samples, even when analyzed over long periods of time, and on different chromatographic or mass spectrometry platforms. This new addition to the computational metabolomics toolbox should drastically expand the scope of large-scale comparative analysis. Abstract Figure
This paper's license is marked as closed access or non-commercial and cannot be viewed on ResearchHub. Visit the paper's external site.