
Using AI and Machine Learning (ML) to Enhance Untargeted Metabolomics and Exposomics: Progress, Obstacles, and The Path Ahead

In recent years, advances in artificial intelligence (AI) and machine learning (ML) have greatly enhanced untargeted metabolomics, a field that allows an unbiased, global analysis of metabolites in the body and can yield crucial insights into human health and disease. Using high-resolution mass spectrometry, untargeted metabolomics detects metabolites and chemicals that may contribute to or indicate specific health conditions, helping to link environmental exposures with disease outcomes. AI and ML applications have improved data quality, feature detection, and chemical identification, which in turn supports progress in disease screening and diagnosis.

The human body’s metabolic processes create the essential metabolites needed for energy and cell function. Metabolomics provides a detailed look at the body’s production of these molecules, which reflects the downstream effects of gene expression, protein function, and enzyme activity. While targeted metabolomics measures a predefined set of metabolites, untargeted metabolomics offers a broader analysis of thousands of small molecules within the body. When applied to exposomics, the study of the exposures a person encounters, this approach takes into account a host of factors, including environmental exposures, diet, lifestyle, and psychosocial factors, to show their collective impact on health.

When analyzing biological matrices such as serum, plasma, or urine, the untargeted metabolomics workflow typically involves separation, detection, measurement, pre- and post-processing, data analysis, and chemical identification. However, the complex data produced in untargeted metabolomics are not easily deciphered. To manage this challenge, software tools such as XCMS, MZmine, and MS-DIAL have been used to aid in pre-processing, and several machine learning tools, including WiPP, MetaClean, Peakonly, NeatMS, NPFimg, and EVA, have been developed to improve data processing accuracy.

Traditional models often struggle with the correlated structure of metabolomics data. In contrast, AI and ML methods build models directly on the data, thereby uncovering relationships between phenotypes, exposures, and diseases. These technologies have played a significant role in biomarker discovery, where the identification of metabolites is vital. Metabolite identification requires matching m/z values and MS/MS fragmentation data against databases and spectral libraries to confirm metabolites. However, matching rates for specialized chemicals still need to improve. Advances in ML-based tools such as CSI:FingerID and CFM-ID, together with approaches drawn from natural language processing (NLP), are therefore proving beneficial in improving identification accuracy.
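To make the library-matching step described above more concrete, here is a minimal sketch in Python. It is not how CSI:FingerID, CFM-ID, or any specific tool works. It only illustrates the basic idea of filtering candidates by precursor m/z within a ppm tolerance and then scoring MS/MS spectra with a simple binned cosine similarity. The library entries (`compound_A`, `compound_B`), their peak values, the 5 ppm tolerance, and the 0.01 bin width are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical toy library: precursor m/z plus (fragment m/z, intensity) peaks.
# Names and numbers are illustrative only, not taken from any real database.
LIBRARY = {
    "compound_A": {"mz": 195.0877, "peaks": [(138.066, 100.0), (110.071, 40.0)]},
    "compound_B": {"mz": 195.1016, "peaks": [(177.091, 100.0), (149.096, 30.0)]},
}


def ppm_error(observed_mz, reference_mz):
    """Signed mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_precursor(observed_mz, library, tol_ppm=5.0):
    """Return library names whose precursor m/z lies within tol_ppm of the observation."""
    return [
        name
        for name, entry in library.items()
        if abs(ppm_error(observed_mz, entry["mz"])) <= tol_ppm
    ]


def _bin_spectrum(peaks, bin_width):
    """Sum peak intensities into fixed-width m/z bins keyed by integer bin index."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity between two MS/MS spectra after simple m/z binning."""
    a = _bin_spectrum(peaks_a, bin_width)
    b = _bin_spectrum(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_candidates(observed_mz, query_peaks, library, tol_ppm=5.0):
    """Filter by precursor mass, then rank the survivors by spectral similarity."""
    scored = [
        (name, cosine_similarity(query_peaks, library[name]["peaks"]))
        for name in match_precursor(observed_mz, library, tol_ppm)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    query = [(138.066, 80.0), (110.071, 35.0), (83.060, 5.0)]
    for name, score in rank_candidates(195.0880, query, LIBRARY):
        print(f"{name}: cosine = {score:.3f}")
```

A compound that is missing from the library can never be returned by this kind of lookup, which is one reason matching rates for specialized chemicals remain low and why ML-based prediction tools are attractive.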

Considering the vast potential of AI and ML in untargeted metabolomics and exposomics, it’s important to keep improving upon existing tools and algorithms, enhance metabolite identification, and further explore the wealth of information that can be obtained from the human exposome. Through these continual improvements, AI and ML are set to become even more powerful tools in disease detection, diagnosis, and our overall understanding of human health.
