Optimizing Machine Learning Models for Enhanced Prediction of Cardiometabolic Diseases from Multiomics Data


Author(s): Eliana Ibrahimi, Nicholas Cauwenberghs, Tatiana Kouznetsova

Affiliation(s): Department of Biology, Faculty of Natural Sciences, University of Tirana



This study aims to optimize machine learning models for predicting cardiometabolic diseases from multiomics data, with the goal of accurate and personalized risk assessment. By identifying and prioritizing key features and biomarkers from multiomics and clinical data, the study seeks to build models that improve the understanding of individual patient profiles, incorporating factors such as the gut microbiome, metabolome, proteome, lifestyle, and clinical history. The methodology integrates two benchmark datasets. The first, from Fromentin et al. (2022), includes gut microbiome, metabolome, and lifestyle information for 372 individuals with ischemic heart disease (IHD), 275 healthy controls matched on age and gender, 222 untreated metabolically matched controls, and 372 controls matched to the individuals with IHD on type 2 diabetes status and body mass index. The second, from Cauwenberghs et al. (2023), consists of proteomic profiling and ultrasonography of 491 community-dwelling participants. Several state-of-the-art machine learning and deep learning algorithms that are compatible with multiomics data and can handle high-dimensional features, such as fuzzy random forests, support vector machines, gradient boosting methods, and neural networks, are employed to extract meaningful patterns from the datasets. Clinical data, including traditional risk factors, are integrated alongside the molecular insights to further enhance the models and to support their reliability, interpretability, and generalizability.

Acknowledgements: This study is based upon work from COST Action AtheroNET, CA21153, supported by COST (European Cooperation in Science and Technology), www.cost.eu.

References

Cauwenberghs, N., Verheyen, A., Sabovčik, F., Ntalianis, E., Vanassche, T., Brguljan, J., & Kuznetsova, T. (2023). Serum proteomic profiling of carotid arteriopathy: A population outcome study. Atherosclerosis, 385, 117331. https://doi.org/10.1016/j.atherosclerosis.2023.117331

Fromentin, S., Forslund, S. K., Chechi, K., Chakaroun, et al. (2022). Microbiome and metabolome features of the cardiometabolic disease spectrum. Nature Medicine, 28(2), 303-314. https://doi.org/10.1038/s41591-022-01688-4
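
The abstract does not specify how omics and clinical features are combined before modeling. As one possible illustration, the Python sketch below shows a simple early-integration step within a single cohort: each data block is standardized feature by feature, and the blocks are then concatenated for participants present in every block. The function names (zscore_block, integrate_blocks), the block and feature names, and all values are hypothetical toy examples, not taken from either dataset, and the sketch assumes no missing values.

```python
from statistics import mean, pstdev


def zscore_block(block):
    """Standardize each feature of one block to mean 0 and unit variance.

    `block` maps participant ID -> {feature name -> value}.
    Assumes every participant has a value for every feature.
    """
    features = sorted({f for row in block.values() for f in row})
    stats = {}
    for f in features:
        values = [row[f] for row in block.values()]
        sd = pstdev(values)
        # Constant features get a divisor of 1 to avoid division by zero.
        stats[f] = (mean(values), sd if sd > 0 else 1.0)
    return {
        pid: {f: (row[f] - stats[f][0]) / stats[f][1] for f in features}
        for pid, row in block.items()
    }


def integrate_blocks(blocks):
    """Concatenate standardized blocks per participant (early integration).

    `blocks` maps block name -> block. Only participants present in every
    block are kept; feature names are prefixed with the block name.
    """
    shared = set.intersection(*(set(b) for b in blocks.values()))
    scaled = {
        name: zscore_block({pid: b[pid] for pid in shared})
        for name, b in blocks.items()
    }
    return {
        pid: {
            f"{name}:{f}": value
            for name in sorted(scaled)
            for f, value in scaled[name][pid].items()
        }
        for pid in sorted(shared)
    }


# Toy values for illustration only (not real measurements).
microbiome = {
    "p1": {"taxon_a": 0.10, "taxon_b": 0.40},
    "p2": {"taxon_a": 0.30, "taxon_b": 0.20},
    "p3": {"taxon_a": 0.25, "taxon_b": 0.35},
}
clinical = {
    "p1": {"bmi": 24.0},
    "p2": {"bmi": 31.0},
    "p4": {"bmi": 27.0},
}

merged = integrate_blocks({"microbiome": microbiome, "clinical": clinical})
for pid, row in merged.items():
    print(pid, row)
```

The resulting per-participant feature vectors could then be passed to any of the model families named in the abstract. In practice, the scaling parameters would be estimated on training folds only, so that no information from held-out participants leaks into model fitting.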