Five-year dementia prediction and decision support system based on real-world data.
Themis P Exarchos, George A Dimakopoulos, Konstantinos Lazaros, Marios Krokidis, Aristidis Vrahatis, Gerasimos Grammenos, Antigoni Avramouli, Konstantina Skolariki, Roy Adams, Vasiliki Mahairaki, Esther S Oh, Jeannie Leoutsakos, Paul B Rosenberg, Constantine G Lyketsos, Panagiotis Vlamos
Abstract
Introduction: This work presents a machine learning (ML) based risk prediction model for Alzheimer's disease and related dementias that uses real-world electronic health record (EHR) clinical data. Although dementia risk prediction has been studied extensively, most studies rely on volunteer-based research cohorts rather than real-world clinical data. Raw EHR data offer more realistic insights, but converting them into a decision support system for daily clinical use requires extensive effort. Methods: The dataset consists of a high-volume, ten-year export of raw EHR data from Epic, the EHR system of the Johns Hopkins (JH) Health System. We used multimodal JH EHR data to develop a patient-based model that predicts dementia onset over a five-year period. The interpretable binary classification model identified prognostic rulesets for dementia based on clinical characteristics. Results: The model achieved a mean test accuracy of 0.722 (95% CI: 0.722-0.723) and an AUROC of 0.795 (95% CI: 0.794-0.795) using 5-fold cross-validation across different sample subsets. Discussion: Recognizing that neurodegenerative diseases are often driven by multiple contributing factors rather than a single cause, we identify risk pathways by leveraging multimodal data and modeling their combined effects, leading to accurate dementia predictions and improved clinical interoperability.
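The reported metrics (mean accuracy and AUROC with 95% confidence intervals over 5-fold cross-validation) can be illustrated with a minimal, standard-library-only sketch. The abstract does not state how the intervals were computed, so the normal-approximation interval below is an assumption. The function names (`auroc`, `accuracy`, `k_fold_indices`, `mean_ci`), the 0.5 decision threshold, and the synthetic scores are hypothetical and do not come from the study.

```python
import math
import random
import statistics


def auroc(labels, scores):
    """AUROC as the probability that a random positive case outranks a
    random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at an assumed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def mean_ci(values, z=1.96):
    """Mean with a normal-approximation 95% interval (an assumption;
    the source does not specify its interval method)."""
    m = statistics.mean(values)
    half = z * statistics.stdev(values) / math.sqrt(len(values))
    return m, m - half, m + half


if __name__ == "__main__":
    # Synthetic stand-in for out-of-fold risk scores; not study data.
    rng = random.Random(42)
    labels = [rng.randint(0, 1) for _ in range(500)]
    scores = [min(1.0, max(0.0, 0.35 + 0.3 * y + rng.gauss(0, 0.2))) for y in labels]

    fold_aucs, fold_accs = [], []
    for fold in k_fold_indices(len(labels), k=5):
        y = [labels[i] for i in fold]
        s = [scores[i] for i in fold]
        fold_aucs.append(auroc(y, s))
        fold_accs.append(accuracy(y, s))

    print("AUROC mean and 95% CI:", mean_ci(fold_aucs))
    print("Accuracy mean and 95% CI:", mean_ci(fold_accs))
```

In practice the scores would be the out-of-fold predictions of the fitted classifier. Repeating the procedure over many sample subsets, as the Results describe, adds more values to average and narrows the interval.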