Assessing the accuracy of survival machine learning and traditional statistical models for Alzheimer's disease prediction over time: a study on the ADNI cohort.
Sardar Jahani, Ghodratollah Roshanaei, Leili Tapak, Alzheimer’s Disease Neuroimaging Initiative
Abstract
BACKGROUND: Mild cognitive impairment (MCI) represents a transitional stage to Alzheimer's disease (AD), making progression prediction crucial for timely intervention. Predictive models integrating clinical, laboratory, and survival data can enhance early diagnosis and treatment decisions. While machine learning approaches effectively handle censored data, their application in MCI-to-AD progression prediction remains limited, with unclear superiority over classical survival models. METHODS: We analyzed 902 MCI individuals from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset with 61 baseline features. Traditional survival models (Cox proportional hazards, Weibull, elastic net Cox) were compared with machine learning techniques (gradient boosting survival, random survival forests [RSF]) for progression prediction. Models were evaluated using the concordance index (C-index) and the integrated Brier score (IBS). RESULTS: Following feature selection, 14 key features were retained for model training. RSF achieved the best predictive performance, with the highest C-index (0.878, 95% CI: 0.877-0.879) and the lowest IBS (0.115, 95% CI: 0.114-0.116), a statistically significant improvement over all other evaluated models (p < 0.001). RSF stratified risk effectively within individual biomarker categories (genetic, imaging, cognitive) and separated patients into three distinct prognostic groups when all features were combined (p < 0.0001). SHAP-based feature importance analysis of RSF revealed cognitive assessments as the most influential predictors, with the Functional Activities Questionnaire (FAQ) achieving the highest importance score (1.098), followed by Logical Memory Delayed Recall Total (LDELTOTAL) (0.906) and the Alzheimer's Disease Assessment Scale (ADAS13) (0.770). Among neuroimaging biomarkers, Fluorodeoxyglucose (FDG) emerged as the leading predictor (0.634), ranking fifth overall.
Feature importance rankings differed between classical and machine learning approaches, with FDG maintaining consistent importance across all models. RSF demonstrated excellent predictive calibration with positive net benefit across risk thresholds from 0.2 to 0.8. CONCLUSIONS: The RSF model outperformed the other methods, demonstrating superior potential for improving prognostic accuracy in predicting MCI-to-AD progression.
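
To illustrate how the C-index reported above treats censored data, the following is a minimal sketch in plain Python. It assumes Harrell's formulation, which the abstract does not specify, and it is not the authors' implementation: the function name `concordance_index` and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored data (illustrative sketch).

    times       : follow-up time per subject
    events      : 1 if progression was observed, 0 if censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        # A pair is comparable only if the subject with the shorter
        # follow-up time actually experienced the event.
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier event had higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Hypothetical toy data: four subjects, the third one censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.3, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))  # 1.0 (perfect ranking)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives context for the reported C-index of 0.878. The censored subject still contributes to the score, but only as the longer-surviving member of a pair.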