Long-term Alzheimer's disease mortality prediction in adults aged ≥60 years: A prospective cohort study benchmarking survival machine learning algorithms.
Xiaoping Huang, Yue Xu, Ruitong Liao, Qingya Zhao, Xiaogang Lv, Qi Liu, Liuqing Li, Qianqian Ji, Dechao Tian, Yunzhang Wang, Yiqiang Zhan
Abstract
INTRODUCTION: Accurate risk stratification for long-term Alzheimer's disease (AD)-specific mortality remains limited.
METHODS: We analyzed data from 5,149 adults aged ≥60 years in NHANES III (1988-1994), with 116 baseline variables and mortality follow-up through 2019 via the National Death Index. Ten survival machine learning (ML) models were benchmarked. Predictive performance was assessed using Harrell's concordance index (C-index).
RESULTS: Over a median follow-up of 12.1 years for survivors and 17.8 years for decedents, Lasso (C-index = 0.76, 95% CI: 0.72-0.80) and Extreme Gradient Boosting (C-index = 0.76, 95% CI: 0.73-0.79) achieved the highest accuracy. Feature importance analyses revealed novel predictors of AD mortality. Models using fewer than 20 variables retained acceptable performance (C-index > 0.70).
CONCLUSION: Survival ML models effectively predict long-term AD-specific mortality using routine clinical data. Their interpretability, scalability, and capacity to identify novel risk factors support integration into geriatric risk assessment frameworks.
Highlights:
We benchmarked 10 survival machine learning (ML) algorithms using 116 clinical variables to predict long-term Alzheimer's disease (AD)-specific mortality.
Feature importance analysis identified novel non-imaging clinical predictors, including arm circumference, self-rated physical activity, and alcohol consumption.
This work highlights the underused potential of routine clinical data for AD mortality prediction and underscores the need for interpretable, population-based ML applications.
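Illustrative note on the evaluation metric: the abstract reports Harrell's C-index but does not describe how it was computed. The following is a minimal sketch of the standard pairwise definition, not the authors' implementation. The function name `harrell_c_index`, its argument names, and the toy data are hypothetical; tied event times are simply skipped here, which is a simplifying assumption that established survival libraries handle more carefully.

```python
from typing import Sequence


def harrell_c_index(
    time: Sequence[float],
    event: Sequence[bool],
    risk: Sequence[float],
) -> float:
    """Fraction of comparable pairs in which the subject who died earlier
    was assigned the higher predicted risk (ties in risk count as 0.5)."""
    if not (len(time) == len(event) == len(risk)):
        raise ValueError("time, event and risk must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(time)
    for i in range(n):
        # Only subjects with an observed event can anchor a comparable pair;
        # a censored subject's true event time is unknown.
        if not event[i]:
            continue
        for j in range(n):
            # Pair (i, j) is comparable when i's event precedes j's
            # follow-up time (j may be censored or have a later event).
            if time[i] < time[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5

    if comparable == 0:
        raise ValueError("no comparable pairs; C-index is undefined")
    return concordant / comparable


# Toy data (hypothetical): follow-up years, event indicator, predicted risk.
follow_up = [5.0, 8.0, 12.0, 15.0]
died = [True, False, True, False]
scores = [0.9, 0.4, 0.3, 0.5]
c_index = harrell_c_index(follow_up, died, scores)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported C-index of 0.76 means that, in roughly three out of four comparable pairs, the model ranked the participant who died earlier as higher risk.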