Machine-learning model for 30-day mortality in sepsis-associated delirium patients: A retrospective MIMIC-IV cohort study.
Jingjing Yin, Xuming Pan, Danlei Chen, Jiancheng Zhang, Guangjun Jin
Abstract
Patients with sepsis in the intensive care unit (ICU) are particularly vulnerable to sepsis-associated delirium (SAD), which is associated with increased mortality. This retrospective cohort study used machine-learning algorithms to develop a risk-prediction model for 30-day mortality in ICU patients with SAD. Data on ICU patients with SAD were extracted from the MIMIC-IV database. Patients were classified as survivors or non-survivors according to 30-day mortality following ICU admission, and the cohort was then divided into training and validation sets. The Boruta algorithm was used to identify significant features. Predictive models were developed using logistic regression, support vector machines, decision trees, random forests, extreme gradient boosting (XGBoost), k-nearest neighbors, and naive Bayes, and their performance was assessed on the validation set. The Shapley additive explanation (SHAP) method was applied to the final model to enhance the interpretability of its predictions. In total, 5390 patients with SAD were identified in the MIMIC-IV database. The XGBoost model exhibited the highest predictive accuracy and was chosen as the final model, achieving an area under the receiver operating characteristic curve of 0.743 on the validation set. The SHAP method was used to identify the top 15 features in the XGBoost model, with blood urea nitrogen, age, prothrombin time, partial thromboplastin time, and history of stroke ranking as the top predictors of mortality. The XGBoost model demonstrated superior performance in forecasting 30-day mortality among ICU patients with SAD. In contrast to conventional predictive models, this machine-learning approach enables the prediction of 30-day mortality within 24 h of the patient's admission.
However, the model's low specificity may limit its clinical utility, and external validation is needed.
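The two evaluation quantities mentioned above, the area under the receiver operating characteristic curve and specificity, can be illustrated with a minimal sketch. The function names `auroc` and `sensitivity_specificity`, the 0.5 threshold, and the toy labels and scores are all hypothetical and are not taken from the study; the sketch only shows how a model can have a moderate AUROC while still showing low specificity at a given threshold.

```python
from typing import Sequence, Tuple


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = died within 30 days, 0 = survived.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.4, 0.8, 0.6, 0.5, 0.3, 0.2]

sens, spec = sensitivity_specificity(y_true, y_score, threshold=0.5)
print(f"AUROC={auroc(y_true, y_score):.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

In this toy example, three of the five survivors score at or above the threshold, so specificity is low even though the ranking of cases is reasonable. Raising the threshold would trade sensitivity for specificity.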