Computational and Structural Biotechnology Journal

Explainable machine learning for preoperative relapse prediction in molecularly stratified endometrial cancer: A single-center Finnish cohort study.

Sergio Vela Moreno, Masuma Khatun, Annukka Pasanen, Ralf Bützow, Andres Salumets, Mikko Loukovaara, Vijayachitra Modhukur

Published: 2026
DOI: 10.1016/j.csbj.2025.12.018

Abstract

Open Access

Relapse risk in endometrial carcinoma (EC) is driven by molecular subtype, yet current WHO/ESGO classifications rely on postoperative data, limiting their preoperative use. We developed interpretable machine learning (ML) models to predict relapse timing (none, ≤6 months, >6 months) using exclusively preoperative multimodal data. In a single-center retrospective cohort of 784 EC patients, clinicopathological, molecular, immunohistochemical, and systemic biomarkers were integrated using four feature strategies: Traditional (clinicopathology), ESGO-based (guideline risk groups), TP53 + MMRd (high-risk biology), and POLE (low-risk). Random Forest (RF), Support Vector Machine, k-Nearest Neighbors, and Gradient Boosting (GBM) models were trained with leakage-safe preprocessing and evaluated by area under the curve (AUC), accuracy, recall, and F1 score, with interpretability assessed by SHapley Additive exPlanations (SHAP). The RF-Traditional model achieved the best overall performance (F1 = 0.895, AUC = 0.840), while the GBM-POLE model achieved the highest sensitivity (F1 = 0.886, AUC = 0.842). However, prediction of Late Relapse remained challenging (F1 = 0.31) due to class rarity and heterogeneity. Key predictors of relapse included ARID1A loss, elevated CA125, thrombocytosis, and p16 expression, while high-risk features shared across models were advanced stage, deep myometrial invasion, elevated CA125, and positive cytology. While multi-center validation is essential, our findings support biologically coherent predictions for individualized preoperative risk stratification, particularly for high-risk molecular subtypes.
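
The gap between a high overall F1 and a low F1 for the rare Late Relapse class can be illustrated with a small worked example. The sketch below is not the authors' evaluation code: the confusion matrix is invented for illustration, and the names `confusion` and `per_class_f1` are hypothetical. The abstract does not state how the overall F1 was averaged, so both a macro and a support-weighted average are shown as assumptions.

```python
# Toy illustration only: the counts below are invented, not taken from the study.
CLASSES = ["no relapse", "relapse <=6 months", "relapse >6 months"]

# confusion[i][j] = number of patients with true class i predicted as class j
confusion = [
    [85, 10, 5],  # true: no relapse
    [4, 15, 1],   # true: relapse <=6 months
    [5, 3, 2],    # true: relapse >6 months (rare class)
]


def per_class_f1(matrix):
    """Return the one-vs-rest F1 score of each class from a square confusion matrix."""
    n = len(matrix)
    scores = []
    for k in range(n):
        tp = matrix[k][k]
        fn = sum(matrix[k]) - tp                         # true k, predicted as another class
        fp = sum(matrix[i][k] for i in range(n)) - tp    # another class, predicted as k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


f1_scores = per_class_f1(confusion)
support = [sum(row) for row in confusion]  # number of true cases per class

# Macro average treats every class equally; the weighted average follows class size.
macro_f1 = sum(f1_scores) / len(f1_scores)
weighted_f1 = sum(f * s for f, s in zip(f1_scores, support)) / sum(support)

for name, f1, n in zip(CLASSES, f1_scores, support):
    print(f"{name:>20}: F1 = {f1:.3f} (n = {n})")
print(f"macro F1 = {macro_f1:.3f}, support-weighted F1 = {weighted_f1:.3f}")
```

In this toy setting the majority class dominates the support-weighted average, so a model can score well overall while still missing most cases in the smallest class. This is why per-class metrics are worth reporting alongside any aggregate.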
