Machine learning prediction of rapid HBsAg seroclearance at week 24 in inactive carriers treated with pegylated interferon.
Jianxia Dong, Shan Ren, Pengxuan Wu, Haitian Yu, Xinyue Meng, Jing Zhao, Xiangyang Ye, Yan Huang, Zujiang Yu, Wenhua Zhang, Yilan Zeng, Xiaozhong Wang, Haibing Gao, Shuangsuo Dang, Jiabin Li
Abstract
BACKGROUND AND AIM: To identify predictive factors for rapid hepatitis B surface antigen (HBsAg) seroclearance at week 24 in inactive HBsAg carriers (IHC) receiving pegylated interferon alpha-2b (Peg-IFN) therapy, and to develop a machine learning-based model to optimize individualized treatment strategies.
METHODS: This retrospective analysis was based on a multicenter, prospective cohort study involving 2,882 IHC patients treated with Peg-IFN and followed for at least 24 weeks. Predictive variables for week 24 HBsAg seroclearance were selected using both LASSO regression and the Boruta algorithm. Nine machine learning models were developed, including logistic regression (LR), decision tree (DT), and random forest (RF), with performance assessed via tenfold cross-validation. External validation was conducted in an independent cohort (n = 167) from three medical centers in Beijing. SHapley Additive exPlanations (SHAP) were used to interpret model predictions and feature importance.
RESULTS: The overall HBsAg seroclearance rate at week 24 was 18.7% (541/2,882). Key predictive factors included baseline HBsAg level, ≥ 1 log IU/mL decline in HBsAg at week 12, the ratio of alanine aminotransferase (ALT) to HBsAg at week 12, the ratio of week 12 ALT to baseline HBsAg, week 12 hepatitis B virus (HBV) DNA level, and week 12 hepatitis B surface antibody (HBsAb) level. The Light Gradient Boosting Machine (LightGBM) model demonstrated the best performance, achieving an area under the receiver operating characteristic curve (AUC) of 0.902 (95% CI 0.881-0.923) and a sensitivity of 0.889 in the training cohort, and an AUC of 0.917 (95% CI 0.850-0.983) with a sensitivity of 0.879 in the external validation cohort. SHAP analysis revealed that the week 12 ALT/HBsAg ratio was the most impactful feature.
CONCLUSIONS: We developed a LightGBM-based machine learning model that accurately predicts rapid HBsAg seroclearance at week 24 among IHC patients receiving Peg-IFN therapy. This model offers a valuable tool for early identification of rapid responders, personalized treatment planning, and potential discontinuation strategies. The individualized stopping rules derived from model-predicted probabilities provide an evidence-based approach to precision therapy in IHC patients.
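As an illustration of the derived predictors named in the RESULTS (the ≥ 1 log IU/mL HBsAg decline at week 12 and the two ALT-to-HBsAg ratios), the following is a minimal Python sketch and not the authors' implementation. The function name, the dictionary keys, the units (HBsAg in IU/mL, ALT in U/L), and the `hbsag_floor` value used to avoid division by zero for undetectable HBsAg are all assumptions made for this example.

```python
import math


def derive_week12_features(baseline_hbsag, week12_hbsag, week12_alt,
                           hbsag_floor=0.05):
    """Compute illustrative week 12 predictors for one patient.

    baseline_hbsag, week12_hbsag: HBsAg in IU/mL (assumed unit).
    week12_alt: ALT in U/L (assumed unit).
    hbsag_floor: hypothetical lower bound substituted for undetectable
        HBsAg so that logarithms and ratios stay defined; the abstract
        does not state how such values were handled.
    """
    if week12_alt < 0:
        raise ValueError("ALT must be non-negative")

    # Clamp HBsAg values to the assumed floor.
    base = max(baseline_hbsag, hbsag_floor)
    wk12 = max(week12_hbsag, hbsag_floor)

    # Decline in log10 IU/mL from baseline to week 12.
    log_decline = math.log10(base) - math.log10(wk12)

    return {
        "baseline_hbsag": base,
        "hbsag_log_decline_wk12": log_decline,
        "hbsag_decline_ge_1log": log_decline >= 1.0,
        "alt_to_hbsag_wk12": week12_alt / wk12,
        "wk12_alt_to_baseline_hbsag": week12_alt / base,
    }


if __name__ == "__main__":
    # Hypothetical patient: HBsAg falls from 1000 to 10 IU/mL, ALT 80 U/L.
    for name, value in derive_week12_features(1000.0, 10.0, 80.0).items():
        print(name, value)
```

Week 12 HBV DNA and HBsAb levels would enter such a feature table directly as measured values. How the study encoded them (for example, log-transformed or categorized) is not stated in the abstract.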