Predicting the incidence of common intestinal infectious diseases in Changzhou, China based on environmental factors and deep learning.
Xianzhi Zheng, Lei Qiao, Hao Hong, Yixin Zhang, Qinhui Fan, Jinglan Dai, Jingyi Zhao, Fang Yao, Sipeng Shen
Abstract
BACKGROUND: Intestinal infectious diseases are common and closely associated with meteorological conditions and air pollution. We aimed to construct a short-term prediction model for the daily incidence of common intestinal infectious diseases in Changzhou city.
METHODS: Daily incidence data for hand, foot, and mouth disease and other infectious diarrhea in Changzhou, together with daily meteorological and air pollutant data for the same period, were collected from May 13, 2014 to December 31, 2024. The meteorological data included temperature, humidity, wind speed, and air pressure, among others. The air pollutant data included PM2.5, PM10, SO2, NO2, O3, and the air quality index (AQI). Three models were constructed and compared: Long Short-Term Memory (LSTM), Transformer, and a hybrid model combining seasonal-trend decomposition with Transformer. Additionally, an advanced STL-T-L hybrid model (Seasonal-Trend decomposition using Loess, Transformer, and LSTM) was proposed for further analysis. Bayesian optimization was used to determine the hyperparameters of the deep learning models. The models integrated historical incidence data, environmental factors, and engineered features (time characteristics, lag terms, and rolling statistics). The root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and mean absolute scaled error (MASE) were calculated on an independent test set to evaluate prediction performance.
RESULTS: Among all evaluated models, the STL-T-L hybrid model showed the best prediction performance on the test set, with an RMSE of 6.337, an MAE of 4.524, a MAPE of 58.482%, and a MASE of 0.638. This model took into account the historical incidence of the diseases and incorporated multiple meteorological and air pollution factors.
CONCLUSION: The STL-T-L model can effectively predict the daily incidence of common intestinal infectious diseases and can be used as a tool for monitoring and early warning of intestinal infectious diseases in Changzhou.
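The four evaluation metrics named in METHODS can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy series are hypothetical, and two conventions are assumed because the abstract does not state them. First, MAPE is computed only over days with a non-zero observed count, since the percentage error is undefined when the observed value is zero. Second, MASE scales the test-set MAE by the MAE of a one-step naive forecast (yesterday's value) on the training series, so a value below 1 means the model beats that naive baseline.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: days with an observed count of zero are skipped,
    because the percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def mase(actual, predicted, train):
    """Mean absolute scaled error.

    Assumption: the scale is the in-sample MAE of a one-step naive
    forecast (previous day's value) on the training series.
    """
    naive_mae = sum(
        abs(train[i] - train[i - 1]) for i in range(1, len(train))
    ) / (len(train) - 1)
    return mae(actual, predicted) / naive_mae


# Hypothetical daily case counts, for illustration only.
train = [4, 6, 5, 9, 7]
actual = [8, 2, 10, 4]
predicted = [6, 3, 9, 6]

print(f"RMSE={rmse(actual, predicted):.3f}")
print(f"MAE={mae(actual, predicted):.3f}")
print(f"MAPE={mape(actual, predicted):.2f}%")
print(f"MASE={mase(actual, predicted, train):.3f}")
```

Under these conventions, a high MAPE can coexist with a MASE well below 1: on low-count days a small absolute miss is a large percentage of the observed value, while MASE compares absolute errors against a naive baseline measured on the same scale.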