Multi-output deep learning for high-frequency prediction of air and surface temperature in Kuwait.
Shehroz S Khan, Rami Al-Hajj
Abstract
Accurate prediction of air and surface temperature is essential for urban planning and climate resilience, especially in arid regions. This study evaluates the performance of multi-output regression models using high-frequency climate data collected every 5 min over four years in Kuwait. Thirty environmental variables (including humidity, solar radiation, dew point, and wind direction) were used to predict six air and surface temperature-related outcomes simultaneously. Ten models, spanning deep learning and traditional machine learning approaches, were benchmarked using a leave-one-year-out validation strategy. The contextual-embedding-based Transformer (FTTransformer) and Long Short-Term Memory (LSTM) models achieved strong predictive performance, with an R² of 0.998, a mean squared error of 0.13, and a mean absolute error of 0.24 when forecasting the six temperature variables at 5-min resolution. These results significantly outperform traditional machine learning models and demonstrate the robustness of deep learning approaches for high-frequency climate prediction. However, LSTM's performance degraded on anomalous data from previous years, whereas FTTransformer maintained stable accuracy across years. Model interpretation using SHAP and permutation importance identified the key predictors for this task, underlining the significance of diverse climate features.
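The leave-one-year-out strategy and the three reported metrics can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `leave_one_year_out` and `multi_output_scores` are hypothetical, and averaging each metric uniformly over the target columns is an assumption, since the abstract does not state how the six outputs were aggregated.

```python
from statistics import mean


def leave_one_year_out(years):
    """Yield (held_out_year, train_idx, test_idx), holding out one year per fold.

    `years` holds the calendar year of each sample (hypothetical input format).
    """
    for held_out in sorted(set(years)):
        train_idx = [i for i, y in enumerate(years) if y != held_out]
        test_idx = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train_idx, test_idx


def multi_output_scores(y_true, y_pred):
    """Return (R^2, MSE, MAE), each averaged uniformly over target columns.

    Uniform averaging is an assumption, not something the abstract specifies.
    Each target column must vary within y_true, otherwise R^2 is undefined.
    """
    n_targets = len(y_true[0])
    r2s, mses, maes = [], [], []
    for j in range(n_targets):
        t = [row[j] for row in y_true]
        p = [row[j] for row in y_pred]
        errs = [a - b for a, b in zip(t, p)]
        ss_res = sum(e * e for e in errs)           # residual sum of squares
        t_bar = mean(t)
        ss_tot = sum((a - t_bar) ** 2 for a in t)   # total sum of squares
        r2s.append(1.0 - ss_res / ss_tot)
        mses.append(ss_res / len(t))
        maes.append(mean(abs(e) for e in errs))
    return mean(r2s), mean(mses), mean(maes)


if __name__ == "__main__":
    # Toy year labels; real data would have one label per 5-min record.
    sample_years = [2019, 2019, 2020, 2020, 2021, 2021]
    for year, train_idx, test_idx in leave_one_year_out(sample_years):
        print(year, train_idx, test_idx)
```

In each fold, a model would be fitted on the rows in `train_idx` and scored on the rows in `test_idx`, so every year serves once as unseen data. Comparing the per-fold scores is what exposes year-specific degradation of the kind reported for LSTM.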