A Novel Self-Attention Mechanism-Based Dynamic Ensemble Model for Soil Hyperspectral Prediction
Keyang Yin, Jia Deng, Huixia Li, Chunhui Feng, Jie Peng
Abstract
Visible-near-infrared spectroscopy enables rapid, non-destructive soil organic matter (SOM) detection, yet its prediction accuracy relies heavily on the effectiveness of the chosen algorithmic models. Weighted Averaging Ensemble Models (WAEM) are robust but face a key challenge: their performance depends on optimal base learner weight allocation, which standard evaluation indices often fail to achieve, risking biased weights and local optima. This study enhances WAEM by determining optimal weights using information extracted from the model training process via seven methods, including reinforcement learning and a self-attention mechanism (Sam). Experiments on 704 soil samples from China's Tarim River Basin employed a dynamic data structure for real-time weight updating. Results show that six WAEM methods utilizing training process information outperformed conventional evaluation index approaches. These improvements reduced WAEM root mean square error (RMSE) by 0.028–1.279 g kg⁻¹ and increased the coefficient of determination (R²) by up to 0.06. Sam achieved the highest performance, with R² and RMSE reaching 0.927 and 2.325 g kg⁻¹, respectively. Furthermore, model R² began converging at 26 base learners, indicating diminishing returns from adding more. This research confirms that dynamic WAEM weight allocation via Sam significantly boosts SOM prediction accuracy, providing a scientific foundation for infrared-based soil monitoring.
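To make the weight-allocation problem concrete, the following minimal Python sketch contrasts two ways of weighting base learners in a weighted averaging ensemble. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how Sam derives its weights, so `attention_weights` uses a generic scaled dot-product self-attention without learned projections, and the per-learner "training history" vectors, the softmax-over-negative-RMSE baseline, and all function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def index_weights(y_val, base_preds):
    """Baseline (assumed form): weights from one evaluation index.

    Each learner is scored by its negative validation RMSE, so a lower
    error yields a larger weight after the softmax.
    """
    return softmax([-rmse(y_val, preds) for preds in base_preds])


def attention_weights(histories):
    """Toy self-attention over per-learner training-history vectors.

    Assumption: each learner is described by a fixed-length vector taken
    from its training process (for example, a loss curve). Queries, keys
    and values are the raw vectors; a learner's weight is the average
    attention it receives from all learners.
    """
    dim = len(histories[0])
    received = [0.0] * len(histories)
    for query in histories:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in histories
        ]
        for j, attn in enumerate(softmax(scores)):
            received[j] += attn
    # Each attention row sums to 1, so dividing by the row count normalizes.
    return [r / len(histories) for r in received]


def weighted_average(base_preds, weights):
    """Combine base learner predictions sample by sample."""
    return [
        sum(w * p for w, p in zip(weights, sample_preds))
        for sample_preds in zip(*base_preds)
    ]


# Illustrative values only; these are not data from the study.
y_val = [10.0, 12.0, 15.0]
base_preds = [[10.5, 12.5, 15.5], [8.0, 14.0, 18.0]]
weights = index_weights(y_val, base_preds)
ensemble_pred = weighted_average(base_preds, weights)
```

Both weighting functions return non-negative weights that sum to one, so either can be passed to `weighted_average` unchanged. Recomputing the weights whenever a base learner is added or retrained would be one simple way to approximate the real-time updating described above.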