Irradiance dataset in the south of Colombia from 2013 to 2023 in 5-minute intervals.
John Barco-Jiménez, Daniel Rosero, Andrés Zambrano, Francisco Eraso-Checa, Miller Ruales, José Camilo Eraso
Abstract
This article presents an extensive irradiance dataset collected in San Juan de Pasto, in southern Colombia, using a Davis Vantage PRO 2 meteorological station. The dataset spans 11 years, from 2013 to 2023, with measurements taken at 5-minute intervals, resulting in approximately 603,495 irradiance records, each with a corresponding timestamp. Building the dataset required a rigorous preprocessing stage: removing invalid values (NaN) and outliers, identifying missing entries, and correcting inconsistencies in the date records. Missing values were filled using averaged data, and the results were checked visually with plots. The cleaned dataset was exported once its integrity, accuracy, and consistency had been verified, since these properties are essential for reliable analysis and subsequent modeling. The dataset is well suited to building training sets for artificial intelligence models that perform short-, medium-, and long-term irradiance forecasting. For instance, Barco-Jiménez et al. (2021) used a portion of this dataset to develop multitemporal irradiance predictions. Such predictive models can be applied in energy management, grid optimization, and solar energy production planning. The dataset also supports statistical analyses for sizing photovoltaic systems through indicators such as Hours of Peak Sunlight (HPS), maximum and minimum irradiance values, average daily and monthly irradiance, and seasonal trends. These indicators are fundamental to optimizing photovoltaic system performance, helping to reduce costs and improve energy efficiency across rural, residential, and commercial applications.
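The preprocessing stage described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 1500 W/m^2 outlier threshold, and the choice to fill a gap with the mean of valid readings from the same time of day on other days are all assumptions made for illustration, since the abstract only states that gaps were filled using averaged data.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

STEP = timedelta(minutes=5)   # sampling interval stated in the abstract
MAX_PLAUSIBLE = 1500.0        # W/m^2, assumed upper bound for outlier screening


def clean_records(records):
    """Drop None/NaN readings and values outside [0, MAX_PLAUSIBLE]."""
    cleaned = {}
    for ts, value in records:
        if value is None or math.isnan(value):
            continue
        if not 0.0 <= value <= MAX_PLAUSIBLE:
            continue
        cleaned[ts] = value  # a repeated timestamp keeps its last valid reading
    return cleaned


def fill_gaps(cleaned, start, end):
    """Build a regular 5-minute grid and fill holes with the time-of-day mean.

    Returns the filled series and the list of timestamps that were missing.
    """
    by_time_of_day = defaultdict(list)
    for ts, value in cleaned.items():
        by_time_of_day[ts.time()].append(value)

    filled, missing = {}, []
    ts = start
    while ts <= end:
        if ts in cleaned:
            filled[ts] = cleaned[ts]
        else:
            missing.append(ts)
            same_slot = by_time_of_day.get(ts.time())
            # Leave None when no other day has a valid reading for this slot.
            filled[ts] = sum(same_slot) / len(same_slot) if same_slot else None
        ts += STEP
    return filled, missing
```

Returning the list of missing timestamps alongside the filled series keeps the filled values traceable, which helps with the kind of visual inspection the abstract mentions.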
This dataset supports photovoltaic system design and studies on solar energy variability and climate patterns in the region. Analysis of irradiance fluctuations over time provides insights into the influence of atmospheric conditions on solar energy availability. This information is essential for enhancing the reliability of solar power systems and effectively integrating renewable energy sources into existing power grids. The dataset can also be used in educational settings to teach data analysis techniques and renewable energy concepts, providing students and researchers with a practical resource for hands-on learning.
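As an illustration of the sizing indicators named in the abstract, the following Python sketch computes daily Hours of Peak Sunlight (HPS), maximum irradiance, and mean irradiance from 5-minute records. It assumes the common definition of HPS as daily irradiation (Wh/m^2) divided by a reference irradiance of 1000 W/m^2, and approximates the integral with a simple rectangle rule. The function name and record layout are hypothetical, not taken from the dataset's documentation.

```python
from collections import defaultdict
from datetime import datetime

SAMPLE_HOURS = 5 / 60          # each record covers a 5-minute interval
REFERENCE_IRRADIANCE = 1000.0  # W/m^2, assumed "peak sun" reference


def daily_indicators(records):
    """Return {date: {"hps", "max", "mean"}} from (timestamp, W/m^2) pairs."""
    by_day = defaultdict(list)
    for ts, value in records:
        by_day[ts.date()].append(value)

    summary = {}
    for day, values in sorted(by_day.items()):
        energy_wh = sum(values) * SAMPLE_HOURS  # rectangle-rule integral, Wh/m^2
        summary[day] = {
            "hps": energy_wh / REFERENCE_IRRADIANCE,
            "max": max(values),
            "mean": sum(values) / len(values),
        }
    return summary
```

The daily mean here is taken over whatever records are supplied, so including night-time zeros lowers it. Days with unfilled gaps will understate HPS, which is one reason gap handling matters before computing these indicators.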