A Machine-Learning-Based Prediction Model for Total Glycoalkaloid Accumulation in Yukon Gold Potatoes
Saipriya Ramalingam, Diksha Singla, Mainak Pal Chowdhury, Michele Konschuh, Chandra Bhan Singh
Abstract
Potatoes are the most extensively cultivated vegetable crop in Canada and rank as the fifth largest primary agricultural commodity. Given their diverse end uses and significant market value, particularly in processed forms, ensuring consistent quality from harvest to consumption is critical. Total glycoalkaloids (TGA) are nitrogen-containing secondary metabolites known to accumulate in the tuber as a result of greening, whether in the field or elsewhere in the supply chain. In this study, 210 Yukon Gold (YG) potatoes were exposed to a constant light source to induce greening over a period of 14 days and were sampled at 7-day intervals. The samples were scanned using a short-wave infrared (SWIR) hyperspectral imaging camera in the 900–2500 nm wavelength range. After each tuber was scanned individually, pixel-wise spectral data were extracted, averaged per tuber, and matched with the corresponding ground-truth TGA value obtained using a high-performance liquid chromatography (HPLC) system. Prediction models were developed from the extracted hyperspectral data and the reference TGA values using partial least squares regression. Wavelength selection techniques, namely competitive adaptive re-weighted sampling (CARS) and backward elimination (BE), were applied to reduce the number of contributing wavelengths for practical applications. The best model achieved a correlation coefficient of cross-validation (R²cv) of 0.72 with a root mean square error of cross-validation (RMSEcv) of 51.50 ppm.
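To make the two reported figures of merit concrete, the following minimal sketch shows one common way to compute R²cv and RMSEcv from reference values and cross-validated predictions. It assumes the R²cv is defined as 1 − SS_res/SS_tot; the abstract does not state which definition the authors used. The function name `cv_metrics` and the TGA values are hypothetical and are not taken from the study.

```python
import math


def cv_metrics(reference, predicted):
    """Return (r2_cv, rmse_cv) for reference values and cross-validated predictions.

    Assumption: R2 is computed as 1 - SS_res / SS_tot. The source abstract
    does not specify the definition behind its reported R2cv.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(reference)
    mean_ref = sum(reference) / n

    # Residual sum of squares between reference and predicted values.
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    # Total sum of squares of the reference values around their mean.
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)
    if ss_tot == 0:
        raise ValueError("reference values are constant; R2 is undefined")

    r2_cv = 1.0 - ss_res / ss_tot
    rmse_cv = math.sqrt(ss_res / n)  # same units as the reference values
    return r2_cv, rmse_cv


# Hypothetical TGA values in ppm, for illustration only (not study data).
reference_tga = [100.0, 150.0, 200.0, 250.0, 300.0]
predicted_tga = [110.0, 140.0, 210.0, 240.0, 310.0]

r2_cv, rmse_cv = cv_metrics(reference_tga, predicted_tga)
print(f"R2cv = {r2_cv:.2f}, RMSEcv = {rmse_cv:.2f} ppm")
```

In a cross-validation setting, each entry of `predicted_tga` would come from a model that was fitted without the corresponding tuber, so the two metrics describe performance on held-out samples and not the quality of fit on the training data.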