Insight into the influence of various cultivation regions on the identification of metabolites from Capsella bursa-pastoris via a clustering algorithm.
In Young Lee, Doo-Hee Lee, Min Ju Lee, Ju Hong Park, Nami Joo
Abstract
This study examined regional differences in the metabolites of C. bursa-pastoris cultivated across Korea using UHPLC‒HRMS‒based untargeted metabolomics. Extensive screening was conducted on samples collected from five distinct sites in 20 cities, resulting in the identification of 311 primary and secondary metabolites. The samples were classified on the basis of metabolite concentrations using six clustering techniques (K-means, agglomerative clustering, spectral clustering, Birch, mini-batch K-means, and bisecting K-means). Twenty key features that significantly influenced the clustering were extracted and validated. The classification results correlated strongly with the geographical location of the cultivation site. C. bursa-pastoris from inland regions presented relatively high concentrations of sulfur-containing compounds, such as glucosinolic acid and isothiocyanate. The findings of this study provide valuable insights into the integration of machine learning techniques with untargeted metabolomics, facilitating the development of targeted phytochemical profiles.
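To make the idea of clustering samples by metabolite concentration concrete, the following is a minimal sketch of only one of the six named techniques (K-means), written in plain Python. It is not the authors' pipeline: the concentration values, the three feature columns, the inland/coastal grouping of the toy rows, and the helper names (zscore_columns, farthest_point_init, kmeans) are all hypothetical and chosen purely for illustration. Farthest-point initialization is also an assumption, used here only to keep the example deterministic.

```python
import math

# Hypothetical concentration table: rows are samples, columns are three
# made-up metabolite features. The first four rows mimic "inland" samples,
# the last four mimic "coastal" samples. These are NOT values from the study.
samples = [
    [9.8, 7.9, 2.0], [10.2, 8.3, 2.2], [10.0, 8.0, 1.9], [9.9, 8.1, 2.1],
    [4.1, 3.0, 6.0], [3.9, 2.8, 5.8], [4.0, 3.2, 6.1], [4.2, 3.1, 5.9],
]

def zscore_columns(rows):
    """Scale each feature (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def farthest_point_init(rows, k):
    """Deterministic start: first row, then the row farthest from chosen centers."""
    centers = [list(rows[0])]
    while len(centers) < k:
        nxt = max(rows, key=lambda r: min(math.dist(r, c) for c in centers))
        centers.append(list(nxt))
    return centers

def kmeans(rows, k, n_iter=50):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    centers = farthest_point_init(rows, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                      for r in rows]
        if new_labels == labels:  # assignments stable, so the run has converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

labels = kmeans(zscore_columns(samples), k=2)
print(labels)
```

In this toy setting the cluster labels can be compared directly with the known origin of each row, which mirrors the abstract's comparison of cluster membership against cultivation site. With real data, the number of clusters and the scaling step would need to be chosen and justified separately.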