Identifying key features for determining the patterns of patients with functional dyspepsia using machine learning.
Heeyoung Moon, Da-Eun Yoon, Junsuk Kim, Younkuk Choi, Heekyung Kim, In-Seon Lee, Younbyoung Chae
Abstract
Background and aims: Pattern identification (PI) provides a basis for understanding disease symptoms and signs. The aims of this study were to extract features for identifying conventional PI types from the questionnaire data of patients with functional dyspepsia (FD) using supervised learning, and to compare them with the features of novel PI types identified using unsupervised learning. Methods: In total, 153 patients with FD were invited to complete the Standardized Tool for Pattern Identification of Functional Dyspepsia (STPI-FD) questionnaire. Supervised analysis using a support vector machine (SVM) was conducted to extract the most discriminative features. For unsupervised analysis, t-distributed stochastic neighbor embedding (t-SNE) and k-means clustering were applied to detect novel subgroups, and independent-samples t-tests were performed to identify the features that distinguish the clusters. Results: The SVM identified loss of appetite, flank discomfort, abdominal bloating or gurgling, and pale or yellowish complexion as the most discriminative features. Unsupervised clustering revealed four distinct patient subgroups with differing predominant symptom profiles, such as systemic symptoms, upper abdominal symptoms, changes in bowel movements, and nausea/vomiting. Conclusion: Through supervised learning, we identified the most important features for PI. Additionally, we proposed a novel unsupervised learning approach for identifying patterns from patient data. This study could facilitate clinical decision making for patients with FD.
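The unsupervised step described in the Methods (cluster the questionnaire responses, then test which items differ between clusters) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything here is an assumption made for illustration: the data are synthetic (two hypothetical symptom profiles over three items, rather than the STPI-FD responses), the t-SNE embedding and the SVM step are omitted, two clusters are used instead of four, and the k-means seeding is a simple farthest-first heuristic chosen to keep the run deterministic. Only the Python standard library is used, and only the t statistic is computed, not a p-value.

```python
import math
import random
import statistics


def farthest_first(points, k):
    """Deterministic seeding (assumed heuristic): start at points[0], then
    repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [statistics.fmean(col) for col in zip(*members)]
    return labels, centroids


def pooled_t(x, y):
    """Student's independent-samples t statistic (equal variances assumed)."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * statistics.variance(x)
           + (ny - 1) * statistics.variance(y)) / (nx + ny - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))


# Synthetic "questionnaire" scores: 20 hypothetical patients per profile.
rng = random.Random(0)
profile_a = [4.0, 1.0, 1.0]  # item 0 elevated
profile_b = [1.0, 4.0, 1.0]  # item 1 elevated
patients = [[rng.gauss(m, 0.3) for m in profile_a] for _ in range(20)]
patients += [[rng.gauss(m, 0.3) for m in profile_b] for _ in range(20)]

labels, centroids = kmeans(patients, k=2)

# Per-item t statistic between cluster 0 and cluster 1.
t_stats = []
for j in range(3):
    g0 = [p[j] for p, lab in zip(patients, labels) if lab == 0]
    g1 = [p[j] for p, lab in zip(patients, labels) if lab == 1]
    t_stats.append(pooled_t(g0, g1))

# Items ranked by how strongly they separate the two clusters.
ranking = sorted(range(3), key=lambda j: abs(t_stats[j]), reverse=True)
print(ranking, [round(t, 1) for t in t_stats])
```

In this toy setting, the items with the largest absolute t statistics are the ones that define each cluster's predominant profile, which is the role the t-tests play in the analysis summarized above. With more than two clusters, the same statistic would be computed for each pair of clusters, or for one cluster against the rest.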