A dual-modality machine learning precision diagnostic model integrated radiomics and proteomics for breast cancer.
Pengping Li, Yuan Liu, Ren Liu, Yuqin Huang, Ke Sun, Kexin Yin, Jiajia Lu, Lanqing Li, Shuirong Zhang, Claire Y Tong, Jiayi Liu, Junli Gao, Zhenyu Wang
Abstract
Background: This study aimed to construct a dual-modal machine learning model that integrates ultrasound radiomics and plasma proteomics for the precise diagnosis of breast cancer. Methods: Using a multi-source data integration strategy, 10 protein markers and 14 ultrasound radiomics features were selected from the TCGA and CPTAC databases and from a clinical cohort (60 healthy controls, 60 patients with benign breast disease, and 60 patients with breast cancer), based on plasma protein mass spectrometry and ultrasound data. A dual-modal diagnostic model was then built with machine learning algorithms. Results: The protein marker model performed very well in the primary screening task of separating healthy individuals from patients with breast disease (highest AUC 0.974), but its performance was limited in differentiating benign from malignant disease (AUC < 0.8 under multiple algorithms). The dual-modal model performed well in differentiating benign from malignant lesions (AUC = 0.938), significantly outperforming the proteomics-only model (AUC = 0.830) and the radiomics-only model (AUC = 0.841). Conclusion: This study confirmed the synergistic diagnostic value of plasma proteins and ultrasound images, providing a new strategy that combines accuracy and accessibility for the stratified diagnosis of breast cancer.
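All comparisons in the abstract are reported as AUC, and the dual-modal model draws on 10 protein markers plus 14 radiomics features per patient. The following is a minimal sketch, not the authors' pipeline, of two pieces that such a comparison would plausibly involve: joining the two feature vectors per patient (feature-level fusion is an assumption here, since the abstract does not state how the modalities were combined) and computing AUC from labels and model scores. The function names `fuse_features` and `roc_auc` are hypothetical, and the scores in the example are made-up toy values, not study data.

```python
from typing import List, Sequence


def fuse_features(
    proteomic_rows: Sequence[Sequence[float]],
    radiomic_rows: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Feature-level fusion: concatenate both feature vectors per patient.

    Assumption: rows are aligned, i.e. row i in both inputs is the same patient.
    """
    if len(proteomic_rows) != len(radiomic_rows):
        raise ValueError("both modalities must cover the same patients")
    return [list(p) + list(r) for p, r in zip(proteomic_rows, radiomic_rows)]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data only: 0 = benign, 1 = malignant; scores are invented.
    labels = [0, 0, 0, 1, 1, 1]
    toy_scores = [0.1, 0.4, 0.6, 0.5, 0.7, 0.9]
    print(f"toy AUC: {roc_auc(labels, toy_scores):.3f}")

    # Two patients, 10 protein markers and 14 radiomics features each.
    proteins = [[0.0] * 10, [1.0] * 10]
    radiomics = [[0.0] * 14, [1.0] * 14]
    fused = fuse_features(proteins, radiomics)
    print(f"fused vector length: {len(fused[0])}")
```

In practice the fused vectors would be passed to a classifier, and `roc_auc` would be applied to its held-out predictions once per feature set (proteomics only, radiomics only, fused) so that the three AUC values are directly comparable.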