A multimodal approach for cardiac signals classification using deep learning with explainable AI methods.
Ali Mohammad Alqudah, Ausilah Alfraihat
Abstract
Cardiovascular diseases remain a leading cause of mortality worldwide, necessitating accurate and timely diagnosis. Electrocardiogram (ECG) and phonocardiogram (PCG) signals provide complementary information about cardiac function, capturing electrical and mechanical activity, respectively. In this study, we propose a multimodal deep learning framework that integrates ECG and PCG using a dual-branch CNN-BiLSTM-SE architecture with cross-modal attention. Our preprocessing pipeline includes wavelet denoising, adaptive filtering, and normalization, with parameters tuned for each dataset's noise profile. We evaluate the model on multiple datasets: MIT-BIH Arrhythmia (47 subjects), PTB Diagnostic ECG (290 subjects), PhysioNet PCG Challenge 2016 (3126 subjects), PhysioNet PCG Challenge 2022 (942 subjects), and a custom multimodal dataset (500 subjects). The model achieves an overall accuracy of 97.0%, F1-scores ranging from 94.3% to 98.1%, and AUC values above 0.982 for all classes, outperforming single-modality and existing multimodal methods. Explainable AI techniques (SHAP, Grad-CAM, Integrated Gradients) reveal that the model focuses on clinically relevant features such as irregular R-R intervals in atrial fibrillation and systolic murmurs in valvular disease. The proposed approach offers a feasible, interpretable, and accurate decision-support system for cardiac diagnosis.
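To make the cross-modal attention idea concrete, the following is a minimal sketch of one plausible form: scaled dot-product attention in which ECG feature steps act as queries over PCG feature steps. The abstract does not specify the attention formulation, so the function names, the absence of learned projection matrices, the toy feature values, and the fusion by concatenation are all illustrative assumptions and not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities.

    Each query step (here: an ECG feature vector) attends over all
    key/value steps (here: PCG feature vectors).

    queries: list of T_q vectors of length d
    keys:    list of T_k vectors of length d
    values:  list of T_k vectors of length d_v
    Returns (context, weights) with shapes T_q x d_v and T_q x T_k.
    """
    scale = math.sqrt(len(queries[0]))
    context, weights = [], []
    for q in queries:
        # Similarity of this query step to every key step.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        ctx = [
            sum(w[t] * values[t][j] for t in range(len(values)))
            for j in range(len(values[0]))
        ]
        weights.append(w)
        context.append(ctx)
    return context, weights


# Hypothetical toy features standing in for the outputs of the two branches.
ecg_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 time steps, d = 2
pcg_feats = [[1.0, 0.0], [0.0, 1.0]]              # 2 time steps, d = 2

ecg_context, ecg_to_pcg = cross_modal_attention(ecg_feats, pcg_feats, pcg_feats)

# Assumed fusion: concatenate each ECG step with its PCG-derived context.
fused = [e + c for e, c in zip(ecg_feats, ecg_context)]
```

In a trained model the queries, keys, and values would normally pass through learned linear projections, and the attention could also run in the opposite direction (PCG attending to ECG). The sketch omits both to keep the mechanism visible.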