Attention-Based Multimodal Deep Learning for Uveal Melanoma Classification Using Ultra-Widefield Fundus Images and Ocular Ultrasound.
Albert K Dadzie, Sabrina P Iddir, Mansour Abtahi, Behrouz Ebrahimi, Mojtaba Rahimi, Sanjay Ganesh, Taeyoon Son, Michael J Heiferman, Xincheng Yao
Abstract
Purpose: To develop and evaluate a deep learning model that integrates ultra-widefield fundus photography and B-scan ultrasonography for automated classification of uveal melanoma (UM) and choroidal nevi. Design: A retrospective cross-sectional study. Subjects: This study included 174 patients (93 with UM and 81 with choroidal nevi) diagnosed at a tertiary eye center. For each patient, ultra-widefield fundus photographs and B-scan ultrasound images in both transverse and longitudinal orientations were acquired. Methods: A deep learning model was trained using ultra-widefield fundus photography, ultrasound images, and combinations of both. Fivefold cross-validation was used to evaluate model performance. Main Outcome Measures: The deep learning models were evaluated using accuracy, F1 score, and area under the receiver operating characteristic curve (AUC). Results: Uveal melanomas had a mean thickness of 6.0 mm and a basal diameter of 12.6 mm, whereas nevi measured 1.8 mm and 6.5 mm, respectively. Among single-modality models, the model trained on transverse ultrasound images achieved the highest performance (accuracy: 92%; F1 score: 0.9227; AUC: 0.9538). Averaging predictions from the single-modality models provided only modest gains because their outputs sometimes conflicted. In contrast, the model that combined fundus photographs and ultrasound images using an attention mechanism achieved the highest overall performance (accuracy: 94%; F1 score: 0.9445; AUC: 0.9606), outperforming all other configurations by effectively integrating complementary information from both modalities. Conclusions: Multimodal deep learning that combines fundus photography and ultrasound imaging improves the classification of UM and choroidal nevi. This approach demonstrates feasibility for leveraging the strengths of each modality for automated classification of UM and choroidal nevi.
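The contrast between averaging single-modality predictions and attention-based fusion can be illustrated with a small sketch. The abstract does not describe the network architecture, so everything below is an assumption made for illustration: the feature dimension, the random feature vectors standing in for image encoders, the `score_vector` parameter, and the function names `average_fusion` and `attention_fusion` are hypothetical and are not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def average_fusion(probs):
    """Late fusion: unweighted mean of per-modality P(UM) values."""
    return float(np.mean(probs))


def attention_fusion(features, score_vector):
    """Attention-weighted fusion of per-modality feature vectors.

    features:     array of shape (n_modalities, d), one row per modality
    score_vector: array of shape (d,), a learnable parameter in a real model
                  (hypothetical here; filled with random values below)
    Returns the fused feature vector (d,) and the attention weights.
    """
    scores = features @ score_vector   # one scalar score per modality
    weights = softmax(scores)          # non-negative, sums to 1
    fused = weights @ features         # weighted sum of modality features
    return fused, weights


# Stand-in features for fundus, transverse ultrasound, longitudinal ultrasound.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 8))
score_vector = rng.normal(size=8)

fused, weights = attention_fusion(features, score_vector)

# Conflicting single-modality outputs: plain averaging weights every
# modality equally, whichever one is more informative for this case.
conflicting = np.array([0.9, 0.2, 0.6])
avg = average_fusion(conflicting)
```

In this sketch the attention weights depend on the input features, so the relative contribution of each modality can vary from case to case, whereas averaging always weights the modalities equally. In a trained model the fused vector would then be passed to a classification head, which is omitted here.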
Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.