Multimodal Deep Learning Differentiates Papilledema and Non-Arteritic Anterior Ischemic Optic Neuropathy From Healthy Eyes.
David Szanto, Asala Erekat, Brian Woods, Jui-Kai Wang, Mona Garvin, Brett Johnson, Randy Kardon, Michael Wall, Edward Linton, Mark J Kupersmith
Abstract
Purpose: Optic nerve head (ONH) swelling, a critical feature in idiopathic intracranial hypertension (IIH) and non-arteritic anterior ischemic optic neuropathy (NAION), can present diagnostic challenges. We explored a multimodal deep learning (DL) approach integrating optical coherence tomography (OCT) scans and fundus photographs to enhance diagnostic accuracy for differentiating IIH, NAION, and healthy eyes. Methods: We developed two separate models using 7019 OCT scans (3D-ResNet-18) and 17,657 fundus photos (ResNet-50) to classify eyes with papilledema (2315 OCT, 6349 fundus), NAION (841 OCT, 1814 fundus), and healthy eyes (3863 OCT, 9494 fundus). We arranged the dataset so that the test set consisted entirely of same-day OCT scans and fundus photos, with each modality (OCT and fundus) contributing at least 15% of the data for each class. We combined the output probabilities from both models using two methods: an F1-weighted sum by class (F1WS) and an XGBoost model. We evaluated the performance of each with the area under the receiver operating characteristic curve (AUC-ROC), accuracy, precision, recall, and F1 score. Results: The OCT model alone achieved a test accuracy of 93.5%, and the fundus photo model alone achieved 93.9%. The multimodal F1WS and XGBoost models achieved accuracies of 97.5% and 98.3%, respectively. Conclusions: Combining OCT and fundus photographs improves the classification of IIH, NAION, and healthy eyes, showing the value of using complementary imaging modalities. This approach supports the use of DL to aid diagnosis and clinical management of optic nerve head swelling. It may also be extended to leverage DL from additional data sources, such as macular scans or visual field tests.
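The abstract does not give the exact formula for the F1-weighted sum, so the following is only a minimal sketch of one plausible reading. It assumes that each model's per-class F1 score (measured on held-out data) is normalized across the two models to form a per-class weight, that the weighted class probabilities are summed, and that the fused scores are renormalized to sum to 1. The function names, the class order, and all numeric values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence

# Assumed class order; the study's actual label encoding is not stated.
CLASSES = ("papilledema", "NAION", "healthy")


def f1_weighted_sum(
    p_oct: Sequence[float],
    p_fundus: Sequence[float],
    f1_oct: Sequence[float],
    f1_fundus: Sequence[float],
) -> List[float]:
    """Fuse two models' class probabilities with per-class F1 weights.

    Assumption: for each class, the weight of a model is its F1 score
    for that class divided by the sum of both models' F1 scores.
    """
    fused = []
    for c in range(len(CLASSES)):
        total = f1_oct[c] + f1_fundus[c]
        w_oct = f1_oct[c] / total
        w_fundus = f1_fundus[c] / total
        fused.append(w_oct * p_oct[c] + w_fundus * p_fundus[c])
    # Renormalize so the fused scores can be read as probabilities.
    norm = sum(fused)
    return [score / norm for score in fused]


def predict(fused: Sequence[float]) -> str:
    """Return the class label with the highest fused score."""
    best = max(range(len(fused)), key=lambda c: fused[c])
    return CLASSES[best]


# Hypothetical per-class F1 scores and softmax outputs for one eye.
f1_oct = [0.90, 0.80, 0.95]
f1_fundus = [0.90, 0.90, 0.95]
p_oct = [0.5, 0.4, 0.1]      # OCT model leans toward papilledema
p_fundus = [0.2, 0.7, 0.1]   # fundus model leans toward NAION

fused = f1_weighted_sum(p_oct, p_fundus, f1_oct, f1_fundus)
label = predict(fused)
```

In this made-up case the two models disagree, and the fused prediction follows the fundus model because it is both more confident and carries the higher assumed F1 weight for NAION. When both models have equal F1 scores for every class, the scheme reduces to a plain average of the two probability vectors.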