JSES international

Trustworthy deep learning for the automated quantification of the fatty infiltration of the rotator cuff muscles using magnetic resonance imaging.

Asma Salhi, Kristine Italia, Ignacio Viedma, Katreese Samsuya, Roberto Pareyon, Freek Hollman, Mohammad Jomaa, Helen Ingoe, Jashint Maharaj, Kenneth Cutbush, Ashish Gupta

Published: 2025. DOI: 10.1016/j.jseint.2025.06.020

Abstract

Open Access

Background: The current method of classifying fatty infiltration is highly subjective and has low reliability, which may affect decision-making in the management of rotator cuff tears. The purpose of this study was to present and evaluate a new deep-learning (DL) approach to automatically and objectively classify fatty infiltration of rotator cuff muscles on magnetic resonance imaging (MRI).
Methods: A validated dataset of 1,149 images of segmented rotator cuff muscles, derived from 383 patients, was classified using a simplified grading system (normal, mild, severe) based on the original Goutallier classification. These images and their classifications were used to train the artificial intelligence models. A novel DL pipeline comprising in-domain transfer learning, feature fusion, and machine learning classifiers was proposed for automatic fatty infiltration classification. The pretrained DL models Xception, InceptionV3, and MobileNetV2 were trained separately. Then, K-Nearest Neighbor, Support Vector Machine, and Naive Bayes classifiers were trained using fused features extracted by the 3 DL models from the delineated rotator cuff muscle areas. Model performance was evaluated using metrics including accuracy, precision, recall, and F1-score, together with Gradient-Weighted Class Activation Mapping visualizations.
Results: Among the individual models, MobileNetV2 demonstrated the highest overall performance, with accuracy of 89.5%, specificity of 94.7%, recall of 89.5%, precision of 90.5%, and F1-score of 90.0%. After feature fusion, K-Nearest Neighbor achieved the highest performance, with accuracy of 91.1%, specificity of 95.5%, recall of 91.1%, precision of 93.1%, and F1-score of 92.1%. Overall, the performance metrics after feature fusion were higher than those of the individual models and approached the consistency of clinical experts (intraclass correlation coefficient 0.91).
Conclusion: This study provides evidence for the effective use of artificial intelligence advancements in the automated classification of fatty infiltration of rotator cuff muscles on MRI using in-domain transfer learning, feature fusion, and machine learning classifiers. By combining these 3 components, the proposed approach has excellent potential to achieve accurate and robust classification, with a level of consistency in line with expert agreement. As such, this approach offers a promising solution for automating the classification of fatty infiltration on MRI, which may benefit daily clinical practice.
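
The fusion-then-classify step and the reported metrics can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not state how features were fused, so concatenation is assumed here, and the k-nearest-neighbor vote (k = 3, Euclidean distance), the macro-averaged one-vs-rest metrics, and the tiny 2-value feature vectors standing in for Xception, InceptionV3, and MobileNetV2 outputs are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

CLASSES = ("normal", "mild", "severe")  # simplified grades named in the abstract


def fuse(*feature_vectors):
    """Fuse per-backbone feature vectors by concatenation (assumed operator)."""
    fused = []
    for vec in feature_vectors:
        fused.extend(vec)
    return fused


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query` (Euclidean)."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def macro_metrics(y_true, y_pred, classes=CLASSES):
    """Accuracy plus macro-averaged one-vs-rest precision, recall, specificity, F1."""
    def ratio(num, den):
        return num / den if den else 0.0

    pairs = list(zip(y_true, y_pred))
    totals = {"precision": 0.0, "recall": 0.0, "specificity": 0.0, "f1": 0.0}
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        precision, recall = ratio(tp, tp + fp), ratio(tp, tp + fn)
        totals["precision"] += precision
        totals["recall"] += recall
        totals["specificity"] += ratio(tn, tn + fp)
        totals["f1"] += ratio(2 * precision * recall, precision + recall)
    result = {name: value / len(classes) for name, value in totals.items()}
    result["accuracy"] = ratio(sum(t == p for t, p in pairs), len(pairs))
    return result


# Hypothetical toy data: three 2-value feature vectors per image, one per backbone.
# Real backbone features would have hundreds or thousands of values.
TRAIN = [
    (([0.1, 0.2], [0.0, 0.1], [0.2, 0.1]), "normal"),
    (([0.2, 0.1], [0.1, 0.0], [0.1, 0.2]), "normal"),
    (([1.0, 1.1], [0.9, 1.0], [1.1, 1.0]), "mild"),
    (([1.1, 0.9], [1.0, 1.1], [0.9, 1.0]), "mild"),
    (([2.0, 2.1], [1.9, 2.0], [2.1, 2.0]), "severe"),
    (([2.1, 1.9], [2.0, 2.1], [1.9, 2.0]), "severe"),
]
TRAIN_X = [fuse(*vectors) for vectors, _ in TRAIN]
TRAIN_Y = [label for _, label in TRAIN]

query = fuse([1.0, 1.0], [1.0, 1.0], [1.0, 1.0])
print(knn_predict(TRAIN_X, TRAIN_Y, query))
```

Concatenation keeps every backbone's features available to the downstream classifier, at the cost of a longer vector. Other fusion operators, such as averaging or learned weighting, would fit the same interface.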