The impact of pre-processing techniques on deep learning breast image segmentation.
Jéssica Catarino, Nuno Cruz Garcia, Sara Silva, João Santinha
Abstract
Breast cancer is one of the most common forms of cancer worldwide, making breast imaging a critical area for developing and evaluating Deep Learning methods. Pre-processing is a crucial step in the Deep Learning pipeline that directly impacts model performance, yet studies on its role in medical imaging remain limited. In this study, we assess how different pre-processing techniques influence the performance of a U-Net segmentation model applied to two public breast imaging datasets: CBIS-DDSM and Duke-Breast-Cancer-MRI. We systematically explore commonly used methods, including pixel intensity normalization, spacing harmonization, resizing/padding, and orientation standardization. Two processing pipelines were developed: Domain Non-Specific, which integrates standard practices from natural and medical image analysis, and Domain Specific, which preserves anatomical information through careful handling of breast imaging metadata. Each pre-processing technique was then compared in detail to evaluate its impact on model performance. Despite limitations associated with dataset size and scope, our findings identify pre-processing strategies tailored for breast imaging that can improve segmentation accuracy. A 3-way ANOVA F-test ([Formula: see text]) showed significant differences in U-Net segmentation outcomes attributable to the choice of pixel intensity normalization approach. This study represents an initial step in evaluating pre-processing for medical image analysis and provides a foundation for future work.
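To make two of the pre-processing families named above concrete, the following is a minimal sketch of pixel intensity normalization and padding for a single 2D image. It is an illustration only, not the pipelines evaluated in the study: the function names (`min_max_normalize`, `z_score_normalize`, `pad_to_square`), the use of NumPy, the choice of min-max and z-score as the normalization variants, and the symmetric zero padding are all assumptions made for this example.

```python
import numpy as np


def min_max_normalize(image, eps=1e-8):
    """Linearly rescale pixel intensities to the range [0, 1] (hypothetical helper)."""
    image = image.astype(np.float32)
    lo, hi = image.min(), image.max()
    # eps guards against division by zero for constant images.
    return (image - lo) / (hi - lo + eps)


def z_score_normalize(image, eps=1e-8):
    """Shift and scale pixel intensities to zero mean and unit variance (hypothetical helper)."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps)


def pad_to_square(image, fill=0):
    """Pad a 2D image with a constant value so height equals width (hypothetical helper).

    Padding is split as evenly as possible between both sides, so the
    original content stays centred and is not rescaled or distorted.
    """
    h, w = image.shape  # assumes a single-channel 2D array
    size = max(h, w)
    pad_h, pad_w = size - h, size - w
    pad = (
        (pad_h // 2, pad_h - pad_h // 2),
        (pad_w // 2, pad_w - pad_w // 2),
    )
    return np.pad(image, pad, mode="constant", constant_values=fill)
```

Padding, unlike direct resizing, leaves the aspect ratio of the anatomy unchanged, which is one reason it is often considered alongside resizing. Whether either choice helps on a given dataset is an empirical question of the kind the study addresses.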