CPSMI2025: A curated dataset of conventional Pap smear microscopy images for deep learning-based cervical cancer screening.
José Ocampo-López-Escalera, Martha Rosete-Aguilar, Héctor Ochoa-Díaz-López, Xariss M Sánchez-Chino, César A Irecta-Nájera, F Guillermo Domínguez-Gómez, Saúl D Tobar-Alas
Abstract
Cervical cancer remains one of the most common and deadly cancers in low-resource regions, where conventional Pap smear screening, though affordable, faces diagnostic bottlenecks due to limited specialist availability. To support the development of automated screening tools, we present CPSMI2025: a curated dataset of 2169 high-resolution Pap smear microscopy images representing nine clinically relevant cytology categories, derived from more than 350 manually screened slides obtained from Hospital General de Zona No 2 (IMSS) and a private pathology practice in Tuxtla Gutiérrez, Chiapas, Mexico. All slides were pre-classified by both a pathologist and a cytotechnologist. Images were captured using a replicable, low-cost, open-source microscopy platform.