npj Digital Medicine
Navigating the tradeoff between personal privacy and data utility in speech anonymization for clinical research.
Catherine Diaz-Asper, Lars Ailo Bongo, Brita Elvevåg
Published: 2025
DOI: 10.1038/s41746-025-01987-3
Open Access
Abstract
Speech data inherently contains personally identifiable information. Anonymization strategies to obscure this while preserving essential characteristics all represent a tradeoff between privacy and utility. We examine this balancing act of modifying voice characteristics, masking identity, and eliminating identifiable content by showcasing challenges with the common techniques (generalization, suppression, anatomization, permutation, and perturbation) in the context of preserving utility for individual-level speech data analyses in clinical research.