Speech Anonymization

With the growing use of voice data in areas like virtual assistants and remote healthcare, speech anonymization has become essential for protecting speaker privacy. The goal is to mask personal identity in spoken audio while still keeping the message clear and useful for downstream tasks like transcription or emotion detection.

Our research explores new techniques for anonymizing speech that go beyond traditional methods. These approaches aim not only to hide identity but also to preserve important elements of speech such as emotion, prosody (rhythm and tone), and unique vocal traits, especially in voices affected by age or medical conditions. To achieve this, we combine digital signal processing techniques with deep learning models, allowing precise control over audio transformations while adapting to the complexity of real-world speech. We also use perception-inspired loss functions that guide training based on how humans naturally perceive differences in voice, helping improve quality, clarity, and emotional expressiveness.
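To make the idea of a perception-inspired loss more concrete, here is a minimal sketch of one common example: a distance between log-mel spectrograms, which compares two signals on a frequency scale that roughly follows human pitch perception. This is an illustration only and not necessarily the loss used in our work; the function names (`mel_filterbank`, `log_mel`, `log_mel_loss`) and all parameter values are assumptions chosen for the example. It uses plain NumPy for readability, whereas model training would need the same computation in a differentiable framework.

```python
import numpy as np


def hz_to_mel(f):
    # Common (HTK-style) mel-scale formula.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        # Triangle: the smaller of the two slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def log_mel(signal, sample_rate, n_fft=512, hop=128, n_mels=40):
    """Log-mel spectrogram of a 1-D signal (illustrative parameter values)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    # Log compression loosely mimics the ear's loudness response.
    return np.log(mel + 1e-6)


def log_mel_loss(reference, estimate, sample_rate):
    """Mean absolute difference between two log-mel spectrograms."""
    diff = log_mel(reference, sample_rate) - log_mel(estimate, sample_rate)
    return float(np.mean(np.abs(diff)))
```

With this sketch, two identical signals give a loss of zero, and the loss grows as the signals diverge in the frequency bands that matter most to a listener.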

Through extensive evaluation and user studies, we show that our methods significantly improve both privacy protection and speech quality across a wide range of voices, languages, and use cases, without compromising intelligibility. This work opens the door to more secure and ethical applications of voice technology in healthcare, accessibility, and everyday digital interactions.

Selected Publications

For more information or demo access, contact: Suhita Ghosh

Last Modification: 11.04.2025 - Contact Person: Webmaster