DDPO-VC: Speaker De-Identification via Diffusion Denoising Policy Optimization

James Glass

Abstract

A key challenge in speaker de-identification is balancing privacy and utility. Many utility variables, such as the speaker's cognitive health status, are correlated with the privacy variable, such as the speaker's identity. This violates the independence assumption held by disentanglement-based approaches, causing both leakage of private information and loss of information useful for downstream tasks. To tackle this challenge, we propose DDPO-VC, a general framework for speaker de-identification through reinforcement learning-based post-training with diffusion models. Learning from reward signals that combine knowledge from privacy-focused and utility-focused teachers, our method outperforms various strong methods in both privacy preservation and cognitive utility on two commonly used dementia speech benchmarks. Our code is available at https://github.com/cactuswiththoughts/DDPO-VC and a demo at https://cactuswiththoughts.github.io/SpeakerDeID-Demo/.
