Causal Fuzzing for Verifying Machine Unlearning
Abstract
As machine learning models become increasingly embedded in decision-making systems, the ability to "unlearn" targeted data or features is crucial for enhancing model adaptability, fairness, and privacy in models which involves expensive training. To effectively guide machine unlearning, a thorough testing is essential. Existing methods for verification of machine unlearning provide limited insights, often failing in scenarios where the influence is indirect. In this work, we propose CAF\'E, a new causality based framework that unifies datapoint- and feature-level unlearning for verification of black-box ML models. CAF\'E evaluates both direct and indirect effects of unlearning targets through causal dependencies, providing actionable insights with fine-grained analysis. Our evaluation across five datasets and three model architectures demonstrates that CAF\'E successfully detects residual influence missed by baselines while maintaining computational efficiency.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.