FIT to Forget: Robust Continual Unlearning for Large Language Models

Haibo Hu

Abstract

While large language models (LLMs) exhibit remarkable capabilities, they increasingly face demands to unlearn memorized privacy-sensitive, copyrighted, or harmful content. Existing unlearning methods primarily focus on single-shot scenarios, whereas real-world deletion requests arrive continually. Naïvely applying these methods to sequential requests leads to severe utility degradation and catastrophic forgetting. To address this, we propose FIT, a robust continual unlearning framework that processes high-volume sequential deletion streams while resisting both catastrophic forgetting and post-unlearning recovery. FIT stabilizes sequential updates through three synergistic mechanisms: redundancy Filtering, Importance-aware adaptive algorithm selection, and Targeted layer attribution. Furthermore, to facilitate rigorous evaluation, we introduce PCH, a unified benchmark encompassing Personal, Copyrighted, and Harmful content, alongside two symmetric metrics, Forget Degree (F.D.) and Retain Utility (R.U.), to systematically quantify forgetting-utility trade-offs. Extensive experiments across five LLMs (up to 14B parameters) demonstrate that FIT consistently achieves state-of-the-art unlearning efficacy and utility preservation. Notably, even after hundreds of sequential requests, FIT preserves strong downstream performance (e.g., GSM8K, MMLU) and exhibits superior resilience against relearning and quantization recovery attacks.
