Stay Unique, Stay Efficient: Preserving Model Personality in Multi-Task Merging
Abstract
Model merging has emerged as a promising paradigm for enabling multi-task capabilities without additional training. However, basic merging methods often suffer performance degradation due to parameter conflicts, even when applied to similar tasks. While recent personalized merging frameworks preserve task-specific information to maintain performance, they typically incur storage overhead. In this paper, we propose Decomposition, Thresholding, and Scaling (DTS), an approximation-based personalized merging framework that improves the storage efficiency of task-specific information. DTS first applies singular value decomposition to the task-specific information and retains only a small subset of singular values and vectors. It then introduces a novel thresholding strategy that partitions singular vector elements into groups and assigns a scaling factor to each group. To enable generalization to unseen tasks, we further extend DTS with a variant that fuses task-specific information in a data-free manner based on the semantic similarity of task characteristics. Extensive experiments demonstrate that DTS consistently outperforms state-of-the-art baselines while requiring only 1% extra storage per task. Furthermore, experiments on unseen tasks show that the DTS variant achieves significantly better generalization performance. Our code is available at https://github.com/krumpguo/DTS.
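The decomposition, thresholding, and scaling steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how elements are grouped or how scaling factors are chosen, so the sketch assumes magnitude-quantile groups and uses the mean absolute value of each group as its scaling factor. The names `dts_compress`, `dts_reconstruct`, `k`, and `num_groups` are hypothetical.

```python
import numpy as np

def quantize_vector(vec, num_groups):
    """Assumed scheme: split elements into magnitude-quantile groups and
    keep only each element's sign, its group id, and one scale per group."""
    mags = np.abs(vec)
    edges = np.quantile(mags, np.linspace(0.0, 1.0, num_groups + 1))
    group_ids = np.searchsorted(edges, mags, side="right") - 1
    group_ids = np.clip(group_ids, 0, num_groups - 1)
    scales = np.array([
        mags[group_ids == g].mean() if np.any(group_ids == g) else 0.0
        for g in range(num_groups)
    ])
    return np.sign(vec), group_ids, scales

def dequantize_vector(signs, group_ids, scales):
    """Rebuild an approximate vector from signs and per-group scales."""
    return signs * scales[group_ids]

def dts_compress(delta, k, num_groups=2):
    """Decompose a task-specific matrix, keep the top-k singular triplets,
    then threshold and scale each retained singular vector."""
    U, s, Vt = np.linalg.svd(delta, full_matrices=False)
    left = [quantize_vector(U[:, i], num_groups) for i in range(k)]
    right = [quantize_vector(Vt[i, :], num_groups) for i in range(k)]
    return {"s": s[:k], "left": left, "right": right}

def dts_reconstruct(comp):
    """Approximate the task-specific matrix from its compressed form."""
    U_hat = np.stack([dequantize_vector(*q) for q in comp["left"]], axis=1)
    Vt_hat = np.stack([dequantize_vector(*q) for q in comp["right"]], axis=0)
    return (U_hat * comp["s"]) @ Vt_hat

# Toy "task-specific information": random values standing in for the
# difference between fine-tuned and pretrained weights (an assumption).
rng = np.random.default_rng(0)
delta = rng.standard_normal((64, 48))
comp = dts_compress(delta, k=4, num_groups=2)
approx = dts_reconstruct(comp)
```

In this sketch each retained singular vector is stored as signs, small integer group ids, and `num_groups` floating-point scales, which is where the storage saving would come from. The merged model for a task would then add `approx` back onto the shared merged weights; that step is omitted here because the abstract does not describe it.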