Decouple then Converge: Handling Unknown Unlabeled Distributions in Long-Tailed Semi-Supervised Learning

Abstract

While long-tailed semi-supervised learning (LTSSL) has attracted growing attention in many real-world classification tasks, existing LTSSL algorithms typically assume that labeled and unlabeled data share nearly identical class distributions. When this assumption is violated, these methods can perform poorly because they rely on biased model-generated pseudo-labels. To address this issue, we propose a simple yet effective approach called DeCon for LTSSL with unknown unlabeled class distributions. Specifically, DeCon decouples learning into two specialized branches: a standard branch that focuses on head classes and a balanced branch that focuses on tail classes. During training, the two branches interact and gradually converge, allowing them to complement each other and ultimately achieve strong performance across all classes. Despite its simplicity, we show that DeCon achieves state-of-the-art performance on a variety of standard LTSSL benchmarks, e.g., an average absolute gain of 2.7% in test accuracy over existing algorithms when the class distributions of labeled and unlabeled data are mismatched. Even when the class distributions are identical, DeCon consistently outperforms many sophisticated LTSSL algorithms. Furthermore, we conduct extensive ablation analyses to identify the factors that matter most to the success of DeCon. The source code is available at https://github.com/Gank0078/DeCon.
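To make the head/tail split between the two branches concrete, the following is a minimal sketch and not the DeCon implementation. The abstract does not specify either branch's loss, so the sketch assumes plain cross-entropy for the standard branch and swaps in logit-adjusted cross-entropy as a stand-in for the balanced branch. The function names, `class_counts`, `tau`, and the example numbers are all hypothetical, and the interaction that makes the branches converge is not modeled.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def standard_loss(logits, label):
    """Plain cross-entropy (assumed loss for the 'standard' branch)."""
    return -log_softmax(logits)[label]


def balanced_loss(logits, label, class_counts, tau=1.0):
    """Logit-adjusted cross-entropy (assumed stand-in for the 'balanced' branch).

    Adding tau * log(prior) to each logit makes tail-class errors cost more
    during training, which pushes the branch toward tail classes.
    """
    total = sum(class_counts)
    adjusted = [z + tau * math.log(c / total)
                for z, c in zip(logits, class_counts)]
    return -log_softmax(adjusted)[label]


# Hypothetical 3-class long-tailed setup: class 0 is head, class 2 is tail.
class_counts = [1000, 100, 10]
logits = [2.0, 1.0, 0.5]

std_tail = standard_loss(logits, 2)
bal_tail = balanced_loss(logits, 2, class_counts)
std_head = standard_loss(logits, 0)
bal_head = balanced_loss(logits, 0, class_counts)

print(f"tail sample: standard={std_tail:.3f}, balanced={bal_tail:.3f}")
print(f"head sample: standard={std_head:.3f}, balanced={bal_head:.3f}")
```

Under these assumptions, the balanced loss penalizes the same logits more heavily on a tail-class sample and less heavily on a head-class sample than the standard loss does, which is one way two branches sharing the same features could specialize in different parts of the class distribution.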
