iMacHSR: Intermediate Multi-Access Heterogeneous Supervision and Regularization Scheme Toward Architecture-Agnostic Training

Abstract

While deep supervision is a powerful training strategy by supervising intermediate layers with auxiliary losses, it faces three underexplored problems: (I) Existing deep supervision techniques are generally bond with specific model architectures strictly, lacking generality. (II) The identical loss function for intermediate and output layers causes intermediate layers to prioritize output-specific features prematurely, limiting generalizable representations. (III) Lacking regularization on hidden activations risks overconfident predictions, reducing generalization to unseen scenarios. To tackle these challenges, we propose an architecture-agnostic, intermediate Multi-access Heterogeneous Supervision and Regularization (iMacHSR) scheme. Specifically, the proposed iMacHSR introduces below integral strategies: (I) we select multiple intermediate layers based on predefined architecture-agnostic standards; (II) loss functions (different from output-layer loss) are applied to those selected intermediate layers, which can guide intermediate layers to learn diverse and hierarchical representations; and (III) negative entropy regularization on selected layers' hidden features discourages overconfident predictions and mitigates overfitting. These intermediate terms are combined into the output-layer training loss to form a unified optimization objective, enabling comprehensive optimization across the network hierarchy. We then take the semantic understanding task as an example to assess iMacHSR and apply iMacHSR to several model architectures. Extensive experiments on multiple datasets demonstrate that iMacHSR outperforms conventional output-layer single-point supervision method up to 9.19% in mIoU.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…