The Matching Principle: A Geometric Theory of Loss Functions for Nuisance-Robust Representation Learning

Vishal Rajput

Abstract

Robustness, domain adaptation, photometric/occlusion invariance, sensor drift, and alignment style are treated as separate literatures with separate method families. Under label-preserving deployment shift they share one geometric object: the covariance $\Sigma_{\text{task}} = \mathrm{Cov}_{Q_n}(n)$ of the ways inputs can change without changing the label. CORAL, adversarial training, augmentation, metric learning, Jacobian penalties, and alignment constraints are not independent tricks; they are estimators of $\Sigma_{\text{task}}$. Once that object is fixed, the Jacobian penalty is pinned by a matrix $\Sigma'$ whose range must cover $\mathrm{range}(\Sigma_{\text{task}})$. This is the matching principle. We prove optimality in a linear-Gaussian model (Thm. A), necessity of range coverage for any quadratic penalty that zeros deployment drift (Thm. G), and the same dichotomy at global minima (Thm. A*global). Wrong-direction and signal-aligned controls (Lemma C; Cor. E/E*) and seven estimators (Lemmas D1–D7), plus label-free TDI, yield a falsifiable recipe for the case where $\Sigma_{\text{task}}$ must be learned. Thirteen blocks (ML through Qwen2.5-7B) test matched vs. isotropic vs. wrong-direction penalties on geometry and deployment drift. Twelve match theory where identifiability holds; Office-31 is a named eigengap failure. Some passes are partial: geometry can improve without every headline task metric moving. In a pilot 7B DPO run (one epoch, 240 pairs), matched style-PMH preserves Style TDI where standard DPO degrades it. We do not claim that standard training reaches global minima (assumption (O) is open), that estimated $\Sigma_{\text{task}}$ is always identifiable, or dominance on every leaderboard. We claim a falsifiable design recipe: estimate $\Sigma_{\text{task}}$, match $\Sigma'$, run the controls, and report task and geometry separately.
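
The recipe in the abstract (estimate the nuisance covariance, match the penalty matrix, run the controls) can be illustrated with a toy linear model. The sketch below is not the paper's code or experimental setup; every function name, the choice of nuisance coordinates, the spurious correlation in the training labels, and the penalty weight `lam` are assumptions made for illustration. For a linear predictor `w`, the Jacobian is `w` itself, so a quadratic Jacobian penalty reduces to `w^T Sigma' w` and the drift under nuisance to `w^T Sigma_task w`.

```python
import numpy as np

# Toy illustration only; all names and constants here are hypothetical.

def estimate_sigma_task(nuisance_samples):
    """Sample covariance of label-preserving perturbations, shape (d, d)."""
    centered = nuisance_samples - nuisance_samples.mean(axis=0, keepdims=True)
    return centered.T @ centered / (len(nuisance_samples) - 1)

def range_basis(sigma, tol=1e-8):
    """Orthonormal basis for range(sigma), assuming sigma is symmetric PSD."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return eigvecs[:, eigvals > tol * eigvals.max()]

def covers_range(sigma_prime, sigma_task, tol=1e-6):
    """True if range(sigma_task) lies inside range(sigma_prime)."""
    basis_prime = range_basis(sigma_prime)
    basis_task = range_basis(sigma_task)
    # Part of the task basis left over after projecting onto range(sigma_prime).
    residual = basis_task - basis_prime @ (basis_prime.T @ basis_task)
    return bool(np.linalg.norm(residual) < tol)

def fit_penalized(X, y, sigma_prime, lam):
    """Closed-form minimizer of mean squared error + lam * w^T sigma_prime w."""
    n = len(X)
    return np.linalg.solve(X.T @ X / n + lam * sigma_prime, X.T @ y / n)

def drift(w, sigma_task):
    """Expected squared output change under nuisance: w^T sigma_task w."""
    return float(w @ sigma_task @ w)

rng = np.random.default_rng(0)
d = 5

# Assumed setup: label-preserving changes live in coordinates 3 and 4.
nuisance_basis = np.eye(d)[:, [3, 4]]
nuisance = rng.normal(size=(500, 2)) @ nuisance_basis.T
sigma_task = estimate_sigma_task(nuisance)

# Training data in which the label spuriously leaks through coordinate 3.
X = rng.normal(size=(400, d))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.normal(size=400)

penalties = {
    "matched": sigma_task,
    "isotropic": np.eye(d),
    "wrong_direction": np.outer(np.eye(d)[2], np.eye(d)[2]),
}

results = {}
for name, sigma_prime in penalties.items():
    w = fit_penalized(X, y, sigma_prime, lam=100.0)
    results[name] = {
        "covers": covers_range(sigma_prime, sigma_task),
        "drift": drift(w, sigma_task),
        "signal_weight": float(w[0]),
    }
    print(name, results[name])
```

In this toy setting the matched and isotropic penalties both cover the nuisance range and suppress drift, but the isotropic one also shrinks the weight on the signal coordinate, while the wrong-direction penalty fails the range check and leaves drift largely intact. This mirrors the matched vs. isotropic vs. wrong-direction comparison only qualitatively and says nothing about the nonlinear results reported in the paper.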
