Is Spurious Correlation Removal Always Learnable?

Abstract

Invariant learning can fail even when the invariant structure is statistically identifiable. We show a conditional computational barrier: under a black-box samplable supervised sparse recovery primitive motivated by average-case sparse-recovery reductions, there exist samplable multi-environment instances with a one-dimensional predictive invariant subspace ($k=1$) that are learnable from polynomially many samples by exhaustive search, while any polynomial-time constant-accuracy recovery algorithm would contradict the primitive. We further quantify environment diversity by a separation parameter $\gamma$, which controls identifiability and the curvature of invariance objectives. Under sufficient diversity and local Gaussian regularity, the minimax risk is $\mathbb{E}[\mathrm{dist}(\hat{V}, V_{\mathrm{inv}})^2] = \Theta\big(k(d-k)/(n|\mathcal{E}|)\big)$, and under label-induced shifts a phase transition occurs at $n^\star \asymp k(d-k)/(|\mathcal{E}|\gamma^2)$, with the refined estimation error scaling as $1/\gamma^2$. Synthetic and real datasets illustrate the predicted gaps and transitions and motivate simple diversity diagnostics.
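To get a feel for the two scalings stated in the abstract, the following minimal sketch evaluates them numerically. It is an illustration only: the abstract gives rates up to unspecified constants, so both constants are set to 1 here as an assumption, and the function names `minimax_rate_order` and `transition_sample_size_order` are hypothetical, not taken from the paper.

```python
def minimax_rate_order(k: int, d: int, n: int, num_envs: int) -> float:
    """Order of the minimax risk, k(d-k)/(n|E|).

    Assumption: the hidden constant in Theta(.) is set to 1.
    """
    return k * (d - k) / (n * num_envs)


def transition_sample_size_order(k: int, d: int, num_envs: int, gamma: float) -> float:
    """Order of the phase-transition sample size, k(d-k)/(|E| gamma^2).

    Assumption: the hidden constant is set to 1.
    """
    return k * (d - k) / (num_envs * gamma ** 2)


if __name__ == "__main__":
    # Illustrative values only; none of these come from the paper.
    k, d, num_envs = 1, 101, 5
    print(minimax_rate_order(k, d, n=100, num_envs=num_envs))
    for gamma in (0.5, 0.25):
        print(gamma, transition_sample_size_order(k, d, num_envs, gamma))
```

Under these assumed constants, halving $\gamma$ multiplies the transition sample size by four, which is the $1/\gamma^2$ dependence the abstract describes.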
