Black-Box Assisted Regression: Phase Transitions and Minimax Optimality

Abstract

Foundation models are often used as fixed black-box predictors for downstream tasks with limited labeled data, but their predictions may be biased and unsafe to trust blindly. We study this setting through black-box assisted nonparametric regression: a learner observes labeled samples and can query a fixed predictor $f_0$, while the target $f^*$ is close to $f_0$ in $L_2(P_X)$ up to an unknown radius $\delta$. We give a finite-sample minimax characterization showing a phase transition at $\delta_c(n) \asymp n^{-\beta/(2\beta+d)}$, with leading risk $\min\{\delta^2, n^{-2\beta/(2\beta+d)}\}$. We then analyze a Safe Residual Estimator: it learns a correction around $f_0$, initializes the residual head at zero so the initial predictor equals $f_0$, and uses holdout selection to revert to $f_0$ when the learned correction is not supported by validation data. Here, "safe" means avoiding negative transfer, i.e., performing worse than the black-box predictor alone. The estimator matches the leading minimax term up to an additive validation-selection cost. Synthetic regression experiments verify the predicted phase transition, while CIFAR-100 with CLIP and AG News with Qwen3-8B provide practice-facing evidence that the same residual-correction tradeoff is useful beyond the formal squared-loss regression setting.
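
The residual-correction and holdout-selection idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names `fit_residual_ridge` and `safe_residual_estimator`, the polynomial ridge model standing in for the residual head, and the 1-D synthetic data are all assumptions made for illustration. The ridge penalty shrinks the correction toward zero, which loosely mirrors starting from the black-box predictor.

```python
import numpy as np

def fit_residual_ridge(x, r, degree=5, lam=1e-3):
    """Fit a polynomial ridge regression to residuals r = y - f0(x).

    Hypothetical stand-in for the learned residual head; the penalty
    pulls the correction toward zero, i.e. toward predicting f0 alone.
    """
    Phi = np.vander(x, degree + 1, increasing=True)
    A = Phi.T @ Phi + lam * np.eye(degree + 1)
    w = np.linalg.solve(A, Phi.T @ r)
    return lambda t: np.vander(t, degree + 1, increasing=True) @ w

def safe_residual_estimator(f0, x_train, y_train, x_val, y_val):
    """Return (predictor, label): f0 + correction if validation supports it, else f0."""
    g = fit_residual_ridge(x_train, y_train - f0(x_train))
    corrected = lambda t: f0(t) + g(t)
    mse_f0 = np.mean((y_val - f0(x_val)) ** 2)
    mse_corrected = np.mean((y_val - corrected(x_val)) ** 2)
    # Holdout selection: keep the correction only if it beats f0 on held-out data.
    if mse_corrected < mse_f0:
        return corrected, "corrected"
    return f0, "f0"

# Illustrative usage on synthetic data (all values below are assumptions).
rng = np.random.default_rng(0)
f0 = lambda t: np.sin(2 * np.pi * t)      # fixed black-box predictor
f_star = lambda t: f0(t) + 0.5 * t        # target: f0 plus a smooth bias
x_tr, x_va = rng.uniform(0, 1, 200), rng.uniform(0, 1, 100)
y_tr = f_star(x_tr) + 0.1 * rng.standard_normal(200)
y_va = f_star(x_va) + 0.1 * rng.standard_normal(100)

predictor, choice = safe_residual_estimator(f0, x_tr, y_tr, x_va, y_va)
print("selected:", choice)
```

By construction, the returned predictor never has a larger validation error than $f_0$, which is the sense of "safe" this sketch captures. The additive validation-selection cost mentioned in the abstract corresponds to the noise in that held-out comparison.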
