Exact Finite-Sample Variance Decomposition of Subagging: A Spectral Filtering Perspective
Abstract
Standard resampling ratios (e.g., α ≈ 0.632) have been widely used as default baselines in ensemble learning for three decades. However, how these ratios interact with a base learner's intrinsic functional complexity in finite samples lacks an exact mathematical characterization. We leverage the Hoeffding-ANOVA decomposition to derive the first exact, finite-sample variance decomposition for subagging, applicable to any symmetric base learner without requiring asymptotic limits or smoothness assumptions. We establish that subagging operates as a deterministic low-pass spectral filter: it preserves low-order structural signals while attenuating c-th order interaction variance by a geometric factor approaching α^c. This decoupling reveals why default baselines often under-regularize high-capacity interpolators, which instead require smaller α to exponentially suppress spurious high-order noise. To operationalize these insights, we propose a complexity-guided adaptive subsampling algorithm, empirically demonstrating that dynamically calibrating α to the learner's complexity spectrum consistently improves generalization over static baselines.
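As a rough illustration of how a finite-sample factor can approach α^c, the sketch below computes the probability that c fixed training points all fall into one subsample of size k drawn without replacement from n points, and compares it with (k/n)^c. This is a standard combinatorial quantity used here only as an assumed stand-in; the abstract does not state the exact form of the paper's attenuation factor, and the function names `inclusion_probability` and `attenuation_table` are hypothetical.

```python
from math import comb


def inclusion_probability(n: int, k: int, c: int) -> float:
    """Probability that c fixed points all appear in a size-k subsample
    drawn without replacement from n points.

    Assumption: used as an illustrative finite-sample analogue of alpha**c,
    not as the paper's stated attenuation factor.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if c < 0:
        raise ValueError("c must be non-negative")
    if c > k:
        return 0.0  # a subsample of size k cannot contain more than k points
    # Equals k(k-1)...(k-c+1) / (n(n-1)...(n-c+1)).
    return comb(n - c, k - c) / comb(n, k)


def attenuation_table(n: int, alpha: float, max_order: int):
    """Rows of (order c, finite-sample probability, limiting value (k/n)**c)."""
    k = round(alpha * n)
    return [
        (c, inclusion_probability(n, k, c), (k / n) ** c)
        for c in range(1, max_order + 1)
    ]


if __name__ == "__main__":
    # Compare the default-style ratio with a smaller one at n = 1000.
    for alpha in (0.632, 0.2):
        print(f"alpha = {alpha}")
        for c, finite, limit in attenuation_table(1000, alpha, 5):
            print(f"  c={c}: finite-sample={finite:.6f}  alpha^c={limit:.6f}")
```

Running the script prints both columns side by side for two values of α, which makes the qualitative point of the abstract visible: higher-order terms shrink geometrically, and a smaller α shrinks them much faster.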