Order-Explicit Linearization of High-Dimensional U-Statistics

Abstract

We give an order-explicit large deviation bound for the difference between a high-dimensional U-statistic and its Hájek projection. In particular, we show that any U-statistic of order b on n observations, with a d-dimensional kernel whose coordinates have ψ_1-Orlicz norm at most ϕ, has a maximum deviation from its Hájek projection of order O_p(ϕ b n^{-1} log^2(dn)). The proof relies on the development of novel order-explicit moment inequalities for higher-order Hoeffding components. We show that this rate is unimprovable, up to the polynomial factor on the logarithmic term. As corollaries, we obtain new Bernstein-type concentration and Gaussian approximation results for high-dimensional U-statistics. We apply these results to establish the consistency of a set of resampling-based simultaneous confidence intervals built around a class of nonparametric regression estimators constructed with subsampled kernels. This class encompasses several forms of random forest regression, including Generalized Random Forests.
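To make the objects in the abstract concrete, here is a minimal numerical sketch, not the paper's construction. It assumes a toy setting chosen purely for illustration: order b = 2, the variance kernel h(x, y) = (x - y)^2 / 2, standard normal data, and d independent coordinates. Under those assumptions θ = 1 and the first-order projection is h1(x) = (x^2 - 1) / 2. The function names (`u_statistic`, `hajek_projection`, `max_deviation`) are hypothetical helpers.

```python
import itertools
import random


def u_statistic(xs, kernel, b):
    """Average of a symmetric kernel over all size-b subsets of xs."""
    vals = [kernel(*combo) for combo in itertools.combinations(xs, b)]
    return sum(vals) / len(vals)


def hajek_projection(xs, theta, h1, b):
    """theta + (b / n) * sum_i h1(X_i), the linear part of the U-statistic."""
    n = len(xs)
    return theta + (b / n) * sum(h1(x) for x in xs)


# Assumed toy kernel: h(x, y) = (x - y)^2 / 2, whose U-statistic is the
# unbiased sample variance. For X ~ N(0, 1): theta = 1, h1(x) = (x^2 - 1) / 2.
def var_kernel(x, y):
    return 0.5 * (x - y) ** 2


def var_h1(x):
    return 0.5 * (x * x - 1.0)


def max_deviation(n, d, seed=0):
    """Max over d coordinates of |U-statistic - Hajek projection|.

    Coordinates are simulated independently here, which is an
    illustrative simplification rather than a requirement.
    """
    rng = random.Random(seed)
    deviations = []
    for _ in range(d):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        u = u_statistic(xs, var_kernel, 2)
        proj = hajek_projection(xs, 1.0, var_h1, 2)
        deviations.append(abs(u - proj))
    return max(deviations)


if __name__ == "__main__":
    for n in (50, 100, 200):
        # n * deviation is printed so the decay in n is easy to eyeball.
        print(n, n * max_deviation(n, d=5))
```

For this particular kernel the gap between the U-statistic and its projection shrinks roughly like 1/n, so the printed values of n times the deviation should stay of comparable size as n grows. The sketch does not attempt to reproduce the dependence on b, ϕ, or the logarithmic factor.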
