Bernstein-Schur Kernels: Random Features by Sketched Modulation and Radial Randomization

Abstract

Bernstein--Schur kernels are products of a finite-feature kernel and a completely monotone shift-invariant kernel. They are nonstationary kernels that fall between the shift-invariant and dot-product templates that random features exploit, so neither Bochner sampling nor polynomial sketching applies to the full kernel directly. We give one random-feature construction for the whole class that randomizes both factors: it sketches the finite modulation and samples the radial factor's one-dimensional Bernstein--Widder scale before applying Gaussian random Fourier features, giving feature dimension $Dm$, free of the $O(d^2)$ size of the exact modulation feature. With the modulation kept exact (the $m \to \infty$ limit), we prove unbiasedness, an exact variance, and a matrix-Bernstein operator-norm bound controlled by the top kernel and modulation eigenvalues and an intrinsic dimension rather than the crude Nij route. Whitening this argument at the ridge makes the effective dimension $d_{\mathrm{eff}}(\lambda)$ the exact intrinsic dimension of the matrix variance, so $O((1+\|P\|_{\mathrm{op}}/\lambda)(d_{\mathrm{eff}}/\delta))$ radial draws preserve the kernel-ridge solution; tilting the draw by a closed-form whitened leverage improves this to the effective-dimension count $O((1+d_{\mathrm{eff}})(d_{\mathrm{eff}}/\delta))$. Conditioning on the sketch carries every guarantee to the deployed doubly-randomized estimator up to one additive sketch term, and all hold for the whole class with the modulation Gram in place of the polynomial one. The flagship instance is the biased yat-kernel $k_{\mathrm{yat},b}(w,x)=(w^\top x+b)^2/(\|w-x\|^2+\varepsilon)$, whose family span contains the inverse-multiquadric kernel by finite differences in $b$.
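
As a rough illustration of the two-factor randomization described above, the following sketch builds doubly-randomized features for the flagship yat-kernel. It is not taken from the paper and rests on several assumptions: the positive offset in the denominator is written `eps` (a name chosen here), the bias satisfies `b >= 0` so that it can be absorbed by appending `sqrt(b)` to the input, and the modulation sketch is a simple product of two Gaussian projections, which may differ from the sketch the authors use. The radial step relies on the identity $1/(r^2+\mathrm{eps}) = \int_0^\infty e^{-s(r^2+\mathrm{eps})}\,ds$, so the scale $s$ can be drawn from an exponential distribution with rate `eps` and followed by Gaussian random Fourier features with frequencies of variance $2s$. The function name `make_features` and all parameter names are hypothetical.

```python
import numpy as np

def make_features(d, m, D, b, eps, rng):
    """Illustrative features for k(w, x) = (w.x + b)^2 / (||w - x||^2 + eps).

    Assumes b >= 0 and eps > 0. Returns three maps: an m-dimensional
    modulation sketch, a D-dimensional radial feature, and their
    D*m-dimensional tensor product.
    """
    # Modulation sketch (assumed form): with independent Gaussian g1, g2,
    # E[(g1.a)(g2.a)(g1.c)(g2.c)] = (a.c)^2, applied to inputs augmented by sqrt(b).
    G1 = rng.standard_normal((m, d + 1))
    G2 = rng.standard_normal((m, d + 1))

    # Radial factor: draw the scale s ~ Exp(rate=eps), then frequencies ~ N(0, 2s I),
    # so that E[cos(omega.(w - x)) | s] = exp(-s ||w - x||^2).
    s = rng.exponential(scale=1.0 / eps, size=D)
    W = rng.standard_normal((D, d)) * np.sqrt(2.0 * s)[:, None]
    u = rng.uniform(0.0, 2.0 * np.pi, size=D)

    def modulation(x):
        xt = np.append(x, np.sqrt(b))
        return (G1 @ xt) * (G2 @ xt) / np.sqrt(m)

    def radial(x):
        # The 1/eps factor normalizes the exponential mixing density.
        return np.sqrt(2.0 / (eps * D)) * np.cos(W @ x + u)

    def joint(x):
        # Explicit feature of dimension D*m.
        return np.outer(modulation(x), radial(x)).ravel()

    return modulation, radial, joint

rng = np.random.default_rng(0)
w = np.array([0.5, -0.2, 0.1])
x = np.array([0.3, 0.4, -0.1])
b, eps = 1.0, 1.0
exact = (w @ x + b) ** 2 / (np.sum((w - x) ** 2) + eps)

# Large m and D: compare through the factor inner products,
# without materializing the D*m-dimensional feature.
mod, rad, _ = make_features(3, 20000, 20000, b, eps, rng)
approx = (mod(w) @ mod(x)) * (rad(w) @ rad(x))
print(f"exact={exact:.4f}  approx={approx:.4f}")

# Small m and D: the explicit tensor feature reproduces the same product.
mod_s, rad_s, joint_s = make_features(3, 8, 16, b, eps, rng)
```

Because the inner product of two tensor-product features factors into the product of the modulation and radial inner products, the large-scale comparison never has to form the full $Dm$-dimensional vector; the explicit `joint` map is included only to show what a deployed feature of that dimension would look like under these assumptions.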

