Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?
Abstract
Low-rank approximation is a common tool used to accelerate kernel methods: the n × n kernel matrix K is approximated via a rank-k matrix K which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative error k-rank approximation to K is at least as difficult as multiplying the input data matrix A ∈ Rn × d by an arbitrary matrix C ∈ Rd × k. Barring a breakthrough in fast matrix multiplication, when k is not too large, this requires (nnz(A)k) time where nnz(A) is the number of non-zeros in A. This lower bound matches, in many parameter regimes, recent work on subquadratic time algorithms for low-rank approximation of general kernels [MM16,MW17], demonstrating that these algorithms are unlikely to be significantly improved, in particular to O(nnz(A)) input sparsity runtimes. At the same time there is hope: we show for the first time that O(nnz(A)) time approximation is possible for general radial basis function kernels (e.g., the Gaussian kernel) for the closely related problem of low-rank approximation of the kernelized dataset.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.