Almost Optimal Tensor Sketch
Abstract
We construct a matrix $M\in\mathbb{R}^{m\times d^c}$ with just $m=O(c\,\lambda\,\varepsilon^{-2}\,\mathrm{poly}\log 1/\varepsilon\delta)$ rows, which preserves the norm $\|Mx\|_2=(1\pm\varepsilon)\|x\|_2$ of all $x$ in any given $\lambda$-dimensional subspace of $\mathbb{R}^d$ with probability at least $1-\delta$. This matrix can be applied to tensors $x^{(1)}\otimes\dots\otimes x^{(c)}\in\mathbb{R}^{d^c}$ in $O(c\,m\min\{d,m\})$ time, hence the name "Tensor Sketch". (Here $x\otimes y=\mathrm{asvec}(xy^T)=[x_1y_1, x_1y_2,\dots,x_1y_m,x_2y_1,\dots,x_ny_m]\in\mathbb{R}^{nm}$.) This improves upon earlier Tensor Sketch constructions by Pagh and Pham [TOCT 2013, SIGKDD 2013] and Avron et al. [NIPS 2014], which require $m=\Omega(3^c\lambda^2\delta^{-1})$ rows for the same guarantees. The factors of $\lambda$, $\varepsilon^{-2}$ and $\log 1/\delta$ can all be shown to be necessary, making our sketch optimal up to log factors. With another construction we get $\lambda$ times more rows, $m=\tilde O(c\,\lambda^2\,\varepsilon^{-2}(\log 1/\delta)^3)$, but the matrix can be applied to any vector $x^{(1)}\otimes\dots\otimes x^{(c)}\in\mathbb{R}^{d^c}$ in just $\tilde O(c\,(d+m))$ time. This matches the application time of Tensor Sketch while still improving the exponential dependencies in $c$ and $\log 1/\delta$. Technically, we show two main lemmas: (1) For many Johnson-Lindenstrauss (JL) constructions, if $Q,Q'\in\mathbb{R}^{m\times d}$ are independent JL matrices, the element-wise product $Qx\circ Q'y$ equals $M(x\otimes y)$ for some $M\in\mathbb{R}^{m\times d^2}$ which is itself a JL matrix. (2) If $M^{(i)}\in\mathbb{R}^{m\times md}$ are independent JL matrices, then $M^{(1)}(x\otimes(M^{(2)}y\otimes\dots))=M(x\otimes y\otimes\dots)$ for some $M\in\mathbb{R}^{m\times d^c}$ which is itself a JL matrix. Combining these two results gives an efficient sketch for tensors of any size.
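The algebraic identity behind lemma (1) can be checked numerically. The sketch below is illustrative and not the paper's construction: it assumes unscaled random sign (Rademacher) matrices as stand-ins for the JL matrices Q and Q', and the helper names (`kron`, `sketch_pair`, `implicit_matrix`) are hypothetical. It verifies only that the element-wise product of Qx and Q'y equals M applied to the tensor product of x and y, where row i of M is the tensor product of row i of Q with row i of Q'. It does not test the JL (norm-preservation) property itself.

```python
import random


def kron(x, y):
    """Tensor product as a flat vector: asvec(x y^T) = [x1*y1, x1*y2, ..., xn*ym]."""
    return [xi * yj for xi in x for yj in y]


def matvec(A, v):
    """Plain matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def sign_matrix(m, d, rng):
    """m x d matrix of random +/-1 entries (assumption: an actual JL matrix
    would also be scaled, e.g. by 1/sqrt(m); omitted to keep integer arithmetic)."""
    return [[rng.choice((-1, 1)) for _ in range(d)] for _ in range(m)]


def sketch_pair(Q, Qp, x, y):
    """Element-wise product of Qx and Q'y; never materialises the d^2-long tensor."""
    return [a * b for a, b in zip(matvec(Q, x), matvec(Qp, y))]


def implicit_matrix(Q, Qp):
    """The m x d^2 matrix M whose i-th row is (row i of Q) tensor (row i of Q')."""
    return [kron(q, qp) for q, qp in zip(Q, Qp)]


rng = random.Random(0)
m, d = 4, 3
Q = sign_matrix(m, d, rng)
Qp = sign_matrix(m, d, rng)
x = [1, -2, 3]
y = [4, 0, -5]

fast = sketch_pair(Q, Qp, x, y)                      # O(m d) work
slow = matvec(implicit_matrix(Q, Qp), kron(x, y))    # O(m d^2) work
assert fast == slow
```

The fast path touches only the two length-d inputs, while the explicit path builds both the d²-long tensor and the m × d² matrix. This gap is what makes it possible to apply the sketch to tensors without expanding them.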