Fair and Square: Replacing One Real Multiplication with a Single Square and One Complex Multiplication with Three Squares When Performing Matrix Multiplication and Convolutions
Abstract
This paper shows that, for matrix multiplications and convolutions, it is possible to asymptotically replace each real multiplication with a single squaring operation. Similarly, a single complex multiplication can be replaced with three squaring operations. Given that an n-bit squaring circuit requires about half the gate count of an n×n multiplier, this results in large resource reductions. With some caveats, the same techniques can be applied to dot products, transformations, and more. A variety of architectures implementing these ideas in hardware are described, including square-based systolic arrays and tensor cores.
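The abstract does not spell out the identity behind the real-valued claim. One standard way to obtain it, assumed here for illustration and not necessarily the paper's exact formulation, is ab = ((a + b)^2 - a^2 - b^2) / 2. Summed over a dot product, the a^2 terms depend only on a row of A and the b^2 terms only on a column of B, so they can be computed once and reused. The sketch below is a plain-Python illustration for integer matrices; the function name `matmul_by_squares` is hypothetical.

```python
def matmul_by_squares(A, B):
    """Multiply integer matrices A (m x n) and B (n x p) using only
    squarings, additions, subtractions, and a final halving.

    Assumed identity: a*b = ((a + b)**2 - a**2 - b**2) / 2.
    """
    m, n, p = len(A), len(B), len(B[0])

    # Correction terms, computed once: m*n squares for the rows of A
    # and n*p squares for the columns of B.
    row_sq = [sum(a ** 2 for a in row) for row in A]
    col_sq = [sum(B[k][j] ** 2 for k in range(n)) for j in range(p)]

    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # One squaring per (i, j, k) triple: m*n*p squares in total.
            s = sum((A[i][k] + B[k][j]) ** 2 for k in range(n))
            # s - row_sq[i] - col_sq[j] equals twice the dot product,
            # so it is always even and the integer halving is exact.
            C[i][j] = (s - row_sq[i] - col_sq[j]) // 2
    return C


print(matmul_by_squares([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
# prints [[19, 22], [43, 50]]
```

Under this sketch the total is m·n·p + m·n + n·p squarings against m·n·p multiplications for the conventional method, so the ratio of squarings to replaced multiplications tends to one as the matrices grow, which matches the "asymptotically" qualifier in the abstract.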