Universal Numerical Encoder and Profiler Reduces Computing's Memory Wall with Software, FPGA, and SoC Implementations
Abstract
In the multicore era, the time to computational results is increasingly determined by how quickly operands are accessed by cores, rather than by the speed of computation per operand. From high-performance computing (HPC) to mobile application processors, low multicore utilization rates result from the slowness of accessing off-chip operands, i.e., the memory wall. The APplication AXcelerator (APAX) universal numerical encoder reduces computing's memory wall by compressing numerical operands (integers and floats), thereby decreasing CPU access time by 3:1 to 10:1 as operands stream between memory and cores. APAX encodes numbers using a low-complexity algorithm designed both for time series sensor data and for multi-dimensional data, including images. APAX encoding parameters are determined by a profiler that quantifies the uncertainty inherent in numerical datasets and recommends encoding parameters reflecting this uncertainty. Compatible software, FPGA, and system-on-chip (SoC) implementations efficiently support encoding rates between 150 MByte/sec and 1.5 GByte/sec at low power. On 25 integer and floating-point datasets, we achieved compression ratios between 3:1 and 10:1, with average correlation of 0.999959, while accelerating computational "time to results."
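The two figures of merit quoted in the abstract, compression ratio and correlation between original and decoded data, can be illustrated with a small sketch. This is not the APAX algorithm, which the abstract does not specify: the encoder below is a plain uniform quantizer used as a stand-in, the source samples are assumed to be 16 bits wide, and every name (`quantize`, `bits_per_sample`, `pearson`) is hypothetical.

```python
import math


def quantize(samples, step):
    """Stand-in lossy encoder: map each sample to an integer index (NOT APAX)."""
    return [round(x / step) for x in samples]


def dequantize(indices, step):
    """Reconstruct approximate sample values from quantizer indices."""
    return [i * step for i in indices]


def bits_per_sample(indices):
    """Fixed-width bits needed to store every index in the observed range."""
    levels = max(indices) - min(indices) + 1
    return max(1, math.ceil(math.log2(levels)))


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Synthetic sensor-like signal; assumed to be stored as 16-bit samples.
ORIGINAL_BITS = 16
signal = [1000.0 * math.sin(2 * math.pi * i / 64) for i in range(1024)]

indices = quantize(signal, step=8.0)
decoded = dequantize(indices, step=8.0)

ratio = ORIGINAL_BITS / bits_per_sample(indices)
correlation = pearson(signal, decoded)

print(f"compression ratio: {ratio:.1f}:1")
print(f"correlation: {correlation:.6f}")
```

A larger `step` raises the compression ratio and lowers the correlation. That trade-off is the kind of choice the abstract assigns to the profiler, which recommends encoding parameters that reflect the uncertainty already present in the data.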