Communication-Optimal Parallel Standard and Karatsuba Integer Multiplication in the Distributed Memory Model

Abstract

We present COPSIM, a parallel implementation of standard integer multiplication for the distributed memory setting, and COPK, a parallel implementation of Karatsuba's fast integer multiplication algorithm for the distributed memory setting. When using $P$ processors, each equipped with a local non-shared memory, to compute the product of two $n$-digit integers, our algorithms achieve, under mild conditions, optimal speedup of the computational time: $O(n^2/P)$ for COPSIM and $O(n^{\log_2 3}/P)$ for COPK. The total amount of memory required across the processors is $O(n)$, that is, within a constant factor of the minimum space required to store the input values. We rigorously analyze the Input/Output (I/O) cost of the proposed algorithms. We show that their bandwidth cost (i.e., the number of memory words sent or received by at least one processor) asymptotically matches the corresponding known I/O lower bounds, and that their latency (i.e., the number of messages sent or received in the algorithm's critical execution path) is asymptotically within a multiplicative factor $O(\log_2^2 P)$ of the corresponding known I/O lower bounds. Hence, our algorithms are asymptotically optimal with respect to the bandwidth cost and almost asymptotically optimal with respect to the latency cost.
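For orientation, the exponent $\log_2 3$ in the COPK bound comes from Karatsuba's recursion, which replaces the four half-size products of the standard method with three, giving $T(n) = 3T(n/2) + O(n)$. The following is a minimal sequential sketch of that recursion in Python. It is not the COPK algorithm and says nothing about how the paper distributes work or data across processors; the function name `karatsuba` and the parameter `cutoff_bits` are hypothetical choices made for this illustration, and the sketch assumes non-negative inputs.

```python
def karatsuba(x: int, y: int, cutoff_bits: int = 64) -> int:
    """Multiply two non-negative integers with Karatsuba's three-product recursion.

    Illustrative sequential sketch only; `cutoff_bits` (an assumed tuning knob)
    sets the operand size below which the built-in product is used.
    """
    # Base case: at least one operand is small enough to multiply directly.
    if x.bit_length() <= cutoff_bits or y.bit_length() <= cutoff_bits:
        return x * y

    # Split both operands at the same bit position: v = v_hi * 2**half + v_lo.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of four.
    low = karatsuba(x_lo, y_lo, cutoff_bits)
    high = karatsuba(x_hi, y_hi, cutoff_bits)
    # (x_lo + x_hi)(y_lo + y_hi) - low - high = x_lo*y_hi + x_hi*y_lo
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff_bits) - low - high

    # Recombine: high * 2**(2*half) + cross * 2**half + low.
    return (high << (2 * half)) + (cross << half) + low
```

A quick sanity check is to compare the result against Python's built-in product, for example `karatsuba(3**500, 7**400) == 3**500 * 7**400`.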
