Optimal repair of Reed-Solomon codes: Achieving the cut-set bound
Abstract
Coding for distributed storage gives rise to a new set of problems in coding theory related to the need of reducing inter-node communication in the system. A large number of recent papers addressed the problem of optimizing the total amount of information downloaded for repair of a single failed node (the repair bandwidth) by accessing information on $d$ helper nodes, where $k \le d \le n-1$. By the so-called cut-set bound (Dimakis et al., 2010), the repair bandwidth of an $(n, k=n-r)$ MDS code using $d$ helper nodes is at least $dl/(d+1-k)$, where $l$ is the size of the node. Also, a number of known constructions of MDS array codes meet this bound with equality. In a related but separate line of work, Guruswami and Wootters (2016) studied repair of Reed-Solomon (RS) codes, showing that these codes can be repaired using a smaller bandwidth than under the trivial approach. At the same time, their work as well as follow-up papers stopped short of constructing RS codes (or any scalar MDS codes) that meet the cut-set bound with equality, which has been an open problem in coding theory. In this work we present a solution to this problem, constructing RS codes of length $n$ over the field $\mathbb{F}_{q^l}$, $l = \exp((1+o(1))\, n \log n)$, that meet the cut-set bound. We also prove an almost matching lower bound on $l$, showing that the super-exponential scaling is both necessary and sufficient for achieving the cut-set bound using linear repair schemes. More precisely, we prove that for scalar MDS codes (including the RS codes) to meet this bound, the sub-packetization $l$ must satisfy $l \ge \exp((1+o(1))\, k \log k)$.
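To make the cut-set bound concrete, the following minimal sketch evaluates $dl/(d+1-k)$ and compares it with the trivial approach of downloading $k$ full nodes. The function names and the parameters $(n, k, l) = (14, 10, 4)$ are illustrative assumptions, not taken from the paper, and bandwidth is counted in symbols of the base field.

```python
from fractions import Fraction


def cut_set_bound(n, k, d, l):
    """Lower bound d*l/(d+1-k) on the repair bandwidth of an (n, k) MDS code.

    n, k: code length and dimension; d: number of helper nodes;
    l: node size (sub-packetization). Names are hypothetical.
    """
    if not (k <= d <= n - 1):
        raise ValueError("need k <= d <= n-1")
    return Fraction(d * l, d + 1 - k)


def trivial_repair_bandwidth(k, l):
    """Trivial repair: download the full contents of k nodes."""
    return k * l


# Illustrative parameters (an assumption, not from the paper).
n, k, l = 14, 10, 4
for d in range(k, n):
    print(f"d={d}: cut-set bound = {cut_set_bound(n, k, d, l)}, "
          f"trivial = {trivial_repair_bandwidth(k, l)}")
```

With $d = k$ the bound equals the trivial bandwidth $kl$, and it decreases as more helper nodes are contacted, reaching its minimum at $d = n-1$.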