An Improved Sub-Packetization Bound for Minimum Storage Regenerating Codes
Abstract
Distributed storage systems employ codes to provide resilience to failure of multiple storage disks. Specifically, an (n, k) MDS code stores k symbols in n disks such that the overall system is tolerant to a failure of up to n-k disks. However, access to at least k disks is still required to repair a single erasure. To reduce repair bandwidth, array codes are used where the stored symbols or packets are vectors of length ℓ. MDS array codes have the potential to repair a single erasure using a fraction 1/(n-k) of data stored in the remaining disks. We introduce new methods of analysis which capitalize on the translation of the storage system problem into a geometric problem on a set of operators and subspaces. In particular, we ask the following question: for a given (n, k), what is the minimum vector-length or sub-packetization factor ℓ required to achieve this optimal fraction? For exact recovery of systematic disks in an MDS code of low redundancy, i.e. k/n > 1/2, the best known explicit codes [WTB12] have a sub-packetization factor ℓ which is exponential in k. It has been conjectured [TWB12] that for a fixed number of parity nodes, it is in fact necessary for ℓ to be exponential in k. In this paper, we provide a new log-squared converse bound on k for a given ℓ, and prove that k ≤ 2 log_2 ℓ (log_δ ℓ + 1), for an arbitrary number of parity nodes r = n-k, where δ = r/(r-1).
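To get a feel for the size of the stated bound, the following minimal sketch evaluates it numerically. It assumes the bound reads k ≤ 2 log_2 ℓ (log_δ ℓ + 1) with δ = r/(r-1). The function name max_systematic_nodes and the sample values of ℓ and r are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
import math


def max_systematic_nodes(ell: int, r: int) -> float:
    """Evaluate 2 * log2(ell) * (log_delta(ell) + 1) with delta = r / (r - 1).

    Assumption: this is the converse bound on k stated in the abstract.
    ell is the sub-packetization factor, r = n - k the number of parity nodes.
    """
    if ell < 1:
        raise ValueError("ell must be a positive integer")
    if r < 2:
        raise ValueError("r must be at least 2 so that delta = r/(r-1) is defined")
    delta = r / (r - 1)
    log2_ell = math.log2(ell)
    # Change of base: log_delta(ell) = ln(ell) / ln(delta)
    log_delta_ell = math.log(ell) / math.log(delta)
    return 2 * log2_ell * (log_delta_ell + 1)


if __name__ == "__main__":
    # Illustrative values only: r = 2 parity nodes, a few choices of ell.
    for ell in (2, 4, 16, 256):
        print(f"ell={ell:4d}  r=2  bound on k: {max_systematic_nodes(ell, 2):.1f}")
```

For r = 2 we have δ = 2, so the expression reduces to 2 log_2 ℓ (log_2 ℓ + 1); at ℓ = 16 this is 2 · 4 · 5 = 40. Read in the other direction, the bound says that k can grow at most quadratically in log ℓ for fixed r.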