Optimized-Cost Repair in Multi-hop Distributed Storage Systems with Network Coding
Abstract
In distributed storage systems, reliability is achieved by storing redundancy at different nodes in the network, so that a data collector can reconstruct the source information even when some nodes fail. To maintain reliability, an autonomous and efficient protocol should be used to repair a failed node. The repair process generates traffic, and consequently transmission cost, in the network. Recent results established the optimal traffic-storage tradeoff and proposed regenerating codes that achieve it. We aim to minimize the transmission cost of the repair process. We take the network topology into account in the repair and modify the information flow graphs accordingly. We then analyze the cut requirements and, based on the results, formulate minimum-cost repair as a linear programming problem for linear costs. We show that the solution of the linear program establishes a fundamental lower bound on the repair cost. We also show that this bound is achievable in the minimum-storage regenerating case, using the optimal-cost minimum-storage regenerating (OCMSR) code. We propose surviving node cooperation, which can efficiently reduce the repair cost. Further, we discuss the field size required for the construction of OCMSR codes. We show the gain of optimal-cost repair in tandem, star, grid, and fully connected networks.