The Storage vs Repair Bandwidth Trade-off for Multiple Failures in Clustered Storage Networks
Abstract
We study the trade-off between storage overhead and inter-cluster repair bandwidth in clustered storage systems when recovering from multiple node failures within a cluster. The system consists of n clusters, each a collection of m nodes. For data collection, we download the entire content from any k clusters. For repair of t ≥ 2 nodes within a cluster, we take help from ℓ local nodes, as well as d helper clusters. We characterize the optimal trade-off under functional repair, and also under exact repair for the minimum storage and minimum inter-cluster bandwidth (MBR) operating points. Our bounds show the following interesting facts: 1) when t | (m-ℓ), the trade-off is the same as that under t=1, and thus there is no advantage in jointly repairing multiple nodes; 2) when t ∤ (m-ℓ), the optimal file-size at the MBR point under exact repair can be strictly less than that under functional repair; 3) unlike the case of t=1, increasing the number of local helper nodes does not necessarily increase the system capacity under functional repair.