Extending TCP for Accelerating Replication on Cluster File Systems over SDNs

Abstract

This paper explores the changes required of TCP to efficiently support cluster file systems, such as the Hadoop Distributed File System (HDFS), whose storage nodes are connected through a software-defined network (SDN). Traditional chain replication in these file systems incurs large delays and uses the network inefficiently. An SDN can cooperate with the cluster file system to address these problems by pre-arranging a distribution tree, which opens the possibility of parallel replication. Unfortunately, this cannot be realized without extending TCP to accommodate parallel transfer at the transport layer. This paper discusses how to extend TCP to make this possible and demonstrates the feasibility by implementing a prototype in the Linux kernel. The prototype reduces data replication time by 25% while substantially reducing network use.
