Metaflow: A DAG-Based Network Abstraction for Distributed Applications
Abstract
Over the past decade, a growing number of network scheduling techniques have been proposed to improve the performance of distributed applications. Flow-level metrics, such as flow completion time (FCT), are built on the abstraction of individual flows and therefore cannot capture the communication semantics of a cluster application. To address this problem, coflow was proposed as a new network abstraction. However, coflow is insufficient to reveal the dependencies between computation and communication. As a result, real application performance can suffer, especially in the absence of hard barriers. Based on the computation DAG of the application, we propose metaflow, an expressive abstraction that sits between the two extremes of flows and coflows. Evaluation results show that metaflow-based scheduling can outperform the coflow-based algorithm by 1.78x.