How a Small Amount of Data Sharing Benefits Distributed Optimization and Learning: The Upside of Data Heterogeneity

Yinyu Ye

Abstract

Distributed optimization algorithms are widely used in machine learning. This paper investigates how a small amount of data sharing can improve their performance. Focusing on general linear models, we analyze the effects of data sharing on both primal and primal-dual optimization methods. Our contributions are threefold. First, from a theoretical perspective, we show that minimal data sharing improves algorithmic performance by shifting data from less favorable to more favorable structures. Contrary to the common belief that data heterogeneity is always harmful, we prove that while heterogeneity generally slows convergence in primal methods such as FedAvg and distributed PCG, it can accelerate convergence in primal-dual consensus algorithms like distributed ADMM, Fed-ADMM, and EXTRA by enriching dual dynamics. This reveals a form of duality in how heterogeneity affects different algorithm families. Second, building on this insight, we design a meta-algorithm for minimal data sharing, adaptable to both primal and primal-dual methods. We show that with as little as 1 percent shared data, convergence can be significantly accelerated across machine learning tasks. Finally, we argue from a broader perspective that even limited collaboration can yield large synergies, an idea that transcends the optimization context. Our findings provide both theoretical and practical guidance for improving distributed learning through minimal cooperation and motivate further exploration of cross-agent collaboration in solving complex global learning problems.
