$\gamma$-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning

Zhi Wang


Abstract

Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-k, exhibit high computational complexity, up to $O(d \log_2 k)$, where $d$ is the number of model parameters. The hard-threshold compressor, which simply transmits elements whose absolute values exceed a fixed threshold, has thus been proposed to reduce the complexity to $O(d)$. However, hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize γ decreases to ensure model convergence. The decaying stepsize shrinks the updates, so the compression ratio of hard-threshold compression drops rapidly to an aggressive ratio. At or below this ratio, model accuracy has been observed to degrade severely. To address this, we propose γ-FedHT, a stepsize-aware, low-cost compressor with Error-Feedback to guarantee convergence. Given that the traditional theoretical framework of FL does not consider Error-Feedback, we introduce the fundamental conversation of Error-Feedback. We prove that γ-FedHT has a convergence rate of $O(1/T)$ ($T$ representing total training iterations) under μ-strongly convex cases and $O(1/\sqrt{T})$ under non-convex cases, the same as FedAVG. Extensive experiments demonstrate that γ-FedHT improves accuracy by up to 7.42% over Top-k under equal communication traffic on various non-IID image datasets.
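To make the compressors named in the abstract concrete, here is a minimal NumPy sketch of a fixed hard-threshold compressor, a Top-k compressor, and a generic Error-Feedback step. It is not the authors' γ-FedHT: the abstract does not specify the stepsize-aware rule, so none is shown. The function names, the threshold `lam`, the synthetic Gaussian gradient, and the stepsize values are all assumptions chosen for illustration. The last lines show how the fraction of transmitted entries collapses under a fixed threshold as γ shrinks.

```python
import numpy as np

def hard_threshold(x, lam):
    """Keep entries with |x_i| > lam; a single O(d) pass over the vector."""
    mask = np.abs(x) > lam
    return x * mask

def top_k(x, k):
    """Keep the k largest-magnitude entries (requires a selection step)."""
    out = np.zeros_like(x)
    if k > 0:
        idx = np.argpartition(np.abs(x), -k)[-k:]
        out[idx] = x[idx]
    return out

def compress_with_error_feedback(update, residual, lam):
    """Add the carried-over residual, compress, and keep what was dropped."""
    corrected = update + residual
    sent = hard_threshold(corrected, lam)
    return sent, corrected - sent

# Synthetic stand-in for a gradient (assumption: standard normal entries).
rng = np.random.default_rng(0)
grad = rng.standard_normal(10_000)
lam = 0.05  # illustrative fixed threshold

# Fraction of entries transmitted for each (illustrative) stepsize gamma.
density = {
    gamma: np.count_nonzero(hard_threshold(gamma * grad, lam)) / grad.size
    for gamma in (0.1, 0.01, 0.001)
}
```

With a fixed `lam`, scaling the update by a smaller γ pushes more entries below the threshold, which is the rapid drop in compression ratio the abstract describes. The residual returned by `compress_with_error_feedback` is what an Error-Feedback scheme would add back to the next round's update.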
