Minimax Bounds for Distributed Logistic Regression

Abstract

We consider a distributed logistic regression problem where labeled data pairs (X_i, Y_i) ∈ R^d × {−1, 1} for i = 1, …, n are distributed across multiple machines in a network and must be communicated to a centralized estimator using at most k bits per labeled pair. We assume that the data X_i come independently from some distribution P_X, and that the distribution of Y_i conditioned on X_i follows a logistic model with some parameter θ ∈ R^d. By using a Fisher information argument, we give minimax lower bounds for estimating θ under different assumptions on the tail of the distribution P_X. We consider both ℓ2 and logistic losses, and show that for the logistic loss our sub-Gaussian lower bound is order-optimal and cannot be improved.
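To make the data model concrete, the following is a minimal sketch of drawing labeled pairs from a logistic model and evaluating a logistic loss. Several choices here are assumptions not taken from the abstract: the features are standard Gaussian (one example of a sub-Gaussian P_X), the model is parameterized as P(Y = 1 | X = x) = 1 / (1 + exp(−⟨θ, x⟩)), the loss is the usual log(1 + exp(−y⟨θ, x⟩)), and the names `sample_logistic_pair` and `logistic_loss` are hypothetical. The k-bit communication constraint is not modeled.

```python
import math
import random


def sample_logistic_pair(theta, rng):
    """Draw one (x, y) pair with x ~ N(0, I_d) and y in {-1, +1}.

    Assumption: P(y = +1 | x) = 1 / (1 + exp(-<theta, x>)).
    """
    x = [rng.gauss(0.0, 1.0) for _ in theta]
    margin = sum(t * xi for t, xi in zip(theta, x))
    p_plus = 1.0 / (1.0 + math.exp(-margin))
    y = 1 if rng.random() < p_plus else -1
    return x, y


def logistic_loss(theta, x, y):
    """Return log(1 + exp(-y * <theta, x>)), computed in a numerically stable way."""
    z = -y * sum(t * xi for t, xi in zip(theta, x))
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


rng = random.Random(0)           # fixed seed so the run is repeatable
theta = [1.0, -0.5, 0.25]        # illustrative parameter, d = 3
data = [sample_logistic_pair(theta, rng) for _ in range(1000)]
avg_loss = sum(logistic_loss(theta, x, y) for x, y in data) / len(data)
print(f"average logistic loss at the true parameter: {avg_loss:.4f}")
```

In the setting of the abstract, each of these pairs would additionally have to be encoded into at most k bits before reaching the central estimator; that encoding step is deliberately left out here.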
