Decentralized EM Algorithm for Gaussian Mixtures under Data Heterogeneity and Partial Labeling

Hansheng Wang

Decentralized EM Algorithm for Gaussian Mixtures under Data Heterogeneity and Partial Labeling

Abstract

We systematically study several network-based Expectation-Maximization (EM) algorithms for the Gaussian mixture model within decentralized federated learning (DFL). Our theoretical investigation shows that directly extending the classic EM algorithm to DFL leads to a biased estimator when data are heterogeneously distributed across sites. To address this, we introduce a momentum network EM (MNEM) algorithm, which integrates information from both current and historical estimators from previous DFL iterations. We further develop a semi-supervised MNEM (semi-MNEM) algorithm, which utilizes information provided by partially labeled data. Rigorous theoretical analysis demonstrates that the MNEM estimator can achieve the same asymptotic efficiency as the whole-sample estimator under appropriate regularity conditions, even with heterogeneous data. Moreover, the semi-MNEM estimator significantly improves the convergence speed of the MNEM algorithm, even if different mixture components are poorly separated. Extensive simulations are conducted, and a widely used chest X-ray dataset is analyzed to demonstrate the finite-sample performance of the proposed methods.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Or compile a full topic from this idea

Discussion (0)

Sign in to join the discussion.

Loading comments…