One-Bit Quantization and Sparsification for Multiclass Linear Classification with Strong Regularization

Abstract

We study the use of linear regression for multiclass classification in the over-parametrized regime where some of the training data is mislabeled. In such scenarios it is necessary to add an explicit regularization term, λ f(w), for some convex function f(·), to avoid overfitting the mislabeled data. In our analysis, we assume that the data is sampled from a Gaussian Mixture Model with equal class sizes, and that a proportion c of the training labels is corrupted for each class. Under these assumptions, we prove that the best classification performance is achieved when f(·) = \|·\|_2^2 and λ → ∞. We then proceed to analyze the classification errors for f(·) = \|·\|_1 and f(·) = \|·\|_∞ in the large λ regime and notice that it is often possible to find sparse and one-bit solutions, respectively, that perform almost as well as the one corresponding to f(·) = \|·\|_2^2.
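
As a rough illustration of the setting described in the abstract, the following sketch simulates squared-ℓ2-regularized linear regression on one-hot labels with Gaussian mixture data and a fraction c of corrupted labels per class. It is not the paper's code or analysis. The dimensions, the mean scaling, the identity covariance, the uniform label-flipping rule, and all function names (`sample_gmm`, `corrupt_labels`, `ridge_fit`, `run_demo`) are assumptions chosen for the example. The sketch also checks numerically that for very large λ the ridge solution aligns with the direction of X^T Y, since (X^T X + λI)^{-1} X^T Y ≈ X^T Y / λ when λ dominates the spectrum of X^T X.

```python
import numpy as np


def sample_gmm(n, means, rng):
    """Draw n points from a Gaussian mixture with identity covariance.

    Classes are assigned round-robin so class sizes are (nearly) equal.
    """
    k, d = means.shape
    labels = np.arange(n) % k
    X = means[labels] + rng.standard_normal((n, d))
    return X, labels


def corrupt_labels(labels, k, c, rng):
    """Flip a fraction c of the labels in each class to a different class.

    Assumption: the wrong class is chosen uniformly among the other k - 1.
    """
    noisy = labels.copy()
    for cls in range(k):
        idx = np.flatnonzero(labels == cls)
        flip = rng.choice(idx, size=int(c * len(idx)), replace=False)
        offsets = rng.integers(1, k, size=len(flip))  # values in 1..k-1
        noisy[flip] = (cls + offsets) % k
    return noisy


def ridge_fit(X, Y, lam):
    """Solve min_W ||XW - Y||_F^2 + lam * ||W||_F^2 in closed form."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)


def accuracy(W, X, labels):
    """Classify by the largest linear score and compare to true labels."""
    return float(np.mean(np.argmax(X @ W, axis=1) == labels))


def run_demo(seed=0, n=200, d=400, k=3, c=0.2, lam=1e6):
    rng = np.random.default_rng(seed)
    # Hypothetical class means with norm roughly 4 (an assumed signal level).
    means = 4.0 * rng.standard_normal((k, d)) / np.sqrt(d)

    X_train, y_train = sample_gmm(n, means, rng)   # d > n: over-parametrized
    y_noisy = corrupt_labels(y_train, k, c, rng)
    Y = np.eye(k)[y_noisy]                         # one-hot targets

    W = ridge_fit(X_train, Y, lam)

    # Evaluate on fresh data with clean labels.
    X_test, y_test = sample_gmm(2000, means, rng)
    acc = accuracy(W, X_test, y_test)

    # Cosine similarity between the large-lambda solution and X^T Y.
    limit = X_train.T @ Y
    cos = float(np.sum(W * limit) / (np.linalg.norm(W) * np.linalg.norm(limit)))
    return acc, cos


if __name__ == "__main__":
    acc, cos = run_demo()
    print(f"test accuracy: {acc:.3f}, cosine to X^T Y: {cos:.6f}")
```

Only the direction of W matters for the argmax decision, so the shrinking norm of W as λ grows does not affect the predicted class. The ℓ1 and ℓ∞ regularizers discussed in the abstract have no closed form and are not covered by this sketch.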
