Efficient Elastic Net Regularization for Sparse Linear Models

Abstract

This paper presents an algorithm for efficient training of sparse linear models with elastic net regularization. Extending previous work on delayed updates, the new algorithm applies stochastic gradient updates to non-zero features only, bringing weights current as needed with closed-form updates. Closed-form delayed updates for the ℓ₁, ℓ∞, and rarely used ℓ₂ regularizers have been described previously. This paper provides closed-form updates for the popular squared norm ℓ₂² and elastic net regularizers. We provide dynamic programming algorithms that perform each delayed update in constant time. The new ℓ₂² and elastic net methods handle both fixed and varying learning rates, and both standard stochastic gradient descent (SGD) and forward backward splitting (FoBoS). Experimental results show that on a bag-of-words dataset with 260,941 features, but only 88 nonzero features on average per training example, the dynamic programming method trains a logistic regression classifier with elastic net regularization over 2000 times faster than otherwise.
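The idea of a delayed update can be illustrated with a minimal sketch, which is not the paper's own algorithm. It assumes a fixed learning rate and a truncated subgradient step in which a weight whose feature is zero in the current example is only shrunk by the regularizer: multiplied by (1 - eta * lam2) for the squared ℓ2 penalty, reduced in magnitude by eta * lam1 for the ℓ1 penalty, and clipped at zero. The names dense_step, delayed_update, eta, lam1, lam2 and k are hypothetical and chosen for illustration. Under these assumptions, k skipped steps collapse into one constant-time expression through a geometric series.

```python
import math


def dense_step(w, eta, lam1, lam2):
    """One regularization-only step for a weight whose feature is zero.

    Assumed rule (illustrative, not taken from the paper): shrink the
    magnitude by the squared-l2 factor, subtract the l1 term, clip at zero.
    """
    shrunk = abs(w) * (1.0 - eta * lam2) - eta * lam1
    return math.copysign(max(0.0, shrunk), w)


def delayed_update(w, k, eta, lam1, lam2):
    """Apply k skipped regularization-only steps in constant time.

    Without clipping, k steps give
        |w| * a**k - eta * lam1 * (1 + a + ... + a**(k - 1)),  a = 1 - eta * lam2.
    Clipping once at the end is equivalent to clipping at every step,
    because a weight that reaches zero stays at zero.
    Assumes 0 <= eta * lam2 < 1.
    """
    a = 1.0 - eta * lam2
    if eta * lam2 == 0.0:
        geometric_sum = float(k)  # pure l1 case: the series is 1 + 1 + ... + 1
    else:
        geometric_sum = (1.0 - a ** k) / (1.0 - a)
    shrunk = abs(w) * a ** k - eta * lam1 * geometric_sum
    return math.copysign(max(0.0, shrunk), w)


def dense_reference(w, k, eta, lam1, lam2):
    """Reference: apply the per-step rule k times, costing O(k)."""
    for _ in range(k):
        w = dense_step(w, eta, lam1, lam2)
    return w
```

In a sparse trainer, one would typically store for each weight the step at which it was last brought current, and call delayed_update with k equal to the number of steps since then only when the feature next appears in an example. Varying learning rates need cumulative sums or products in place of the geometric series, which is where the dynamic programming mentioned in the abstract comes in.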
