The Complexity of Finding Local Optima in Contrastive Learning
Abstract
Contrastive learning is a powerful technique for discovering meaningful data representations by optimizing objectives based on contrastive information, often given as a set of weighted triplets $\{(x_i, y_i^+, z_i^-)\}_{i=1}^m$ indicating that an "anchor" $x_i$ is more similar to a "positive" example $y_i$ than to a "negative" example $z_i$. The goal is to find representations (e.g., embeddings in $\mathbb{R}^d$ or a tree metric) where anchors are placed closer to positive than to negative examples. While finding global optima of contrastive objectives is NP-hard, the complexity of finding local optima -- representations that do not improve by local search algorithms such as gradient-based methods -- remains open. Our work settles the complexity of finding local optima in various contrastive learning problems by proving PLS-hardness in discrete settings (e.g., maximize satisfied triplets) and CLS-hardness in continuous settings (e.g., minimize Triplet Loss), where PLS (Polynomial Local Search) and CLS (Continuous Local Search) are well-studied complexity classes capturing local search dynamics in discrete and continuous optimization, respectively. Our results imply that no polynomial time algorithm (local search or otherwise) can find a local optimum for various contrastive learning problems, unless PLS $\subseteq$ P (or CLS $\subseteq$ P for continuous problems). Even in the unlikely scenario that PLS $\subseteq$ P (or CLS $\subseteq$ P), our reductions imply that there exist instances where local search algorithms need exponential time to reach a local optimum, even for $d=1$ (embeddings on a line).
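To make the two kinds of objective concrete, here is a minimal Python sketch for embeddings on a line (d = 1). It is an illustration under stated assumptions, not the paper's construction: the function names, the hinge form of the Triplet Loss with a `margin` parameter, and the neighborhood used by `local_search` (move one point at a time to a position from a finite `grid`) are all hypothetical choices made for this example. The abstract does not specify which neighborhoods or loss variants the hardness results use.

```python
from typing import Dict, List, Tuple

# (anchor, positive, negative, weight); the labels are illustrative.
Triplet = Tuple[str, str, str, float]


def satisfied_weight(pos: Dict[str, float], triplets: List[Triplet]) -> float:
    """Discrete objective: total weight of triplets whose anchor lies
    strictly closer to its positive than to its negative on the line."""
    return sum(
        w for a, p, n, w in triplets
        if abs(pos[a] - pos[p]) < abs(pos[a] - pos[n])
    )


def triplet_loss(pos: Dict[str, float], triplets: List[Triplet],
                 margin: float = 1.0) -> float:
    """Continuous objective: a common hinge form of the Triplet Loss.
    The margin and the use of unsquared distances are assumptions."""
    return sum(
        w * max(0.0, abs(pos[a] - pos[p]) - abs(pos[a] - pos[n]) + margin)
        for a, p, n, w in triplets
    )


def local_search(pos: Dict[str, float], triplets: List[Triplet],
                 grid: List[float]) -> Dict[str, float]:
    """Hypothetical local search: repeatedly move a single point to a grid
    position whenever that strictly increases the satisfied weight.
    Stops at a local optimum of this (assumed) neighborhood."""
    pos = dict(pos)
    improved = True
    while improved:
        improved = False
        for v in list(pos):
            for g in grid:
                cand = dict(pos)
                cand[v] = g
                if satisfied_weight(cand, triplets) > satisfied_weight(pos, triplets):
                    pos = cand
                    improved = True
    return pos
```

The loop always terminates because each accepted move strictly increases the objective and there are finitely many configurations. The hardness results summarized above concern how many such improving steps may be needed in the worst case, not whether the search stops.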