Critical properties of the SAT/UNSAT transitions in the classification problem of structured data
Abstract
The classification problem of structured data can be solved with different strategies: a supervised approach, starting from a labeled training set, and an unsupervised one, where only the structure of the patterns in the dataset is used to find a classification compatible with it. The two strategies can be interpreted as extreme cases of a semi-supervised approach to learning multi-view data, which is relevant for applications. In this paper I study, within replica theory, the critical properties of the two storage problems associated with these tasks, in the case of the linear binary classification of doublets of points sharing the same label. While the first approach presents a SAT/UNSAT transition in a (marginally) stable replica-symmetric phase, in the second one the satisfiability line lies in a full replica-symmetry-broken phase. A similar behavior in the problem of learning with a margin is also pointed out.
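To make the two storage problems concrete, here is a minimal Python sketch of the constraints a weight vector must satisfy in each case. It is an illustration under stated assumptions, not the paper's method: the function names, the Gaussian doublets with independent partners, and the least-squares construction of a solution are all hypothetical choices made for this example. The unsupervised condition is read here as "both points of a doublet fall on the same side of the hyperplane".

```python
import numpy as np

def supervised_sat(w, xs, xbars, labels):
    """True if both points of every doublet are classified with the given label."""
    first_ok = labels * (xs @ w) > 0
    second_ok = labels * (xbars @ w) > 0
    return bool(np.all(first_ok) and np.all(second_ok))

def unsupervised_sat(w, xs, xbars):
    """True if the two points of every doublet lie on the same side of the hyperplane."""
    return bool(np.all((xs @ w) * (xbars @ w) > 0))

# Hypothetical toy instance: p doublets in n dimensions, far below capacity.
rng = np.random.default_rng(0)
n, p = 50, 10
xs = rng.standard_normal((p, n))
xbars = rng.standard_normal((p, n))  # assumption: partners drawn independently
labels = rng.choice([-1.0, 1.0], size=p)

# With 2p <= n points in general position, the linear system
# points @ w = targets has an exact solution, so the instance is SAT.
points = np.vstack([xs, xbars])
targets = np.concatenate([labels, labels])
w, *_ = np.linalg.lstsq(points, targets, rcond=None)

print(supervised_sat(w, xs, xbars, labels))
print(unsupervised_sat(w, xs, xbars))
```

Any solution of the supervised problem also satisfies the unsupervised constraints, and so does its negative, since the unsupervised condition only requires the two points of each doublet to agree with each other.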