Reducing Nearest Neighbor Training Sets Optimally and Exactly
Abstract
In nearest-neighbor classification, a training set P of points in Rd with given classification is used to classify every point in Rd: Every point gets the same classification as its nearest neighbor in P. Recently, Eppstein [SOSA'22] developed an algorithm to detect the relevant training points, those points p∈ P, such that P and P\p\ induce different classifications. We investigate the problem of finding the minimum cardinality reduced training set P'⊂eq P such that P and P' induce the same classification. We show that the set of relevant points is such a minimum cardinality reduced training set if P is in general position. Furthermore, we show that finding a minimum cardinality reduced training set for possibly degenerate P is in P for d=1, and NP-complete for d≥ 2.
Turn this paper into a lesson
ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.