On the combinatorics of the 2-class classification problem

Abstract

A set of points X = XB XR ⊂eq Rd is linearly separable if the convex hulls of XB and XR are disjoint, hence there exists a hyperplane separating XB from XR. Such a hyperplane provides a method for classifying new points, according to which side of the hyperplane the new points lie. When such a linear separation is not possible, it may still be possible to partition XB and XR into prespecified numbers of groups, in such a way that every group from XB is linearly separable from every group from XR. We may also discard some points as outliers, and seek to minimize the number of outliers necessary to find such a partition. Based on these ideas, Bertsimas and Shioda proposed the classification and regression by integer optimization (CRIO) method in 2007. In this work we explore the integer programming aspects of the classification part of CRIO, in particular theoretical properties of the associated formulation. We are able to find facet-inducing inequalities coming from the stable set polytope, hence showing that this classification problem has exploitable combinatorial properties.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…