Universal guarantees for decision tree induction via a higher-order splitting criterion

Abstract

We propose a simple extension of top-down decision tree learning heuristics such as ID3, C4.5, and CART. Our algorithm achieves provable guarantees for all target functions $f : \{-1,1\}^n \to \{-1,1\}$ with respect to the uniform distribution, circumventing impossibility results showing that existing heuristics fare poorly even for simple target functions. The crux of our extension is a new splitting criterion that takes into account the correlations between $f$ and small subsets of its attributes. The splitting criteria of existing heuristics (e.g. Gini impurity and information gain), in contrast, are based solely on the correlations between $f$ and its individual attributes. Our algorithm satisfies the following guarantee: for all target functions $f : \{-1,1\}^n \to \{-1,1\}$, sizes $s \in \mathbb{N}$, and error parameters $\varepsilon$, it constructs a decision tree of size $s^{O((\log s)^2/\varepsilon^2)}$ that achieves error $O(\mathrm{opt}_s) + \varepsilon$, where $\mathrm{opt}_s$ denotes the error of the optimal size-$s$ decision tree. A key technical notion that drives our analysis is the noise stability of $f$, a well-studied smoothness measure.
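
The contrast between correlations with individual attributes and correlations with small subsets of attributes can be illustrated with a small sketch. The code below is not the paper's splitting criterion, which the abstract does not specify. The helper names `correlation` and `subset_score`, and the choice of summing squared correlations, are assumptions made purely for illustration. It computes exact correlations under the uniform distribution over $\{-1,1\}^n$ for the parity of the first two attributes, a simple target on which every single-attribute correlation vanishes.

```python
from itertools import combinations, product
from math import prod


def correlation(f, n, subset):
    """Exact E[f(x) * prod_{i in subset} x_i] for x uniform over {-1,1}^n."""
    total = 0
    for x in product((-1, 1), repeat=n):
        total += f(x) * prod(x[i] for i in subset)
    return total / 2 ** n


def subset_score(f, n, i, max_size):
    """Hypothetical score for attribute i (an illustrative assumption):
    the sum of squared correlations between f and every attribute subset
    that contains i and has at most max_size elements."""
    others = [j for j in range(n) if j != i]
    score = 0.0
    for k in range(max_size):  # k = number of extra attributes besides i
        for rest in combinations(others, k):
            score += correlation(f, n, (i,) + rest) ** 2
    return score


def parity_of_first_two(x):
    """Target function: x_0 * x_1, a parity on two attributes."""
    return x[0] * x[1]


n = 3
# Subsets of size 1 only: correlations with individual attributes.
first_order = [subset_score(parity_of_first_two, n, i, 1) for i in range(n)]
# Subsets of size at most 2: pairwise correlations are also counted.
second_order = [subset_score(parity_of_first_two, n, i, 2) for i in range(n)]

print(first_order)   # [0.0, 0.0, 0.0]
print(second_order)  # [1.0, 1.0, 0.0]
```

With single attributes only, all three scores are zero, so a score of this kind cannot tell the relevant attributes from the irrelevant one. Once pairs are included, attributes 0 and 1 stand out and attribute 2 does not.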
