Predictive PAC learnability: a paradigm for learning from exchangeable input data
Abstract
Exchangeable random variables form an important and well-studied generalization of i.i.d. variables; however, simple examples show that no nontrivial concept or function classes are PAC learnable under general exchangeable data inputs X_1, X_2, …. Inspired by the work of Berti and Rigo on a Glivenko–Cantelli theorem for exchangeable inputs, we propose a new paradigm, adequate for learning from exchangeable data: predictive PAC learnability. A learning rule L for a function class F is predictive PAC if for every ε, δ > 0 and each function f ∈ F, whenever |σ| ≥ s(δ, ε), we have with confidence 1-δ that the expected difference between f(X_{n+1}) and the image of f|σ under L does not exceed ε conditionally on X_1, X_2, …, X_n. Thus, instead of learning the function f as such, we are learning to a given accuracy the predictive behaviour of f at the future points X_i(ω), i > n, of the sample path. Using de Finetti's theorem, we show that if a universally separable function class F is distribution-free PAC learnable under i.i.d. inputs, then it is distribution-free predictive PAC learnable under exchangeable inputs, with a slightly worse sample complexity.
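The role of de Finetti's theorem can be illustrated with a small simulation. The following is a minimal sketch, not taken from the paper: it assumes the simplest possible exchangeable sequence, a mixture of i.i.d. Bernoulli sequences with a hidden random parameter, and the names `sample_exchangeable`, `freq_past`, and `freq_future` are hypothetical. It shows that the observed part of a sample path is informative about the future points of the same path, even though the hidden parameter changes from path to path.

```python
import random


def sample_exchangeable(n, rng):
    """Draw n exchangeable {0,1}-valued variables as a mixture of i.i.d. sequences.

    Illustrative assumption: a hidden parameter p is drawn once, uniformly
    on [0, 1], and the variables are then i.i.d. Bernoulli(p) given p.
    """
    p = rng.random()  # hidden random parameter, fixed along the whole path
    xs = [1 if rng.random() < p else 0 for _ in range(n)]
    return p, xs


rng = random.Random(0)  # fixed seed so the run is reproducible
p, xs = sample_exchangeable(20000, rng)

# Split one sample path into an observed part and a "future" part.
past, future = xs[:10000], xs[10000:]
freq_past = sum(past) / len(past)
freq_future = sum(future) / len(future)

print(f"hidden p = {p:.3f}")
print(f"frequency on the observed part = {freq_past:.3f}")
print(f"frequency on the future part   = {freq_future:.3f}")
```

In this toy setting the two frequencies agree closely with each other and with the hidden parameter of that particular path, which matches the abstract's point that what is learned is the predictive behaviour along the sample path.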