Full-conformal novelty detection

Abstract

This paper presents a flexible, nonparametric methodology for full-data novelty detection that offers distribution-free false discovery rate (FDR) control guarantees. Building on the full conformal inference framework and the concept of e-values, we introduce full conformal e-values to quantify evidence for novelty relative to a given reference dataset. These e-values are then passed to carefully crafted multiple testing procedures that identify a set of novel out-of-sample units with provable finite-sample FDR control. We showcase several instantiations of e-values, including those that employ a data-driven model selection strategy to amplify power. Furthermore, we extend the framework to address distribution shift, accommodating scenarios where novelty detection must be performed on data drawn from a distribution shifted relative to the reference dataset. In all settings, our method achieves high power, outperforming existing novelty detection methods even with limited amounts of reference data; this is illustrated by empirical evaluations on synthetic data and an application to a malicious LLM prompts dataset.
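To make the step from e-values to a rejection set concrete, here is a minimal sketch of the e-BH procedure, a standard way to control the FDR given one e-value per test unit. It is an illustration only: the abstract does not say which multiple testing procedure the paper uses, so e-BH is an assumption here, and the function name `e_bh` and the input e-values are hypothetical rather than the paper's full conformal e-values.

```python
def e_bh(e_values, alpha):
    """Return the sorted indices rejected by e-BH at target FDR level alpha.

    Assumption: e_values[i] is a valid e-value for the null hypothesis
    "unit i is not novel". How such e-values are built is not shown here.
    """
    n = len(e_values)
    # Rank units from largest to smallest e-value.
    order = sorted(range(n), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # The k-th largest e-value passes if k * e_(k) >= n / alpha.
        if k * e_values[idx] >= n / alpha:
            k_star = k  # keep the largest k that passes
    # Reject the k_star units with the largest e-values.
    return sorted(order[:k_star])


# Hypothetical e-values for five test units.
print(e_bh([60.0, 0.5, 30.0, 1.2, 8.0], alpha=0.1))
```

With these made-up inputs the threshold is n / alpha = 50, so the units with e-values 60 and 30 are flagged (1 × 60 and 2 × 30 both reach 50) and the remaining three are not.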
