Conformal prediction under ambiguous ground truth
Abstract
Conformal Prediction (CP) enables rigorous uncertainty quantification by constructing a prediction set C(X) satisfying P(Y ∈ C(X)) ≥ 1-α for a user-chosen α ∈ [0,1], relying on calibration data (X_1,Y_1),...,(X_n,Y_n) from P = P_X P_{Y|X}. It is typically implicitly assumed that P_{Y|X} is the "true" posterior label distribution. However, in many real-world scenarios, the labels Y_1,...,Y_n are obtained by aggregating expert opinions using a voting procedure, resulting in a one-hot distribution P^vote_{Y|X}. For such "voted" labels, CP guarantees therefore hold w.r.t. P^vote = P_X P^vote_{Y|X} rather than the true distribution P. In cases with unambiguous ground truth labels, the distinction between P^vote and P is irrelevant. However, when experts disagree because the labels are ambiguous, approximating P_{Y|X} with a one-hot distribution P^vote_{Y|X} ignores this uncertainty. In this paper, we propose to leverage expert opinions to approximate P_{Y|X} using a non-degenerate distribution P^agg_{Y|X}. We develop Monte Carlo CP procedures which provide guarantees w.r.t. P^agg = P_X P^agg_{Y|X} by sampling multiple synthetic pseudo-labels from P^agg_{Y|X} for each calibration example X_1,...,X_n. In a case study of skin condition classification with significant disagreement among expert annotators, we show that applying CP w.r.t. P^vote under-covers expert annotations: calibrated for 72% coverage, it falls short by 10% on average; our Monte Carlo CP closes this gap both empirically and theoretically.
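To make the sampling idea concrete, the following is a minimal sketch of calibrating a threshold on pseudo-labels drawn from P^agg_{Y|X}. It is an illustration under stated assumptions, not the paper's exact procedure: the function names, the choice of the model's class probability as conformity score, and the standard split-CP quantile applied to the pooled scores are all assumptions made here. The abstract does not specify these details, and the precise coverage level guaranteed when scores are pooled this way is what the paper itself analyses.

```python
import math
import random


def sample_label(probs, rng):
    """Draw one class index from a categorical distribution (hypothetical helper)."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def monte_carlo_threshold(model_probs, agg_probs, alpha, m, seed=0):
    """Calibrate a threshold from m sampled pseudo-labels per calibration example.

    model_probs[i]: the model's predicted class probabilities for example i.
    agg_probs[i]:   the aggregated expert distribution for example i
                    (assumed to be given as a list of class probabilities).
    """
    rng = random.Random(seed)
    scores = []
    for p_model, p_agg in zip(model_probs, agg_probs):
        for _ in range(m):
            y = sample_label(p_agg, rng)
            # Conformity score (an assumption): model probability of the sampled label.
            scores.append(p_model[y])
    scores.sort()
    # Standard split-CP lower quantile, applied here to the pooled n*m scores.
    k = math.floor(alpha * (len(scores) + 1))
    if k < 1:
        return 0.0  # no valid quantile: threshold 0 keeps every class
    return scores[k - 1]


def prediction_set(p_model, tau):
    """Return all classes whose model probability reaches the threshold."""
    return [c for c, p in enumerate(p_model) if p >= tau]
```

With one-hot `agg_probs`, every sampled pseudo-label equals the voted label, so the sketch reduces to ordinary split CP on voted labels with each score repeated m times; with non-degenerate `agg_probs`, ambiguous examples contribute scores for several plausible labels.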