Selective Risk Certification for LLM Outputs via Information-Lift Statistics: PAC-Bayes, Robustness, and Skeleton Design
Abstract
Large language models often produce confident but incorrect outputs, creating a critical need for reliable uncertainty quantification with formal abstention guarantees. We introduce information-lift certificates that compare model probabilities to a skeleton baseline, accumulating evidence through sub-gamma PAC-Bayes bounds that remain valid under heavy-tailed distributions where standard concentration inequalities fail. On eight diverse datasets, our method achieves 77.0\% coverage at 2\% risk, outperforming recent baselines by 10.0 percentage points on average. In high-stakes scenarios, we block 96\% of critical errors compared to 18-31\% for entropy-based methods. While our frequency-based certification does not guarantee severity-weighted safety and depends on skeleton quality, performance degrades gracefully under distributional shifts, making the approach practical for real-world deployment.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.