Generalizability vs. Counterfactual Explainability Trade-Off

Abstract

In this work, we investigate the relationship between model generalization and counterfactual explainability in supervised learning. We introduce the notion of ε-valid counterfactual probability (ε-VCP), the probability of finding perturbations of a data point within its ε-neighborhood that result in a label change. We provide a theoretical analysis of ε-VCP in relation to the geometry of the model's decision boundary, showing that ε-VCP tends to increase with model overfitting. Our findings establish a rigorous connection between poor generalization and the ease of counterfactual generation, revealing an inherent trade-off between generalization and counterfactual explainability. Empirical results validate our theory, suggesting ε-VCP as a practical proxy for quantitatively characterizing overfitting.
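
To make the definition concrete, the following is a minimal Monte Carlo sketch of how such a probability could be estimated for a single data point. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not specify the norm or the sampling distribution, so the sketch assumes an L2 ball of radius `eps` sampled uniformly. The names `sample_in_ball`, `estimate_vcp`, and `linear_predict` are hypothetical, and the toy classifier stands in for any model that maps a batch of points to labels.

```python
import numpy as np


def sample_in_ball(center, eps, n_samples, rng):
    """Draw points uniformly from the L2 ball of radius eps around center.

    Assumption: L2 norm and uniform sampling; the abstract fixes neither.
    """
    d = center.shape[0]
    # Random directions: normalized Gaussian vectors are uniform on the sphere.
    directions = rng.normal(size=(n_samples, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Radii scaled by U^(1/d) so that points are uniform in volume.
    radii = eps * rng.uniform(size=(n_samples, 1)) ** (1.0 / d)
    return center + radii * directions


def estimate_vcp(predict, x, eps, n_samples=2000, seed=0):
    """Estimate the fraction of eps-neighbors of x whose predicted label differs."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    base_label = predict(x[None, :])[0]
    neighbors = sample_in_ball(x, eps, n_samples, rng)
    return float(np.mean(predict(neighbors) != base_label))


def linear_predict(points):
    """Toy classifier (hypothetical): label 1 when the first coordinate is positive."""
    return (points[:, 0] > 0).astype(int)


if __name__ == "__main__":
    # A point far from the decision boundary and a point right next to it.
    print(estimate_vcp(linear_predict, [1.0, 0.0], eps=0.5))
    print(estimate_vcp(linear_predict, [1e-6, 0.0], eps=0.5))
```

In this toy setting, a point whose neighborhood never reaches the decision boundary yields an estimate of exactly zero, while a point sitting almost on the boundary yields an estimate near one half. A more convoluted boundary, of the kind associated with overfitting, would place more points close to a label change and so raise the estimate on average.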
