Reliable fairness auditing with semi-supervised inference
Abstract
Machine learning (ML) models often exhibit bias that can exacerbate inequities in biomedical applications. Fairness auditing, the process of evaluating a model's performance across subpopulations, is critical for identifying and mitigating these biases. However, audits typically rely on large volumes of labeled data, which are costly and labor-intensive to obtain. To address this challenge, we introduce Infairness, a unified framework for auditing a wide range of fairness criteria using semi-supervised inference. Our approach combines a small labeled dataset with a large unlabeled dataset by imputing missing outcomes via regression with carefully selected nonlinear basis functions. Through extensive theoretical and empirical analyses, we show that our proposed estimator is (i) robust to misspecification of the ML or imputation model and (ii) substantially more efficient than supervised estimation based solely on the labeled data. In two real-world fairness audits using electronic health record and medical imaging data, Infairness reduces variance by approximately 50% compared to supervised estimation, underscoring its value for reliable fairness auditing with limited labeled data.
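The imputation idea can be illustrated with a minimal sketch. It is not the Infairness estimator itself, only a simplified stand-in under stated assumptions: a single subgroup, a true positive rate (TPR) as the fairness metric, and an ordinary least-squares imputation model on a hypothetical basis (intercept, thresholded prediction, cubic polynomial in the model score). The function names, the basis, and the simulated data are all illustrative and do not come from the paper.

```python
import numpy as np

def basis(score, pred):
    # Hypothetical basis functions: intercept, binary prediction,
    # and a cubic polynomial in the model score.
    return np.column_stack([np.ones_like(score), pred, score, score**2, score**3])

def supervised_tpr(y, pred):
    # TPR from labeled data only: P(pred = 1 | y = 1).
    return np.sum(pred * y) / np.sum(y)

def semi_supervised_tpr(score_lab, pred_lab, y_lab, score_unl, pred_unl):
    # Fit the imputation model on the small labeled set ...
    coef, *_ = np.linalg.lstsq(basis(score_lab, pred_lab), y_lab, rcond=None)
    # ... then impute outcomes on the large unlabeled set and plug them in.
    m_unl = basis(score_unl, pred_unl) @ coef
    return np.sum(pred_unl * m_unl) / np.sum(m_unl)

# Simulated data (an assumption for illustration, not from the paper):
# the score is miscalibrated, since P(y = 1 | score) = score**2.
rng = np.random.default_rng(0)
n_lab, n_unl = 500, 20_000

s_lab = rng.uniform(size=n_lab)
y_lab = (rng.uniform(size=n_lab) < s_lab**2).astype(float)
d_lab = (s_lab > 0.5).astype(float)

s_unl = rng.uniform(size=n_unl)  # outcomes are never observed for these rows
d_unl = (s_unl > 0.5).astype(float)

tpr_sup = supervised_tpr(y_lab, d_lab)
tpr_ss = semi_supervised_tpr(s_lab, d_lab, y_lab, s_unl, d_unl)
print(f"supervised TPR:      {tpr_sup:.3f}")
print(f"semi-supervised TPR: {tpr_ss:.3f}")
```

In this simulation the population TPR is 7/8, since it equals the integral of s^2 over (0.5, 1) divided by the integral over (0, 1). Because the basis contains the intercept and the prediction, the least-squares residuals on the labeled data are orthogonal to both, so the imputed numerator and denominator match their labeled-data counterparts on average even when the imputation model is wrong. In an audit, the same calculation would be repeated per subgroup and the group-wise estimates compared.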