Picturing Perceptions: An Open-Source Toolkit to Uncover Bias in Humans and Machines
Abstract
Bias in human judgment and artificial intelligence systems poses critical challenges across consequential domains such as hiring, lending, and criminal justice. However, traditional bias measurement tools face fundamental limitations: they struggle to capture intersectional identities, cannot evaluate AI systems, lack grounding in demographic reality, and remain vulnerable to social desirability effects. We introduce PictoPercept, an open-source toolkit that measures bias through visual forced-choice comparisons grounded in population-level benchmarks. Participants view pairs of normed facial photographs and judge which person is more likely to have higher earnings, and their selections are compared against actual U.S. Bureau of Labor Statistics data. We validate PictoPercept with a nationally representative sample of 283 American adults and assess GPT-5, a mainstream generative model, using identical stimuli. Our study reveals three key findings. First, participants dramatically underestimate Asian American earnings even though this group has the highest actual earnings, while overestimating Latino male and White male earnings. Second, ingroup favoritism is not universal: White male participants show clear ingroup bias, whereas Asian participants underestimate their own group's earnings. Third, GPT-5 exhibits substantially stronger biases than humans, with stark systematic underestimation of all female groups. These findings suggest that PictoPercept enables unified bias assessment across human and AI systems while revealing systematic misperceptions that diverge from demographic reality.
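To make the comparison against a benchmark concrete, the following is a minimal sketch of one way forced-choice selections could be scored against a population-level earnings ranking. It is not PictoPercept's actual scoring procedure, which the abstract does not specify. The function name `perception_gaps`, the trial format, the group labels `A`, `B`, `C`, and the benchmark values are all hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def perception_gaps(trials, benchmark):
    """Score forced-choice trials against a benchmark ranking (illustrative only).

    trials:    iterable of (group_a, group_b, chosen_group) tuples.
    benchmark: dict mapping group label -> benchmark earnings value.

    Returns a dict mapping each group to (observed - expected) selection rate,
    where "expected" assumes a respondent who always picks the group with the
    higher benchmark value. Positive means overestimated, negative means
    underestimated relative to the benchmark.
    """
    shown = defaultdict(int)       # trials in which the group appeared
    picked = defaultdict(int)      # trials in which the group was chosen
    expected = defaultdict(float)  # trials the benchmark says it should win

    for a, b, chosen in trials:
        if chosen not in (a, b):
            raise ValueError(f"choice {chosen!r} is not in pair ({a!r}, {b!r})")
        shown[a] += 1
        shown[b] += 1
        picked[chosen] += 1
        if benchmark[a] > benchmark[b]:
            expected[a] += 1
        elif benchmark[b] > benchmark[a]:
            expected[b] += 1
        else:
            # Ties in the benchmark split the expected win evenly.
            expected[a] += 0.5
            expected[b] += 0.5

    return {g: (picked[g] - expected[g]) / shown[g] for g in shown}


# Hypothetical placeholder values: only the ordering A > B > C matters here.
benchmark = {"A": 3, "B": 2, "C": 1}
trials = [
    ("A", "B", "B"),  # benchmark favours A, respondent picked B
    ("A", "C", "A"),
    ("B", "C", "B"),
    ("A", "B", "A"),
]
gaps = perception_gaps(trials, benchmark)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

Under this scoring, the same function could be applied to human responses and to a model's responses on identical stimuli, so that the two sets of gaps are directly comparable.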