Measuring the Impact of Missingness in Traffic Stop Data
Abstract
In this article we explore the data available through the Stanford Open Policing Project. The data consist of information on millions of traffic stops across close to 100 different cities and highway patrols. Using a variety of metrics, we identify that the data is not missing completely at random. Furthermore, we develop ways of quantifying and visualizing missingness trends for different variables across the data sets. Because of the way we have identified the missingness dependence on key variables, imputation is not possible and a sensitivity analysis is required. We perform a sensitivity analysis to extend work done on the outcome test as well as to extend work done on sharp bounds on the average treatment effect. We demonstrate that bias calculations can fundamentally shift depending on the assumptions made about the observations for which the race variable has not been recorded. We suggest ways that our missingness sensitivity analysis can be extended to myriad contexts.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.