Automatic Detection and Diagnosis of Biased Online Experiments
Abstract
We have seen massive growth in online experiments at LinkedIn and in industry at large. It is now more important than ever to create an intelligent A/B platform that can truly democratize A/B testing by allowing everyone to make quality decisions, regardless of their skillset. With the extensive knowledge base created around experimentation, we are able to mine historical data and discover the most common causes of biased experiments. In this paper, we share four such common causes and describe how we build automatic detection and diagnosis of these root causes into our A/B testing platform. The root causes are design-imposed bias, self-selection bias, novelty effect, and trigger-day effect. We discuss in detail what each bias is and the scalable algorithm we developed to detect it. Automatically surfacing the existence and root cause of bias for every experiment is an important milestone towards intelligent A/B testing.