Outlier detection and a tail-adjusted boxplot based on extreme value theory

Abstract

Whether an extreme observation is an outlier or not, depends strongly on the corresponding tail behaviour of the underlying distribution. We develop an automatic, data-driven method to identify extreme tail behaviour that deviates from the intermediate and central characteristics. This allows for detecting extreme outliers or sets of extreme data that show less spread than the bulk of the data. To this end we extend a testing method proposed in Bhattacharya et al 2019 for the specific case of heavy tailed models, to all max-domains of attraction. Consequently we propose a tail-adjusted boxplot which yields a more accurate representation of possible outliers. Several examples and simulation results illustrate the finite sample behaviour of this approach.

0

Turn this paper into a lesson

ArcXiv compiles a structured reading guide from this paper's metadata: plain-English importance, contributions, prerequisite concepts, which sections to read first, flashcards, and a quiz. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…