More Clustering Quality Metrics for ABCDE

Abstract

ABCDE is a technique for evaluating clusterings of very large populations of items. Given two clusterings, namely a Baseline clustering and an Experiment clustering, ABCDE can characterize their differences with impact and quality metrics, and thus help to determine which clustering to prefer. We previously described the basic quality metrics of ABCDE, namely the GoodSplitRate, BadSplitRate, GoodMergeRate, BadMergeRate and DeltaPrecision, and how to estimate them on the basis of human judgements. This paper extends that treatment with more quality metrics. It describes a technique that aims to characterize the DeltaRecall of the clustering change. It introduces a new metric, called IQ, to characterize the degree to which the clustering diff translates into an improvement in the quality. Ideally, a large diff would improve the quality by a large amount. Finally, this paper mentions ways to characterize the absolute Precision and Recall of a single clustering with ABCDE.

0

Turn this paper into a full lesson

ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.

Discussion (0)

Sign in to join the discussion.

Loading comments…