The dangers of using three-number summaries to estimate unknown standard deviations: sensitivity analyses and some possible improvements incorporating shape

Abstract

In recent years, there has been much progress toward the development of methods for converting three- and five-number summary statistics (i.e. minimum, maximum, median, and quartiles) to means and standard deviations (SDs). This is commonly done in the meta-analysis setting, where some studies report means and SDs, while others report quantile summaries. However, we show that three-number summaries, which are the most common, do not contain enough information to reliably estimate SDs. We show that very poor estimates can result, which may invalidate any inference, and we provide details of a sensitivity analysis that can give researchers greater confidence in their results or highlight potential sources of bias. We further explore whether nominating additional information can tell us enough about the unknown data shape to improve SD estimation, and in doing so introduce a new estimator using the scaled Beta distribution. Simulations and a real data example are used to highlight the advantages and disadvantages of this approach. A Web application is also provided to help researchers perform sensitivity analyses.
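To illustrate the central claim, the following minimal sketch builds two small samples that share the same three-number summary (minimum, median, maximum) yet have clearly different SDs. The sample values and the helper `three_number_summary` are hypothetical and chosen only for illustration. The range/4 calculation is a generic rule of thumb, included to show that any estimator based on the summary alone must return the same value for both samples. It is not the estimator proposed in the paper.

```python
import statistics


def three_number_summary(data):
    """Return (minimum, median, maximum) of a sample."""
    return (min(data), statistics.median(data), max(data))


# Two hypothetical samples (illustrative values, not from the paper).
# Both have minimum 0, median 50, and maximum 100.
clustered = [0, 49, 50, 50, 50, 51, 100]  # most mass near the median
spread = [0, 1, 2, 50, 98, 99, 100]       # most mass near the extremes

summary_a = three_number_summary(clustered)
summary_b = three_number_summary(spread)

# Sample SDs (n - 1 denominator) differ even though the summaries agree.
sd_a = statistics.stdev(clustered)
sd_b = statistics.stdev(spread)

# Generic rule of thumb (an assumption here, not the paper's method):
# it depends only on the summary, so it is identical for both samples.
range_rule_a = (summary_a[2] - summary_a[0]) / 4
range_rule_b = (summary_b[2] - summary_b[0]) / 4

print("summaries:", summary_a, summary_b)
print("true SDs:", round(sd_a, 2), round(sd_b, 2))
print("range/4 estimates:", range_rule_a, range_rule_b)
```

Because the two samples are indistinguishable from their summaries, no function of the minimum, median, and maximum alone can recover both SDs. This is the kind of ambiguity that a sensitivity analysis over plausible data shapes is meant to expose.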
