Survey of Big Data sizes in 2021
Abstract
The recent growth in data production is driven by multiple factors, and stakeholders from many sectors contribute to it. Although comparing the data volumes handled by different big data players is hard because official figures are lacking, this report tries to reconstruct the yearly orders of magnitude generated by some of the most important organizations by mining several online sources. For each big data source considered, the estimation starts from a meaningful measure of how many units of data that source produces; the yearly amount is then obtained by assuming a reasonable size per unit. The final result is summarized in a bubble plot.
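The estimation described in the abstract can be read as a simple product. The following is only a sketch of that arithmetic; the symbols $S_i$, $N_i$, and $s_i$ are notation introduced here for illustration and are not taken from the paper.

```latex
% Assumed notation (not from the paper):
%   N_i : number of data units source i produces per year
%   s_i : conjectured average size of one such unit
%   S_i : estimated yearly data volume of source i
S_i \approx N_i \cdot s_i
```

Because $s_i$ is a conjectured value, $S_i$ should be read as an order of magnitude rather than a precise figure.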