Power of d Choices with Simple Tabulation

Abstract

Suppose that we are to place $m$ balls into $n$ bins sequentially using the $d$-choice paradigm: for each ball we are given a choice of $d$ bins, according to $d$ hash functions $h_1,\dots,h_d$, and we place the ball in the least loaded of these bins, breaking ties arbitrarily. Our interest is in the number of balls in the fullest bin after all $m$ balls have been placed. Azar et al. [STOC'94] proved that when $m=O(n)$ and the hash functions are fully random, the maximum load is at most $\frac{\lg\lg n}{\lg d}+O(1)$ whp (i.e. with probability $1-O(n^{-\gamma})$ for any choice of $\gamma$). In this paper we suppose that $h_1,\dots,h_d$ are simple tabulation hash functions. Generalising a result by Dahlgaard et al. [SODA'16], we show that for an arbitrary constant $d\geq 2$ the maximum load is $O(\lg\lg n)$ whp, and that the expected maximum load is at most $\frac{\lg\lg n}{\lg d}+O(1)$. We further show that by using a simple tie-breaking algorithm introduced by Vöcking [J.ACM'03], the expected maximum load drops to $\frac{\lg\lg n}{d\lg\varphi_d}+O(1)$, where $\varphi_d$ is the rate of growth of the $d$-ary Fibonacci numbers. Both of these expected bounds match those of the fully random setting. The analysis by Dahlgaard et al. relies on a proof by Patrascu and Thorup [J.ACM'11] concerning the use of simple tabulation for cuckoo hashing. We need here a generalisation to $d>2$ hash functions, but the original proof is an 8-page tour de force of ad-hoc arguments that do not appear to generalise. Our main technical contribution is a shorter, simpler and more accessible proof of the result by Patrascu and Thorup, where the relevant parts generalise nicely to the analysis of $d$ choices.
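To make the setting concrete, the following is a minimal Python sketch of the $d$-choice paradigm with simple tabulation hash functions. It is an illustration only, not the authors' code. The function names `make_simple_tabulation` and `d_choice_max_load` are hypothetical, and several details are assumptions: keys are 32-bit integers split into four 8-bit characters, hash values are reduced modulo the number of bins, and ties are broken in favour of the first hash function that attains the minimum load (one arbitrary rule among many).

```python
import random


def make_simple_tabulation(num_bins, num_chars=4, char_bits=8, rng=None):
    """Build one simple tabulation hash function into range(num_bins).

    A key is viewed as num_chars characters of char_bits bits each. Each
    character position has its own table of random words, and the hash is
    the XOR of the looked-up words.
    """
    rng = rng or random.Random()
    tables = [
        [rng.getrandbits(32) for _ in range(1 << char_bits)]
        for _ in range(num_chars)
    ]
    mask = (1 << char_bits) - 1

    def h(key):
        value = 0
        for i in range(num_chars):
            char = (key >> (i * char_bits)) & mask
            value ^= tables[i][char]
        # Assumption: reduce modulo num_bins to land in a valid bin index.
        return value % num_bins

    return h


def d_choice_max_load(keys, num_bins, d, seed=0):
    """Place each key in the least loaded of its d candidate bins.

    Returns the maximum load and the full list of bin loads.
    """
    rng = random.Random(seed)
    hashes = [make_simple_tabulation(num_bins, rng=rng) for _ in range(d)]
    loads = [0] * num_bins
    for key in keys:
        candidates = [h(key) for h in hashes]
        # min() returns the first minimum, so ties go to the earliest hash
        # function; this is one arbitrary tie-breaking rule.
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return max(loads), loads


if __name__ == "__main__":
    n = 1 << 14
    keys = list(range(n))  # m = n balls
    for d in (1, 2, 3):
        max_load, _ = d_choice_max_load(keys, n, d, seed=42)
        print(f"d={d}: maximum load {max_load}")
```

Running the script prints the observed maximum load for $d=1,2,3$ with $m=n$ balls. A single run only illustrates the process; it does not verify the asymptotic bounds stated in the abstract.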
