Optimal Document Exchange and New Codes for Insertions and Deletions

Abstract

We give the first communication-optimal document exchange protocol. For any $n$ and $k < n$, our randomized scheme takes any $n$-bit file $F$ and computes a $\Theta(k \log \frac{n}{k})$-bit summary from which one can reconstruct $F$, with high probability, given a related file $F'$ with edit distance $ED(F,F') \leq k$. The size of our summary is information-theoretically order optimal for all values of $k$, giving a randomized solution to a longstanding open question of [Orlitsky; FOCS'91]. It is also the first non-trivial solution for the interesting setting where a small constant fraction of symbols have been edited, producing an optimal summary of size $O(H(\delta) n)$ for $k = \delta n$. This concludes a long series of better-and-better protocols which produce larger summaries for sub-linear values of $k$ and sub-polynomial failure probabilities. In particular, the recent breakthrough of [Belazzougui, Zhang; FOCS'16] assumes that $k < n^{\epsilon}$, produces a summary of size $O(k \log^2 k + k \log n)$, and succeeds with probability $1 - (k \log n)^{-O(1)}$. We also give an efficient derandomized document exchange protocol with summary size $O(k \log^2 \frac{n}{k})$. This improves, for any $k$, over a deterministic document exchange protocol by Belazzougui with summary size $O(k^2 + k \log^2 n)$. Our deterministic document exchange directly provides new efficient systematic error correcting codes for insertions and deletions. These (binary) codes correct any $\delta$ fraction of adversarial insertions/deletions while having a rate of $1 - O(\delta \log^2 \frac{1}{\delta})$, and improve over the codes of Guruswami and Li and of Haeupler, Shahrasbi and Vitercik, which have rate $1 - \Theta(\sqrt{\delta} \log^{O(1)} \frac{1}{\epsilon})$.
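To get a feel for how the summary sizes quoted above compare, the following minimal sketch evaluates only their leading terms for a given $n$ and $k$. It is an illustration, not part of the paper: the function names and dictionary keys are hypothetical, all hidden constants in the $O(\cdot)$ and $\Theta(\cdot)$ bounds are set to 1, and logarithms are taken base 2, so the numbers indicate growth rates rather than actual summary lengths.

```python
import math


def binary_entropy(delta):
    """Binary entropy H(delta) in bits, for 0 < delta < 1."""
    return -delta * math.log2(delta) - (1 - delta) * math.log2(1 - delta)


def summary_size_terms(n, k):
    """Leading terms of the summary sizes named in the abstract.

    Assumption: every hidden constant is taken to be 1 and logs are base 2.
    Requires 1 <= k < n.
    """
    if not 1 <= k < n:
        raise ValueError("need 1 <= k < n")
    log_n = math.log2(n)
    log_k = math.log2(k)
    log_n_over_k = math.log2(n / k)
    return {
        # Randomized protocol of this paper: k log(n/k)
        "randomized": k * log_n_over_k,
        # Same bound written as H(delta) * n with delta = k / n
        "entropy_form": binary_entropy(k / n) * n,
        # Belazzougui, Zhang (FOCS'16): k log^2 k + k log n
        "belazzougui_zhang": k * log_k ** 2 + k * log_n,
        # Derandomized protocol of this paper: k log^2(n/k)
        "derandomized": k * log_n_over_k ** 2,
        # Deterministic protocol of Belazzougui: k^2 + k log^2 n
        "belazzougui_deterministic": k ** 2 + k * log_n ** 2,
    }


if __name__ == "__main__":
    # Example parameters chosen only for illustration.
    for name, bits in summary_size_terms(n=2 ** 20, k=2 ** 10).items():
        print(f"{name:>26}: {bits:,.0f}")
```

For $k = \delta n$ with $\delta \leq 1/2$, the `randomized` and `entropy_form` entries stay within a constant factor of each other, since $H(\delta) = \Theta(\delta \log \frac{1}{\delta})$ in that range; this is why the abstract can state the same bound as $O(H(\delta) n)$.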
