Guess & Check Codes for Deletions, Insertions, and Synchronization

Abstract

We consider the problem of constructing codes that can correct δ deletions occurring in an arbitrary binary string of length n bits. Varshamov-Tenengolts (VT) codes, dating back to 1965, are zero-error single-deletion-correcting (δ = 1) codes with asymptotically optimal redundancy. Finding similar codes for δ ≥ 2 deletions remains an open problem. In this work, we relax the standard zero-error (i.e., worst-case) decoding requirement by assuming that the positions of the δ deletions (or insertions) are independent of the codeword. Our contribution is a new family of explicit codes, which we call Guess & Check (GC) codes, that can correct, with high probability, up to a constant number δ of deletions (or insertions). GC codes are systematic and have deterministic polynomial-time encoding and decoding algorithms. We also describe the application of GC codes to file synchronization.
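
For readers unfamiliar with the VT codes mentioned above, the following is a minimal Python sketch of the classical construction, in which a length-n binary string x belongs to VT_a(n) when the weighted sum of i·x_i over positions i = 1..n is congruent to a modulo n + 1. It illustrates only the single-deletion baseline, not the GC codes of this paper. The function names are hypothetical, and the decoder is a brute-force search over all single-bit insertions chosen for clarity; it is an assumption of this sketch and not the efficient decoder from the literature.

```python
import itertools


def vt_syndrome(bits):
    """Weighted checksum sum(i * x_i) mod (n + 1), with positions 1-indexed."""
    n = len(bits)
    return sum(i * b for i, b in enumerate(bits, start=1)) % (n + 1)


def vt_codewords(n, a=0):
    """Enumerate VT_a(n) by exhaustive search (only practical for small n)."""
    return [x for x in itertools.product((0, 1), repeat=n) if vt_syndrome(x) == a]


def vt_decode_single_deletion(received, n, a=0):
    """Brute-force decoding sketch (hypothetical helper).

    Tries every way of inserting one bit into `received` and keeps the
    candidates whose syndrome equals a. Because VT_a(n) corrects a single
    deletion, the returned set contains exactly one codeword whenever
    `received` is a codeword of VT_a(n) with one bit deleted.
    """
    received = tuple(received)
    if len(received) != n - 1:
        raise ValueError("received word must have length n - 1")
    candidates = set()
    for pos in range(n):
        for bit in (0, 1):
            cand = received[:pos] + (bit,) + received[pos:]
            if vt_syndrome(cand) == a:
                candidates.add(cand)
    return candidates


# (1, 0, 0, 1) is in VT_0(4) because 1*1 + 4*1 = 5 is divisible by 5.
# Deleting one of its middle zeros leaves (1, 0, 1).
print(vt_decode_single_deletion((1, 0, 1), n=4))  # {(1, 0, 0, 1)}
```

The search examines 2n candidate insertions, so it is meant only to make the syndrome check concrete for the δ = 1 case.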
