Logarithmically larger deletion codes of all distances
Abstract
The deletion distance between two binary words $u,v \in \{0,1\}^n$ is the smallest $k$ such that $u$ and $v$ share a common subsequence of length $n-k$. A set $C$ of binary words of length $n$ is called a $k$-deletion code if every pair of distinct words in $C$ has deletion distance greater than $k$. In 1965, Levenshtein initiated the study of deletion codes by showing that, for $k \ge 1$ fixed and $n$ going to infinity, a $k$-deletion code $C \subseteq \{0,1\}^n$ of maximum size satisfies $\Omega_k(2^n/n^{2k}) \le |C| \le O_k(2^n/n^k)$. We make the first asymptotic improvement to these bounds by showing that there exist $k$-deletion codes with size at least $\Omega_k(2^n \log n/n^{2k})$. Our proof is inspired by Jiang and Vardy's improvement to the classical Gilbert--Varshamov bounds. We also establish several related results on the number of longest common subsequences and shortest common supersequences of a pair of words with given length and deletion distance.
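As an illustration of the definitions above only, and not of the paper's construction, the deletion distance of two equal-length words can be computed as $n$ minus the length of a longest common subsequence. The following is a minimal sketch using the textbook dynamic program; the function names `lcs_length`, `deletion_distance`, and `is_k_deletion_code` are hypothetical and chosen here for illustration.

```python
from itertools import combinations


def lcs_length(u: str, v: str) -> int:
    """Length of a longest common subsequence of u and v (textbook DP)."""
    prev = [0] * (len(v) + 1)  # DP row for the prefix of u processed so far
    for a in u:
        curr = [0]
        for j, b in enumerate(v, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def deletion_distance(u: str, v: str) -> int:
    """Smallest k such that u and v share a common subsequence of length n - k."""
    if len(u) != len(v):
        raise ValueError("words must have the same length n")
    return len(u) - lcs_length(u, v)


def is_k_deletion_code(words: list[str], k: int) -> bool:
    """True if every pair of distinct words has deletion distance greater than k.

    Assumes the entries of `words` are pairwise distinct.
    """
    return all(deletion_distance(u, v) > k for u, v in combinations(words, 2))
```

For example, `deletion_distance("0101", "1010")` is 1, since the two words share the subsequence `010`, so the pair `{"0101", "1010"}` is not a 1-deletion code. This brute-force check takes quadratic time per pair and is only practical for small $n$.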