On the Maximum Number of Non-Confusable Strings Evolving Under Short Tandem Duplications
Abstract
The set of all q-ary strings that do not contain repeated substrings of length ≤ 3 (i.e., that do not contain substrings of the form aa, abab, and abcabc) constitutes a code correcting an arbitrary number of tandem-duplication mutations of length ≤ 3. In other words, any two such strings are non-confusable in the sense that they cannot produce the same string while evolving under tandem duplications of length ≤ 3. We demonstrate that this code is asymptotically optimal in terms of rate, meaning that it represents the largest set of non-confusable strings up to subexponential factors. This result settles the zero-error capacity problem for the last remaining case of tandem-duplication channels satisfying the "root-uniqueness" property.
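The code described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names (`tandem_duplicate`, `has_short_tandem_repeat`, `count_repeat_free`) and the brute-force enumeration are assumptions made here purely for illustration, and the enumeration is only practical for small q and n.

```python
from itertools import product


def tandem_duplicate(s, i, k):
    """Apply one tandem duplication of length k at position i (hypothetical helper).

    The block s[i:i+k] is copied and inserted right after itself.
    """
    return s[:i + k] + s[i:i + k] + s[i + k:]


def has_short_tandem_repeat(s, max_len=3):
    """Return True if s contains a substring uu with 1 <= len(u) <= max_len."""
    for k in range(1, max_len + 1):
        for i in range(len(s) - 2 * k + 1):
            if s[i:i + k] == s[i + k:i + 2 * k]:
                return True
    return False


def count_repeat_free(q, n, max_len=3):
    """Count q-ary strings of length n with no tandem repeat of length <= max_len.

    Brute force over all q**n strings (illustrative only).
    """
    return sum(
        1
        for w in product(range(q), repeat=n)
        if not has_short_tandem_repeat(w, max_len)
    )


if __name__ == "__main__":
    # Duplicating the whole string "abc" yields the forbidden pattern abcabc.
    print(tandem_duplicate("abc", 0, 3))
    # Number of ternary codewords of each small length.
    for n in range(1, 7):
        print(n, count_repeat_free(3, n))
```

Under these assumptions, the codewords of length n are exactly the strings for which `has_short_tandem_repeat` returns `False`, and the rate discussed in the abstract concerns how quickly `count_repeat_free(q, n)` grows with n.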