Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces

Abstract

In the standard trace reconstruction problem, the goal is to exactly reconstruct an unknown source string x ∈ {0,1}^n from independent "traces", which are copies of x that have been corrupted by a δ-deletion channel that independently deletes each bit of x with probability δ and concatenates the surviving bits. We study the approximate trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of x rather than an exact reconstruction. We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string x ∈ {0,1}^n from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate 0 < δ < 1 (which may depend on n), for almost every source string x ∈ {0,1}^n, given any number M ≤ Θ(1/δ) of traces from Del_δ(x), the algorithm constructs a hypothesis string x̂ that has edit distance at most n · (δM)^{Ω(M)} from x. We also prove a near-matching information-theoretic lower bound showing that given M ≤ Θ(1/δ) traces from Del_δ(x) for a random n-bit string x, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is n · (δM)^{O(M)}.
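
To make the setup concrete, the following is a minimal Python sketch of the δ-deletion channel and of the edit distance used to score a hypothesis string. It is an illustration only and not the reconstruction algorithm of the paper. The function names (`deletion_channel`, `edit_distance`, `is_subsequence`) are hypothetical, and the sketch assumes that "edit distance" means the standard Levenshtein distance (insertions, deletions, and substitutions).

```python
import random


def deletion_channel(x, delta, rng):
    """Return one trace of x: each bit is deleted independently with
    probability delta, and the surviving bits are concatenated."""
    return "".join(bit for bit in x if rng.random() >= delta)


def edit_distance(a, b):
    """Levenshtein distance between strings a and b (assumed metric),
    computed with the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def is_subsequence(t, x):
    """Check that t can be obtained from x by deleting characters."""
    it = iter(x)
    return all(ch in it for ch in t)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed, chosen arbitrarily for the demo
    n, delta, M = 1000, 0.05, 4
    x = "".join(rng.choice("01") for _ in range(n))  # random source string
    traces = [deletion_channel(x, delta, rng) for _ in range(M)]
    for t in traces:
        print(len(t), edit_distance(x, t))
```

Every trace is a subsequence of x, so a single trace used directly as the hypothesis already has edit distance n minus its length, which is about δn in expectation. The bounds in the abstract describe how much further that error can be driven down as the number of traces M grows.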
