Making it to First: The Random Access Problem in DNA Storage
Abstract
In this paper, we study the Random Access Problem in DNA storage, which addresses the challenge of retrieving a specific information strand from a DNA-based storage system. In this framework, the data is represented by $k$ information strands, which are encoded into $n$ strands using a linear code. Each sequencing read then returns one encoded strand chosen uniformly at random. The goal under this paradigm is to design codes that minimize the expected number of reads required to recover an arbitrary information strand. We fully solve the case $k=2$, showing that the best possible code attains a random access expectation of $1+\frac{2}{\sqrt{2}+1} \approx 0.914 \cdot 2$ for $q$ large enough. Moreover, we generalize a construction from [GMZ24], originally specific to $k=3$, to any value of $k$. Our construction uses $B_{k-1}$ sequences over $\mathbb{Z}_{q-1}$, which always exist over large finite fields. We show that for every $k \geq 4$, this generalized construction outperforms all previous constructions in terms of reducing the random access expectation.
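To make the quantity being minimized concrete, the following is a minimal sketch that computes the random access expectation exactly for toy codes. It assumes the binary field (the paper's results need a large field size), represents each encoded strand as a column of the generator matrix stored as an integer bitmask, and treats an information strand as recovered once its unit vector lies in the span of the columns read so far. The names `span_add` and `expected_reads` are hypothetical and not taken from the paper.

```python
from fractions import Fraction
from functools import lru_cache


def span_add(space, v):
    """Span over GF(2) of the subspace `space` and the vector `v` (int bitmasks)."""
    return frozenset(space | {s ^ v for s in space})


def expected_reads(columns, target):
    """Exact expected number of uniform reads until `target` is in the span of the reads.

    `columns` lists the n encoded strands (repeats allowed); `target` is the
    unit vector of the information strand we want to recover.
    """
    n = len(columns)

    @lru_cache(maxsize=None)
    def expect(space):
        if target in space:
            return Fraction(0)
        # Reads that fall outside the current span enlarge it; the rest are wasted.
        outside = [c for c in columns if c not in space]
        if not outside:
            raise ValueError("target is not recoverable from this code")
        total = sum((expect(span_add(space, c)) for c in outside), Fraction(0))
        # Solves E = 1 + (wasted/n) * E + (1/n) * sum over useful reads.
        return (n + total) / len(outside)

    return expect(frozenset({0}))


# k = 2: the uncoded (identity) code versus a small redundant binary code.
identity = (0b01, 0b10)
redundant = (0b01, 0b01, 0b10, 0b10, 0b11)
print(expected_reads(identity, 0b01))   # 2
print(expected_reads(redundant, 0b01))  # 23/12
```

With the identity code, recovering one of the $k=2$ strands takes 2 reads in expectation, while the redundant binary code above takes $23/12 \approx 1.917$. This only illustrates that coding can push the expectation below $k$; it is not the optimal construction from the paper.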