DNA coding and Gödel numbering

Abstract

Evolution consists of distinct stages: cosmological, biological, and linguistic. Since biology borders both the natural sciences and linguistics, we expect it to share structures and features with both forms of knowledge. Indeed, in DNA we encounter the biological "atoms", the four nucleotide molecules. At the same time, these four nucleotides may be regarded as the "letters" of an alphabet. Through the genetic code, these four "letters" generate biological "words", "phrases", and "sentences" (amino acids, proteins, cells, living organisms). In this spirit, we may equally well regard a DNA strand as a mathematical statement. Inspired by the work of Kurt Gödel, we attach to each DNA strand a Gödel number, a product of prime numbers raised to appropriate powers. Each DNA chain corresponds to a single Gödel number G and, conversely, given a Gödel number G we can recover the DNA chain it stands for. Next, considering a single DNA strand composed of N bases, we study the statistical distribution of g, the logarithm of G. We assume that the base at the m-th position is chosen at random, with equal probability for each of the four possible outcomes. The "experiment" is, to some extent, like throwing a four-faced die N times. Using the moment generating function, we obtain first the discrete and then the continuum distribution of g. Our formalism agrees very well with simulated data. Finally, we compare our formalism with actual data in order to identify traces of non-random dynamics.
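The encoding and the random-strand "experiment" described above can be illustrated with a minimal Python sketch. The abstract does not say which exponent is attached to which base, so the assignment A=1, C=2, G=3, T=4 used here is an assumption, as are all function names; the m-th base is taken as the exponent of the m-th prime, so that G = p_1^{a_1} p_2^{a_2} ... p_N^{a_N} and g = ln G = a_1 ln p_1 + ... + a_N ln p_N. Under this assumed assignment and uniform random bases, each exponent has mean 2.5 and variance 1.25, which gives the mean and variance of g computed at the end.

```python
import math
import random

# ASSUMPTION: the abstract does not give the base-to-exponent map; this one is illustrative.
EXPONENT = {"A": 1, "C": 2, "G": 3, "T": 4}
BASE = {a: b for b, a in EXPONENT.items()}


def primes():
    """Yield 2, 3, 5, 7, ... by trial division (adequate for short strands)."""
    found = []
    candidate = 2
    while True:
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate
        candidate += 1


def encode(strand):
    """Goedel number G: the m-th prime raised to the exponent of the m-th base."""
    G = 1
    for p, base in zip(primes(), strand):
        G *= p ** EXPONENT[base]
    return G


def decode(G):
    """Recover the strand from G by reading off the exponent of each successive prime."""
    bases = []
    for p in primes():
        if G == 1:
            break
        a = 0
        while G % p == 0:
            G //= p
            a += 1
        if a not in BASE:
            raise ValueError(f"exponent {a} of prime {p} does not encode a base")
        bases.append(BASE[a])
    return "".join(bases)


def log_godel(strand):
    """g = ln G, computed as a sum so that long strands do not need huge integers."""
    return sum(EXPONENT[b] * math.log(p) for p, b in zip(primes(), strand))


def exact_mean_and_variance(N):
    """Mean and variance of g for N independent, uniformly random bases."""
    logs = [math.log(p) for p, _ in zip(primes(), range(N))]
    return 2.5 * sum(logs), 1.25 * sum(x * x for x in logs)


def sample_mean(N, trials, seed=0):
    """Average g over `trials` random strands of length N (the four-faced die)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += log_godel(rng.choices("ACGT", k=N))
    return total / trials
```

For example, under the assumed assignment `encode("ACGT")` is 2^1 · 3^2 · 5^3 · 7^4 = 5402250, and `decode(5402250)` returns `"ACGT"`. Working with g as a sum of logarithms, rather than with G itself, keeps the simulation cheap even when N is large.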
