Relations between the properties of a complete rooted tree and the properties of a distribution of lengths of randomly generated strings

Abstract

Let $G$ denote a complete $m$-ary rooted tree of height $n$. In this paper we prove certain relations between the properties of $G$ and the expectation and variance of the distribution of lengths of strings generated as follows: starting from an empty string, we pick a random symbol from the alphabet $\{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ and append it to the string, and the process continues until we see $n$ instances of a specific symbol in a row. Consider the random variable that represents the length of a string generated by this process; we write $E[m,n]$ and $\mathrm{Var}[m,n]$ for its expectation and variance. Both depend on $m$ (the size of the alphabet) and $n$ (the parameter that defines the stopping criterion of the string generation process). Let $S_{m,n}$ denote the sum of the common path length over all 2-tuples of nodes of $G$, and let $T_{m,n}$ denote the total number of edges in $G$. We prove that the following relations hold for all $m, n \geq 1$: $E[m,n] = T_{m,n}$ and $\mathrm{Var}[m,n] = (m-1) \cdot S_{m,n}$. It is known that both $E[2,n]$ and $T_{2,n}$ are described by the sequence A000918 from the On-Line Encyclopedia of Integer Sequences (OEIS), and that $S_{2,n}$ is described by the OEIS sequence A286778. We demonstrate a new interpretation of A286778: this sequence describes $\mathrm{Var}[2,n]$, the variance of the number of tosses of a fair coin until we see $n$ heads in a row.
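The two relations stated in the abstract can be checked numerically for small parameters. The following Python sketch is an illustration and not the paper's proof. It rests on several assumptions that are not spelled out in the abstract: symbols are drawn uniformly and independently, a "2-tuple of nodes" means an ordered pair (including a node paired with itself), and the "common path length" of two nodes is the number of edges shared by their root-to-node paths. The function names `string_length_moments` and `tree_statistics` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import product


def string_length_moments(m, n):
    """Exact mean and variance of the string length (assumes uniform i.i.d. symbols)."""
    p = Fraction(1, m)  # probability of drawing the specific symbol
    q = 1 - p           # probability of any other symbol (resets the run)
    # e[k] = expected remaining length when the current run has length k.
    # Write e[k] = a[k] + b[k] * e[0] and sweep backwards from e[n] = 0.
    a = [Fraction(0)] * (n + 1)
    b = [Fraction(0)] * (n + 1)
    for k in range(n - 1, -1, -1):
        a[k] = 1 + p * a[k + 1]
        b[k] = p * b[k + 1] + q
    e0 = a[0] / (1 - b[0])
    e = [a[k] + b[k] * e0 for k in range(n + 1)]
    # Same trick for the second moment: s[k] = c + d * s[0].
    c = Fraction(0)
    d = Fraction(0)
    for k in range(n - 1, -1, -1):
        c = 1 + 2 * (p * e[k + 1] + q * e0) + p * c
        d = p * d + q
    s0 = c / (1 - d)
    return e0, s0 - e0 ** 2


def tree_statistics(m, n):
    """Edge count and summed common path length of a complete m-ary tree of height n."""
    # A node is identified by the tuple of child indices on its path from the root.
    nodes = [path for depth in range(n + 1)
             for path in product(range(m), repeat=depth)]

    def common_prefix(u, v):
        # Number of edges shared by the two root-to-node paths.
        k = 0
        for x, y in zip(u, v):
            if x != y:
                break
            k += 1
        return k

    edges = len(nodes) - 1
    # Assumption: sum over ordered pairs, including (v, v).
    total = sum(common_prefix(u, v) for u in nodes for v in nodes)
    return edges, total


for m in range(1, 4):
    for n in range(1, 4):
        mean, var = string_length_moments(m, n)
        edges, common = tree_statistics(m, n)
        print(m, n, mean == edges, var == (m - 1) * common)
```

Under these assumptions, both comparisons hold for every pair printed by the loop. The brute-force sum over node pairs grows quickly with $m$ and $n$, so the check is only practical for small trees.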
