Non-erasing Chomsky-Schützenberger theorem with grammar-independent alphabet

Abstract

The famous theorem by Chomsky and Schützenberger (CST) says that every context-free language L over an alphabet Σ is representable as h(D ∩ R), where D is a Dyck language over a set Ω of brackets, R is a local language and h is an alphabetic homomorphism that erases unboundedly many symbols. Berstel found that the number of erasures can be linearly limited if the grammar is in Greibach normal form; Berstel and Boasson (and later, independently, Okhotin) proved a non-erasing variant of CST for grammars in Double Greibach Normal Form. In all these CST statements, however, the size of the Dyck alphabet Ω depends on the grammar size for L. In the Stanley variant of the CST, |Ω| only depends on |Σ| and not on the grammar, but the homomorphism erases many more symbols than in the other versions of CST; also, the regular language R is strictly locally testable but not local. We prove a new version of CST which combines both features of being non-erasing and of using a grammar-independent alphabet. In our construction, |Ω| is polynomial in |Σ|, namely O(|Σ|^46), and the regular language R is strictly locally testable. Using a recent generalization of Medvedev's homomorphic characterization of regular languages, we prove that the degree in the polynomial dependence of |Ω| on |Σ| may be reduced to just 2 in the case of linear grammars in Double Greibach Normal Form.
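To make the shape of the statement h(D ∩ R) concrete, here is a minimal sketch for the textbook language {a^n b^n : n ≥ 1}. It is not the construction of the paper: the single bracket pair, the local language R (given by allowed first symbols, last symbols and 2-factors) and all function names are assumptions chosen for illustration. In this toy case the homomorphism happens to erase nothing.

```python
from itertools import product

def in_dyck(word, pairs):
    """Check membership in the Dyck language over the bracket pairs given
    as a dict {opening: closing}, using a stack."""
    closers = {close: open_ for open_, close in pairs.items()}
    stack = []
    for symbol in word:
        if symbol in pairs:
            stack.append(symbol)
        elif symbol in closers:
            if not stack or stack.pop() != closers[symbol]:
                return False
        else:
            return False  # symbol outside the bracket alphabet
    return not stack

def in_local(word, first, last, factors):
    """Check membership in a local language described by its allowed
    first symbols, last symbols and length-2 factors."""
    if not word:
        return False  # this toy R excludes the empty word
    return (word[0] in first and word[-1] in last
            and all(word[i:i + 2] in factors for i in range(len(word) - 1)))

def apply_h(word, h):
    """Apply an alphabetic homomorphism given as a symbol-to-string dict."""
    return "".join(h[symbol] for symbol in word)

# Illustrative choices (assumptions, not taken from the paper):
PAIRS = {"(": ")"}            # bracket alphabet with one pair
H = {"(": "a", ")": "b"}      # homomorphism onto the terminal alphabet {a, b}
FIRST, LAST = {"("}, {")"}
FACTORS = {"((", "()", "))"}  # the factor ")(" is forbidden

def image_up_to(max_len):
    """Enumerate h(D ∩ R) restricted to bracket words of length <= max_len."""
    result = set()
    for length in range(1, max_len + 1):
        for letters in product("()", repeat=length):
            word = "".join(letters)
            if in_dyck(word, PAIRS) and in_local(word, FIRST, LAST, FACTORS):
                result.add(apply_h(word, H))
    return result

print(sorted(image_up_to(6), key=len))  # ['ab', 'aabb', 'aaabbb']
```

Forbidding the factor ")(" is what restricts the Dyck words to the nested form, so the intersection with R keeps exactly the words that map onto a^n b^n.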
