Designing text representations for existing data using the TextFormats Specification Language

Abstract

TextFormats is a software system for efficient and user-friendly creation of text format specifications, accessible from multiple programming languages (C/C++, Python, Nim) and from the Unix command line. To work with a format, a specification must first be written in the TextFormats Specification Language (TFSL). The specification defines a datatype for each part of the format. Because the syntax of datatype definitions in TFSL is based on the text representation, the system is well suited to describing existing formats. However, when creating a new text format to represent existing data, several definitions are possible, depending on the type of value and the chosen representation. This study explores the definition syntax available in TFSL for creating text representations of scalar values (e.g. strings, numeric values, booleans) and compound data structures (e.g. arrays, mappings). The results of the analysis are presented systematically, together with examples for each type of value that can be represented and usage advice.
