Recursive Semantic Anchoring in ISO 639:2023: A Structural Extension to ISO/TC 37 Frameworks
Abstract
ISO 639:2023 unifies the ISO language-code family and introduces contextual metadata, but it lacks a machine-native mechanism for handling dialectal drift and creole mixtures. We propose a formalisation of recursive semantic anchoring, attaching to every language entity $\chi$ a family of fixed-point operators $\phi_{n,m}$ that model bounded semantic drift via the relation $\phi_{n,m}(\chi) = \chi \oplus \Delta(\chi)$, where $\Delta(\chi)$ is a drift vector in a latent semantic manifold. The base anchor $\phi_{0,0}$ recovers the canonical ISO 639:2023 identity, whereas $\phi_{99,9}$ marks the maximal drift state that triggers a deterministic fallback. Using category theory, we treat the operators $\phi_{n,m}$ as morphisms and drift vectors as arrows in a category $\mathrm{DriftLang}$. A functor $\Phi: \mathrm{DriftLang} \to \mathrm{AnchorLang}$ maps every drifted object to its unique anchor and proves convergence. We provide an RDF/Turtle schema (BaseLanguage, DriftedLanguage, ResolvedAnchor) and worked examples, e.g., $\phi_{8,4}$ (Standard Mandarin) versus $\phi_{8,7}$ (a colloquial variant), and $\phi_{1,7}$ for Nigerian Pidgin anchored to English. Experiments with transformer models show higher accuracy in language identification and translation on noisy or code-switched input when the $\phi$-indices are used to guide fallback routing. The framework is compatible with ISO/TC 37 and provides an AI-tractable, drift-aware semantic layer for future standards.
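The fallback routing mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The class `DriftedLanguage` borrows its name from the schema listed above, but its fields, the function `resolve_anchor`, the use of ISO 639-3 codes (`cmn`, `pcm`, `eng`), the choice of `und` (undetermined) as the fallback target, and the assumption that the colloquial variant anchors to `cmn` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Index bounds taken from the abstract; everything else here is an assumption.
BASE_ANCHOR = (0, 0)   # phi_{0,0}: canonical ISO 639:2023 identity
MAX_DRIFT = (99, 9)    # phi_{99,9}: maximal drift state, triggers fallback


@dataclass(frozen=True)
class DriftedLanguage:
    """A language variety tagged with an (n, m) drift index and an anchor."""
    label: str
    index: tuple[int, int]
    anchor_code: str  # hypothetical field: code of the base language


def resolve_anchor(lang: DriftedLanguage, fallback_code: str = "und") -> str:
    """Return the language code a downstream model would be routed to."""
    n, m = lang.index
    # Drift is bounded: reject indices outside [0, 99] x [0, 9].
    if not (0 <= n <= MAX_DRIFT[0] and 0 <= m <= MAX_DRIFT[1]):
        raise ValueError(f"index {lang.index} is outside the bounded range")
    # Maximal drift: deterministic fallback instead of the usual anchor.
    if lang.index == MAX_DRIFT:
        return fallback_code
    # Any other index, including the base anchor, resolves to its anchor.
    return lang.anchor_code


# Examples named in the abstract (anchor codes are assumed, not from the paper).
mandarin = DriftedLanguage("Standard Mandarin", (8, 4), "cmn")
colloquial = DriftedLanguage("colloquial Mandarin variant", (8, 7), "cmn")
pidgin = DriftedLanguage("Nigerian Pidgin", (1, 7), "eng")
unidentified = DriftedLanguage("unidentified noisy input", MAX_DRIFT, "eng")

for lang in (mandarin, colloquial, pidgin, unidentified):
    print(lang.label, lang.index, "->", resolve_anchor(lang))
```

In this sketch the two Mandarin varieties share an anchor despite different indices, Nigerian Pidgin resolves to English as the abstract states, and only the maximal index is routed to the fallback code.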