Reinterpreting 'the Company a Word Keeps': Towards Explainable and Ontologically Grounded Language Models
Abstract
We argue that the relative success of large language models (LLMs) is not a reflection on the symbolic vs. subsymbolic debate but a reflection on employing a successful bottom-up strategy of a reverse engineering of language at scale. However, and due to their subsymbolic nature whatever knowledge these systems acquire about language will always be buried in millions of weights none of which is meaningful on its own, rendering such systems utterly unexplainable. Furthermore, and due to their stochastic nature, LLMs will often fail in making the correct inferences in various linguistic contexts that require reasoning in intensional, temporal, or modal contexts. To remedy these shortcomings we suggest employing the same successful bottom-up strategy employed in LLMs but in a symbolic setting, resulting in explainable, language-agnostic, and ontologically grounded language models.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.