On the Possibility of Rewarding Structure Learning Agents: Mutual Information on Linguistic Random Sets
Abstract
We present a first attempt at a theoretical and empirical approach to designing the reward that a natural language environment provides to a structure learning agent. To this end, we revisit the information theory of unsupervised induction of phrase-structure grammars to characterize the behavior of simulated actions, modeled as set-valued random variables (random sets of linguistic samples) that constitute semantic structures. Our results provide empirical evidence that simulated semantic structures (Open Information Extraction triplets) can be distinguished from randomly constructed ones by observing the Mutual Information among their constituents. This suggests that structure learning agents could be rewarded without relying on pretrained structural analyzers (oracle actors/experts).
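To make the central idea concrete, the following is a minimal Python sketch, not the paper's method: it uses a simple plug-in estimator of Mutual Information over single-valued constituents rather than the set-valued random variables described above. The toy triplets, the function name `mutual_information`, and the shuffling baseline are all illustrative assumptions.

```python
from collections import Counter
from math import log2
import random


def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


# Hypothetical toy triplets (subject, predicate, object); not data from the paper.
triplets = [
    ("cat", "chases", "mouse"),
    ("dog", "chases", "cat"),
    ("cat", "eats", "fish"),
    ("dog", "eats", "bone"),
    ("bird", "eats", "seed"),
    ("bird", "sings", "song"),
] * 5

# Pair the subject and object constituents of each triplet.
subj_obj = [(s, o) for s, _, o in triplets]

# Random baseline (an assumption of this sketch): shuffle the objects so that
# the subject-object pairing no longer reflects any structure.
rng = random.Random(0)
objects = [o for _, o in subj_obj]
rng.shuffle(objects)
shuffled = [(s, o) for (s, _), o in zip(subj_obj, objects)]

mi_structured = mutual_information(subj_obj)
mi_shuffled = mutual_information(shuffled)
print(f"structured: {mi_structured:.3f} bits, shuffled: {mi_shuffled:.3f} bits")
```

In this toy data each object determines its subject, so the structured estimate equals the entropy of the subject distribution, and the shuffled pairing cannot exceed it. A gap of this kind is the sort of signal the abstract proposes to use as a reward, although the paper's own estimator and data may differ substantially.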