From Transformer to Biology: A Hierarchical Model for Attention in Complex Problem-Solving
Abstract
Attention is fundamental to cognition, yet understanding it in tasks that approach real-world complexity remains a challenge. Here, we approached this problem by modeling the gaze patterns of monkeys playing Pac-Man. We first show that a transformer network trained to reproduce their gameplay developed internal attention patterns closely matching the monkeys' eye movements. By dissecting the network's attention, we revealed a hierarchical structure comprising two components: a value-based layer encoding fixed object salience, coupled with a dynamic interaction layer tracking relational information between game elements. We further developed a condensed model in which reward-driven attention serves as a gain modulator and is integrated with spatial attention maps; this model predicts attention as well as the transformer does. Together, our study pioneers the use of AI architectures as analytical tools and bridges mechanistic interpretability with cognitive neuroscience to yield novel, testable insights into how the brain coordinates reward, spatial cognition, and attention in complex environments.