Beyond the Click: A Framework for Inferring Cognitive Traces in Search

Abstract

User simulators are essential for evaluating search systems, but they primarily reproduce user actions without modeling the underlying thought process. Large-scale interaction logs record what users do, but not what they might be thinking or feeling, such as confusion or satisfaction. We present a framework for inferring cognitive traces from behavioral logs. Our method uses a multi-agent LLM system grounded in Information Foraging Theory (IFT) and validated by human experts. We annotate three public datasets (AOL, Stack Overflow, and MovieLens), producing over 530,000 cognitive labels across 50,000 sessions. A cross-dataset evaluation with a shuffled-label control reveals that cognitive labels provide the strongest signal where behavioral features are weakest: on MovieLens, the cognitive model improves F1 by up to 6.6% over the behavioral baseline and 1.8% above the shuffled control, while on AOL, where click patterns are highly predictive, improvements are near zero. We release the annotation collection on HuggingFace, an open-source annotation tool, and all experimental code to support future work on cognitively aware user simulation.
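
The shuffled-label control mentioned above can be illustrated with a minimal sketch. It assumes the control works by permuting cognitive labels across rows, so that the label distribution is preserved while the link between each label and its session is broken. The abstract does not specify the exact procedure, so the function names, the example label strings, and the seed parameter below are hypothetical and not taken from the released code.

```python
import random
from collections import Counter


def shuffled_control(labels, seed=0):
    """Return a randomly permuted copy of `labels` (hypothetical helper).

    The multiset of labels is unchanged; only their alignment with
    rows/sessions is destroyed. The input list is not modified.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled


def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class, computed as 2TP / (2TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Illustrative label strings only; the real label set is defined by the paper.
cognitive_labels = ["confusion", "satisfaction", "confusion",
                    "satisfaction", "satisfaction"]
control_labels = shuffled_control(cognitive_labels, seed=1)

# Same label counts, different alignment with sessions.
assert Counter(control_labels) == Counter(cognitive_labels)
```

Under this reading, a model trained on the true labels is compared against one trained on `control_labels`. Any F1 gain that survives the shuffle can be attributed to the extra feature capacity rather than to the cognitive content of the labels.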
