Empathy Modeling in Active Inference Agents for Perspective-Taking and Alignment

Abstract

Artificial agents capable of understanding and aligning with others' intentions are essential for safe and socially robust artificial intelligence. We introduce a computational framework for empathy in active inference agents, grounded in explicit perspective-taking via self-other model transformation. We instantiate this framework in a multi-agent Iterated Prisoner's Dilemma and show that empathic perspective-taking induces robust cooperation without explicit communication or reward shaping. Cooperation emerges only when empathy is reciprocated, while asymmetric empathy leads to systematic exploitation. Beyond equilibrium outcomes, empathic agents exhibit synchronized behavior, rapid recovery from stochastic defections, and joint intentional dynamics resembling apology-forgiveness cycles. Near empathy symmetry, interactions display long transients and elevated variance, consistent with critical dynamics near regime boundaries. We further examine a learning-enabled variant in which agents infer opponent type via Bayesian updating. While opponent models converge rapidly, long-run cooperation remains primarily determined by the empathy parameter, indicating that cooperation is driven by empathic structure rather than learned reciprocity. Empathy functions as a structural prior over social interaction, shaping coordination stability, robustness, and temporal dynamics. The proposed framework highlights active inference as a principled foundation for socially aligned artificial agents that coordinate through internal simulation rather than behavioral mimicry.
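The learning-enabled variant mentioned above, in which agents infer opponent type via Bayesian updating, can be illustrated with a minimal sketch. The abstract does not specify the type space or the likelihood model, so the two opponent types (`empathic`, `selfish`), their cooperation probabilities, and the function names below are hypothetical choices made only for illustration; this is not the authors' implementation.

```python
# Minimal sketch of Bayesian opponent-type inference in an Iterated
# Prisoner's Dilemma. ASSUMPTION: the type names and cooperation
# probabilities are illustrative and do not come from the paper.
COOPERATE_PROB = {"empathic": 0.9, "selfish": 0.2}


def update_belief(belief, action):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalized = {}
    for opponent_type, prior in belief.items():
        p_cooperate = COOPERATE_PROB[opponent_type]
        # Likelihood of the observed action ("C" or "D") under this type.
        likelihood = p_cooperate if action == "C" else 1.0 - p_cooperate
        unnormalized[opponent_type] = prior * likelihood
    total = sum(unnormalized.values())
    return {t: v / total for t, v in unnormalized.items()}


def infer_type(actions, prior=None):
    """Fold a sequence of observed opponent actions into a posterior."""
    if prior is None:
        # Uniform prior over the hypothetical types.
        n = len(COOPERATE_PROB)
        belief = {t: 1.0 / n for t in COOPERATE_PROB}
    else:
        belief = dict(prior)
    for action in actions:
        belief = update_belief(belief, action)
    return belief


# Four cooperations and one defection observed from the opponent.
posterior = infer_type(["C", "C", "D", "C", "C"])
```

Under these assumed likelihoods, a handful of observations is enough to concentrate the posterior on one type, which is in keeping with the abstract's remark that opponent models converge rapidly. The sketch covers belief updating only; it does not model the empathy parameter that the abstract identifies as the main determinant of long-run cooperation.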
