M2HRI: An LLM-Driven Multimodal Multi-Agent Framework for Personalized Human-Robot Interaction
Abstract
Multi-robot systems hold significant promise for social environments such as homes and hospitals, yet existing systems often treat robots as functionally interchangeable. This overlooks how distinct agent identities shape user perception and how such individuality changes the coordination requirements of multi-robot interaction. To address this, we introduce M2HRI, a multimodal multi-agent framework that models each robot as an identity-bearing agent through personality and long-term memory, together with a contextualized coordination mechanism that regulates agent participation. In a controlled user study (n = 105) in a multi-agent human-robot interaction (HRI) scenario, we found that most personality contrasts were distinguishable and consistently expressed. Long-term memory improved preference awareness and interaction naturalness, while contextualized coordination improved conversational flow, response appropriateness, and overlap avoidance. Together, these findings show that agent individuality and contextualized participation coordination play complementary roles in supporting coherent and socially appropriate multi-agent HRI. The project website is available at https://project-m2hri.github.io/.