Searching for Synergy in Shared Workspace Human-AI Collaboration
Abstract
Automated AI agents are increasingly capable, yet many scientific and professional tasks require human judgment and contextual expertise. We use simulated shared-workspace human-AI teams as a controlled testbed for studying how collaboration structure shapes team behavior. Using the Collaborative Gym environment with tasks from DiscoveryBench, we vary team compositions and collaboration structures across 1,482 sessions. We find that adding collaborators can lower performance when coordination structure is absent. We then evaluate collaboration scaffolding that combines shared group memory with simulated human-in-the-loop (HITL) gates, where selected actions require approval from a designated simulated participant. This scaffolding improves performance, most clearly in three-person teams, and is accompanied by clearer responsibility signals and stronger routing of expertise to team actions. Overall, our results suggest that coordination structure is central to whether available capability improves team outcomes.