Presentation Information

[4Yin-A-36] Autonomous Emergence of Communication Strategies in World Model-based Multi-Agent Reinforcement Learning

〇Yuya Kamezawa¹ (1. The University of Electro-Communications)

Keywords: World Model, Deep Reinforcement Learning, Communication

Linguistic communication is essential for multi-agent coordination in physical environments. We propose a method in which agents autonomously acquire coordination strategies by integrating physical actions and speech using a world model (DreamerV3). Specifically, discrete speech tokens are integrated into the action space as "active interventions," allowing the model to learn "social causality" (that speech alters others' behavior and contributes to future rewards) through imagination. We designed a "Treasure Chest" environment in which two physically separated agents must communicate observed hints to open the correct boxes. The proposed method achieved a 95% success rate, whereas the baseline without communication remained at 33%. Analysis confirmed the emergence of "symbol grounding," which links visual information to speech without explicit supervision. Furthermore, agents acquired a strategy of compensating for imperfect speech with physical exploration. These results demonstrate that world models can robustly realize cooperative behavior by using physical exploration and communication in a complementary manner.
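
As an illustration of what integrating discrete speech tokens into the action space could look like, the following is a minimal sketch, not the presented implementation. It assumes a flat joint encoding in which every (physical action, speech token) pair maps to one discrete index. The abstract does not state whether the action space is flat or factored, and the sizes N_PHYSICAL and N_TOKENS as well as the helper names encode_joint_action and decode_joint_action are hypothetical.

```python
from typing import Tuple

# Hypothetical sizes, chosen only for illustration.
N_PHYSICAL = 5  # e.g. no-op plus four movement directions (assumption)
N_TOKENS = 4    # size of the discrete speech vocabulary (assumption)
N_JOINT = N_PHYSICAL * N_TOKENS  # size of the flat joint action space


def encode_joint_action(physical: int, token: int) -> int:
    """Map a (physical action, speech token) pair to one discrete index."""
    if not 0 <= physical < N_PHYSICAL:
        raise ValueError(f"physical action out of range: {physical}")
    if not 0 <= token < N_TOKENS:
        raise ValueError(f"speech token out of range: {token}")
    return physical * N_TOKENS + token


def decode_joint_action(joint: int) -> Tuple[int, int]:
    """Recover the (physical action, speech token) pair from a joint index."""
    if not 0 <= joint < N_JOINT:
        raise ValueError(f"joint action out of range: {joint}")
    return divmod(joint, N_TOKENS)
```

With an encoding of this kind, a policy over N_JOINT discrete actions selects a movement and an utterance in the same step, so speech is treated like any other action whose consequences the world model can predict. A factored alternative, with separate heads for movement and speech, would be equally compatible with the description in the abstract.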