Presentation Information

[4Yin-A-36] Autonomous Emergence of Communication Strategies in World Model-based Multi-Agent Reinforcement Learning

〇Yuya Kamezawa1 (1. The University of Electro-Communications)

Keywords: World Model, Deep Reinforcement Learning, Communication

Linguistic communication is essential for multi-agent coordination in physical environments. We propose a method in which agents autonomously acquire coordination strategies by integrating physical actions and speech within a world model (DreamerV3). Specifically, discrete speech tokens are incorporated into the action space as "active interventions," allowing the model to learn "social causality" through imagination, that is, to learn that speech alters other agents' behavior and contributes to future rewards. We designed a "Treasure Chest" environment in which two physically separated agents must communicate the hints they observe in order to open the correct boxes. The proposed method achieved a 95% success rate, whereas the baseline without communication remained at 33%. Analysis confirmed the emergence of "symbol grounding": agents linked visual information to speech without explicit supervision. Furthermore, agents acquired a strategy of compensating for imperfect speech with physical exploration. These results demonstrate that world models can robustly realize cooperative behavior by using physical exploration and communication in a complementary way.
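
The abstract does not specify how speech tokens are combined with physical actions, so the following is only a minimal sketch of one possible arrangement: each (physical action, speech token) pair is mapped to a single index in a flat discrete action space, so that a policy with one categorical output can choose movement and utterance jointly. The sizes `NUM_PHYSICAL_ACTIONS` and `NUM_SPEECH_TOKENS` and the helper names `encode_action` and `decode_action` are hypothetical and are not taken from the presented work.

```python
from typing import Tuple

# Hypothetical sizes chosen for illustration only; the abstract gives no values.
NUM_PHYSICAL_ACTIONS = 5  # e.g. no-op plus four movement directions
NUM_SPEECH_TOKENS = 4     # size of the discrete speech vocabulary


def encode_action(physical: int, speech: int) -> int:
    """Map a (physical action, speech token) pair to one flat action index."""
    if not 0 <= physical < NUM_PHYSICAL_ACTIONS:
        raise ValueError(f"physical action out of range: {physical}")
    if not 0 <= speech < NUM_SPEECH_TOKENS:
        raise ValueError(f"speech token out of range: {speech}")
    return physical * NUM_SPEECH_TOKENS + speech


def decode_action(index: int) -> Tuple[int, int]:
    """Recover the (physical action, speech token) pair from a flat index."""
    if not 0 <= index < NUM_PHYSICAL_ACTIONS * NUM_SPEECH_TOKENS:
        raise ValueError(f"joint action index out of range: {index}")
    # divmod inverts the row-major encoding used in encode_action.
    return divmod(index, NUM_SPEECH_TOKENS)
```

Under this assumed encoding the joint space has `NUM_PHYSICAL_ACTIONS * NUM_SPEECH_TOKENS` entries. A factored alternative, with one categorical head for movement and another for speech, would scale better for larger vocabularies; the abstract does not say which design was used.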
