[4D1-OS-1-03] LYCO: Learning to Yield in Congested Multi-Agent Path Finding

〇Hiroya Makino1, Koji Hasebe2, Koji Noshiro2, Seigo Ito1 (1. Toyota Central R&D Labs., Inc., 2. University of Tsukuba)

Keywords:

Multi-Agent Path Finding, Multi-Agent Reinforcement Learning, Imitation Learning, Congested Environment

Owing to its flexibility and scalability, decentralized multi-agent reinforcement learning has been widely applied to multi-agent path finding (MAPF). However, existing methods often suffer a drop in success rate as agent density increases; for example, agents that reach their goals early may remain there and block others. To maintain a high success rate as the number of agents grows, we propose Learning to Yield in Congested MAPF (LYCO). LYCO leverages imitation learning with Priority Inheritance with Backtracking (PIBT), a lightweight and scalable distributed algorithm, as the expert. In addition, LYCO's observation and reward design encourages local yielding behaviors. Numerical experiments show that LYCO achieves a higher success rate than existing methods under high-density conditions.
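PIBT, used here as the imitation-learning expert, resolves local conflicts by planning agents in priority order: when a high-priority agent wants an occupied cell, the occupant inherits that priority and is pushed to move first, backtracking if it is stuck. The following is a minimal one-step sketch on a 4-connected grid, with hypothetical helper names; it is not the authors' implementation, only an illustration of the priority-inheritance mechanism:

```python
# Minimal one-step PIBT-style planner on a size x size 4-connected grid.
# Hypothetical sketch for illustration; not the authors' code.
from typing import Dict, List, Tuple

Pos = Tuple[int, int]

def neighbors(v: Pos, size: int) -> List[Pos]:
    x, y = v
    cand = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in cand if 0 <= a < size and 0 <= b < size]

def pibt_step(pos: Dict[int, Pos], goal: Dict[int, Pos],
              prio: Dict[int, float], size: int) -> Dict[int, Pos]:
    """Plan one synchronized move for all agents, highest priority first."""
    nxt: Dict[int, Pos] = {}                  # committed next positions
    occupied_now = {v: a for a, v in pos.items()}

    def dist(u: Pos, g: Pos) -> int:
        return abs(u[0] - g[0]) + abs(u[1] - g[1])

    def plan(a: int) -> bool:
        # Candidates: stay or step, sorted by Manhattan distance to a's goal.
        cands = sorted([pos[a]] + neighbors(pos[a], size),
                       key=lambda v: dist(v, goal[a]))
        for v in cands:
            if v in nxt.values():             # vertex already reserved
                continue
            b = occupied_now.get(v)
            # Forbid swap conflicts: b already committed to enter a's cell.
            if b is not None and b != a and nxt.get(b) == pos[a]:
                continue
            nxt[a] = v
            if b is not None and b != a and b not in nxt:
                # Priority inheritance: the occupant must plan (move) first.
                if not plan(b):
                    del nxt[a]                # backtrack: occupant is stuck
                    continue
            return True
        nxt[a] = pos[a]                       # no valid move: stay put
        return False

    for a in sorted(pos, key=lambda a: -prio[a]):
        if a not in nxt:
            plan(a)
    return nxt
```

In a small example, a low-priority agent resting at its goal in a corridor is pushed aside when a high-priority agent needs its cell, which is exactly the local yielding behavior the abstract describes LYCO learning to imitate.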