Presentation Information
[2G6-OS-47c-01] Proposal of a Lightweight Probabilistic Model for Multimodal Robot: Extension of SARNN using Mixture Density Networks and Energy Score
〇SACHIYA FUJITA (1), Hideyuki Ichiwara (1), Shigeki Sugano (1), Tetsuya Ogata (1) (1. Waseda University)
Keywords:
Motion Generation, Imitation Learning, Multimodality, Learning from small datasets
Practical robotic motion generation requires low inference cost and high data efficiency, and it must also handle multimodality, that is, the existence of multiple valid actions. We propose GMMSARNN, which extends the data-efficient Spatial Attention Point Network (SARNN) to solve multimodal tasks stably. The model uses a Mixture Density Network (MDN) to represent the distribution over trajectories and replaces the Negative Log-Likelihood loss with the Energy Score (multivariate CRPS), achieving both multimodal representation and stable training. We evaluated the model with an 8-DOF manipulator on an obstacle avoidance task and a moving-object grasping task. The obstacle avoidance task confirmed that the model handles multimodality and trains stably. In the grasping task, GMMSARNN outperformed existing methods such as Diffusion Policy by 15–25% in success rate under unseen conditions, demonstrating superior generalization.
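To make the loss concrete, the following is a minimal NumPy sketch of a sample-based Energy Score, ES(P, y) = E||X - y|| - 0.5 E||X - X'||, evaluated for a diagonal Gaussian mixture of the kind an MDN head might output. It is an illustration under stated assumptions, not the authors' implementation: the function names `sample_gmm` and `energy_score`, the two-dimensional toy targets, and all numeric values are hypothetical. The sketch only evaluates the score and does not address how gradients would flow through sampling during training.

```python
import numpy as np


def sample_gmm(weights, means, stds, n_samples, rng):
    """Draw samples from a diagonal-covariance Gaussian mixture.

    weights: (K,) mixing coefficients summing to 1
    means, stds: (K, D) per-component mean and standard deviation
    """
    comp = rng.choice(len(weights), size=n_samples, p=weights)
    noise = rng.standard_normal((n_samples, means.shape[1]))
    return means[comp] + stds[comp] * noise


def energy_score(samples, target):
    """Monte Carlo estimate of ES = E||X - y|| - 0.5 * E||X - X'||."""
    m = samples.shape[0]
    # Average distance from predicted samples to the observed target.
    term_target = np.linalg.norm(samples - target, axis=1).mean()
    # Average distance between distinct sample pairs; the diagonal is zero,
    # so dividing the full sum by m * (m - 1) averages over i != j only.
    pairwise = np.linalg.norm(samples[:, None, :] - samples[None, :, :], axis=2)
    term_spread = pairwise.sum() / (m * (m - 1))
    return term_target - 0.5 * term_spread


# Hypothetical toy setup: an obstacle can be passed on the left or the right.
rng = np.random.default_rng(0)
target = np.array([-1.0, 0.0])  # the demonstrated action went left

# Two-mode prediction covering both valid actions.
bimodal = sample_gmm(
    np.array([0.5, 0.5]),
    np.array([[-1.0, 0.0], [1.0, 0.0]]),
    np.full((2, 2), 0.1),
    512,
    rng,
)
# Single-mode prediction collapsed onto the average of the two actions.
collapsed = sample_gmm(
    np.array([1.0]),
    np.array([[0.0, 0.0]]),
    np.full((1, 2), 0.1),
    512,
    rng,
)

es_bimodal = energy_score(bimodal, target)
es_collapsed = energy_score(collapsed, target)
```

In this toy case the two-mode prediction receives the lower (better) score, because the score rewards samples that land near the observed action while the pairwise term credits the spread of the predictive distribution. Unlike the Negative Log-Likelihood, the score is computed from sample distances rather than density values.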
