Presentation Information
[5Yin-A-44]Goal-oriented Exploration in Continuous Control Tasks
Shoma Ogawa1, 〇Takuma Kuroda2, Takumi Kamiya2, Tatsuji Takahasi2, Yu Kohno2 (1. Graduate School of Tokyo Denki University, 2. Tokyo Denki University)
Keywords:
Reinforcement Learning, Machine Learning, Cognitive Science, Continuous Control
In recent years, deep reinforcement learning has achieved strong performance in continuous control tasks. In particular, stochastic policy optimization methods such as Soft Actor-Critic have been widely adopted as a fundamental approach in many continuous control benchmarks due to their learning stability and broad applicability. However, when available trials or computational resources are limited, balancing exploration efficiency and learning stability remains a challenge. Under such constraints, exploration strategies that select actions satisfying predefined criteria may offer practical advantages. In decision-making theory under bounded rationality, this principle is referred to as satisficing, and several studies report that satisficing-based goal-oriented exploration is effective in tasks with discrete action spaces. In this study, we extend satisficing-based goal-oriented exploration to continuous control tasks and evaluate it through simulation experiments in the MuJoCo Ant-v5 environment, focusing on exploration efficiency and learning dynamics. The results demonstrate that aspiration-level-based exploration principles also function effectively in continuous control tasks.
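The abstract does not specify the exact action-selection rule, but the core idea of aspiration-level-based (satisficing) exploration can be illustrated with a minimal sketch: draw candidate actions from a stochastic policy and accept the first one whose estimated value meets the aspiration level, falling back to the best candidate when none qualifies. The function and parameter names below (`satisficing_action`, `aspiration`, `n_candidates`) are hypothetical, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def satisficing_action(q_estimate, sample_action, aspiration, n_candidates=16):
    """Illustrative satisficing selection in a continuous action space.

    Samples candidate actions from a stochastic policy and returns the
    first one whose estimated value reaches the aspiration level; if no
    candidate satisfices, returns the best candidate seen (greedy fallback).
    """
    best_a, best_q = None, -np.inf
    for _ in range(n_candidates):
        a = sample_action()
        q = q_estimate(a)
        if q >= aspiration:
            return a  # satisficing: accept the first "good enough" action
        if q > best_q:
            best_a, best_q = a, q
    return best_a  # no candidate met the aspiration; act greedily

# Toy example (hypothetical): 1-D action, value peaked at a = 0.5.
q = lambda a: 1.0 - (a - 0.5) ** 2
policy = lambda: rng.uniform(-1.0, 1.0)
action = satisficing_action(q, policy, aspiration=0.9)
```

With a high aspiration, selection keeps sampling (exploring) until a sufficiently good action appears; lowering the aspiration makes acceptance easier and behavior more exploitative, which is the trade-off the study examines in continuous control.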
