Presentation Information
[5Yin-A-44]Goal-oriented Exploration in Continuous Control Tasks
Shoma Ogawa1, 〇Takuma Kuroda2, Takumi Kamiya2, Tatsuji Takahasi2, Yu Kohno2 (1. Graduate School of Tokyo Denki University, 2. Tokyo Denki University)
Keywords:
Reinforcement Learning, Machine Learning, Cognitive Science, Continuous Control
In recent years, deep reinforcement learning has achieved strong performance in continuous control tasks. In particular, stochastic policy optimization methods such as Soft Actor-Critic have been widely adopted as a fundamental approach in many continuous control benchmarks due to their learning stability and broad applicability. However, when available trials or computational resources are limited, balancing exploration efficiency and learning stability remains a challenge. Under such constraints, exploration strategies that select actions satisfying predefined criteria may offer practical advantages. In decision-making theory under bounded rationality, this principle is referred to as satisficing, and several studies report that satisficing-based goal-oriented exploration is effective in tasks with discrete action spaces. In this study, we extend satisficing-based goal-oriented exploration to continuous control tasks and evaluate it through simulation experiments in the MuJoCo Ant-v5 environment, focusing on exploration efficiency and learning dynamics. The results demonstrate that aspiration-level-based exploration principles also function effectively in continuous control tasks.
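The abstract does not specify the exact action-selection rule, but the core idea of aspiration-level-based (satisficing) exploration can be illustrated with a minimal sketch: draw candidate actions from a stochastic policy and accept the first one whose estimated value meets the aspiration level, falling back to the best candidate when none qualifies. The function and parameter names below (`satisficing_action`, `aspiration`, `n_candidates`) are hypothetical, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def satisficing_action(q_estimate, sample_action, aspiration, n_candidates=16):
    """Illustrative satisficing selection in a continuous action space.

    Samples candidate actions from a stochastic policy and returns the
    first one whose estimated value reaches the aspiration level; if no
    candidate satisfices, returns the best candidate seen (greedy fallback).
    """
    best_a, best_q = None, -np.inf
    for _ in range(n_candidates):
        a = sample_action()
        q = q_estimate(a)
        if q >= aspiration:
            return a  # satisficing: accept the first "good enough" action
        if q > best_q:
            best_a, best_q = a, q
    return best_a  # no candidate met the aspiration; act greedily

# Toy example (hypothetical): 1-D action, value peaked at a = 0.5.
q = lambda a: 1.0 - (a - 0.5) ** 2
policy = lambda: rng.uniform(-1.0, 1.0)
action = satisficing_action(q, policy, aspiration=0.9)
```

With a high aspiration, selection keeps sampling (exploring) until a sufficiently good action appears; lowering the aspiration makes acceptance easier and behavior more exploitative, which is the trade-off the study examines in continuous control.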
