Session Details
[5M1-GS-2b]Machine learning
Fri. Jun 12, 2026 9:00 AM - 10:30 AM JST
Fri. Jun 12, 2026 12:00 AM - 1:30 AM UTC
Room M (Middle room 302A)
[5M1-GS-2b-01]AlphaZeRS: Efficient Decision-Making with Limited Computational Resources
Takumi Watanabe1, 〇Suguru Takauchi1, Yu Kamata2, Ryoji Sakuraoka2, Yu Kohno1, Tatsuji Takahashi1 (1. School of Science and Engineering, Tokyo Denki University, 2. Graduate School of Tokyo Denki University)
[5M1-GS-2b-02]Sample-Efficient Reinforcement Learning through Cross Bisimulation-Based Implicit Imitation Learning
〇Takahisa Imagawa1, Shuichi Enokida1 (1. Kyushu Institute of Technology)
[5M1-GS-2b-03]Reinforcement Learning Model for Cash-in-Transit Rebalancing Problem Considering Reuse of Collected Cash
〇Ryoga Miyajima1, Ai Kondoh1, Hideaki Tamai1 (1. Oki Electric Industry Co., Ltd.)
[5M1-GS-2b-04]Model-Based Reinforcement Learning for HVAC Control via a GNN-Based Latent Space
〇Teruaki Hasegawa1, Ziwei XU1, Ryutaro Ichise1 (1. Institute of Science Tokyo)
[5M1-GS-2b-05]Exhaustive Enumeration of Policies in Markov Decision Processes Using SeqBDDs
〇Kotaro Ishihara1, Kazuma Fuchimoto2, Maomi Ueno1 (1. The University of Electro-Communications, 2. The National Center for University Entrance Examinations)
[5M1-GS-2b-06]Evaluating SFT-free RL Post-training with GRPO for Japanese Large Language Models (Japanese title: Multi-Objective Evaluation of R1-Zero-like Post-training Methods for Japanese LLMs)
〇Naoya Tsuji1 (1. KADOKAWA DWANGO Educational Institute S High school)
