Session Details

[5M1-GS-2b]Machine learning

Fri. Jun 12, 2026 9:00 AM - 10:30 AM JST
Fri. Jun 12, 2026 12:00 AM - 1:30 AM UTC
Room M (Middle room 302A)

[5M1-GS-2b-01]AlphaZeRS: Efficient Decision-Making with Limited Computational Resources

Takumi Watanabe1, 〇Suguru Takauchi1, Yu Kamata2, Ryoji Sakuraoka2, Yu Kohno1, Tatsuji Takahashi1 (1. School of Science and Engineering, Tokyo Denki University, 2. Graduate School of Tokyo Denki University)

[5M1-GS-2b-02]Sample-Efficient Reinforcement Learning through Cross Bisimulation-Based Implicit Imitation Learning

〇Takahisa Imagawa1, Shuichi Enokida1 (1. Kyushu Institute of Technology)

[5M1-GS-2b-03]Reinforcement Learning Model for Cash-in-Transit Rebalancing Problem Considering Reuse of Collected Cash

〇Ryoga Miyajima1, Ai Kondoh1, Hideaki Tamai1 (1. Oki Electric Industry Co., Ltd.)

[5M1-GS-2b-04]Model-Based Reinforcement Learning for HVAC Control via a GNN-Based Latent Space

〇Teruaki Hasegawa1, Ziwei XU1, Ryutaro Ichise1 (1. Institute of Science Tokyo)

[5M1-GS-2b-05]Exhaustive Enumeration of Policies in Markov Decision Processes Using SeqBDDs

〇Kotaro Ishihara1, Kazuma Fuchimoto2, Maomi Ueno1 (1. The University of Electro-Communications, 2. The National Center for University Entrance Examinations)

[5M1-GS-2b-06]Evaluating SFT-free RL Post-training with GRPO for Japanese Large Language Models (Japanese title: Multi-Objective Evaluation of R1-Zero-like Post-Training Methods for Japanese LLMs)

〇Naoya Tsuji1 (1. KADOKAWA DWANGO Educational Institute S High school)