Presentation Information

[3H1-OS-9a-04] Learning Interpretable Koopman Representations from Video

〇Henrik Krauss¹, Naoya Takeishi¹, Takehisa Yairi¹ (1. The University of Tokyo)

Keywords:

Koopman Operator Learning, Representation Learning, Dynamical Systems, Interpretable Machine Learning

Learning dynamical models from video (i.e., image sequences) is challenging due to high-dimensional visual inputs and the limited interpretability of learned representations. Koopman-based methods embed nonlinear dynamics into a lifted latent space with approximately linear evolution, but the semantic meaning of the latent variables is often unclear. In this work, we investigate interpretable Koopman dynamics learning from video using a custom decoder that links latent variables to pixel-accurate attention maps on the decoded image. Through analyses on synthetic video datasets with independently moving objects, we evaluate our model's ability to learn object-aligned latent representations in comparison to plain Koopman models. We further analyze the structure of the learned Koopman dynamics induced by the attention-based decoder. Additionally, the attention maps enable intuitive visualization of each latent variable's contribution to image reconstruction.
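To illustrate the core idea behind Koopman-based methods mentioned above, the following is a minimal, self-contained sketch (not the authors' model): once states are lifted into a latent space where evolution is approximately linear, z_{t+1} ≈ K z_t, the Koopman matrix K can be estimated from latent trajectories by least squares, as in dynamic mode decomposition. All variable names and the toy dynamics here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth linear latent dynamics (a slightly contracting rotation),
# used only to simulate a toy latent trajectory; in the paper's setting
# the latents would come from an encoder applied to video frames.
theta = 0.1
K_true = 0.99 * np.array([[np.cos(theta), -np.sin(theta)],
                          [np.sin(theta),  np.cos(theta)]])

# Simulate a latent trajectory z_0, ..., z_T.
T = 200
Z = np.empty((T + 1, 2))
Z[0] = rng.normal(size=2)
for t in range(T):
    Z[t + 1] = K_true @ Z[t]

# Least-squares (DMD-style) estimate of K: solve Z[:-1] @ K^T ≈ Z[1:].
K_est_T, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)
K_est = K_est_T.T

print(np.allclose(K_est, K_true, atol=1e-6))  # → True (noise-free data)
```

In a full video model, the encoder, the linear matrix K, and the decoder are trained jointly; the abstract's contribution lies in an attention-based decoder that makes each latent dimension's effect on the reconstructed image visible.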