Presentation Information

[1Yin-A-57]Analyzing Numerical Sequence Understanding in Large Language Models: A Hidden Vector Analysis Focusing on Maximum Values

〇Mizuki Arai1,2, Tatsuya Ishigaki2, Yusuke Miyao3,2, Hiroya Takamura2, Ichiro Kobayashi1,2 (1. Ochanomizu Univ., 2. Artificial Intelligence Research Center, 3. Univ. of Tokyo)

Keywords:

Interpretability, Large Language Models, Numerical Sequence

Large language models (LLMs) have demonstrated strong performance on a wide range of tasks involving numerical information; however, how they internally understand and process numerical sequences remains largely unexplored.
This study aims to elucidate how LLMs understand numerical sequences by focusing on maximum values, a fundamental element of numerical reasoning, and analyzing how such information is represented in the models' hidden representations.
Specifically, we input integer sequences into an LLM and examine token-wise changes in the key, query, and value vectors within the attention mechanism.
Our analysis reveals that these vector representations exhibit marked changes at positions where the running maximum of the sequence is updated. Notably, this behavior occurs even without any explicit instruction to identify the maximum value, and the changes become more pronounced when such an instruction is given.
These results suggest that LLMs may internally maintain and update information about maximum values in numerical sequences, even in the absence of explicit task instructions.
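The core of the analysis described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it computes (a) the positions where the running maximum of an integer sequence is updated, and (b) the token-wise change between consecutive per-token vectors. In a real experiment, the vectors would be the key, query, or value rows extracted from an LLM's attention layers (e.g., via forward hooks); here a random array stands in for them.

```python
import numpy as np

def running_max_update_positions(seq):
    """Indices where the running maximum of the sequence is updated."""
    positions, cur = [], float("-inf")
    for i, x in enumerate(seq):
        if x > cur:
            positions.append(i)
            cur = x
    return positions

def token_wise_change(vectors):
    """L2 distance between consecutive per-token vectors.

    `vectors` has shape (num_tokens, dim); in the actual analysis these
    would be key/query/value vectors taken from the attention mechanism.
    """
    vectors = np.asarray(vectors, dtype=float)
    return np.linalg.norm(vectors[1:] - vectors[:-1], axis=-1)

# Example: the maximum is updated at tokens 0, 1, 3, and 5.
seq = [3, 7, 2, 9, 9, 12]
print(running_max_update_positions(seq))  # -> [0, 1, 3, 5]

# Stand-in for per-token hidden vectors (num_tokens x dim).
rng = np.random.default_rng(0)
fake_vectors = rng.standard_normal((len(seq), 8))
print(token_wise_change(fake_vectors).shape)  # -> (5,)
```

The reported finding corresponds to the distances returned by `token_wise_change` being noticeably larger at the indices returned by `running_max_update_positions`.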