Presentation Information

[2Yin-B-13] Statistical-Physics Characterization of Long-Context Processing in Language Models

〇Kai Nakaishi1, Yui Oka2, Yuji Yamamoto1, Kyosuke Nishida2, Sho Yokoi1 (1. NINJAL, 2. NTT)

Keywords:

Large Language Models, Long-Context Processing, Length Extrapolation, Statistical Physics

We provide a statistical-physics characterization of the long-context processing abilities of large language models, including the ability to handle texts longer than the maximum sequence length seen during training. Specifically, we propose that the ability to appropriately reference arbitrarily distant preceding tokens, even in very long contexts, can be characterized by a power-law decay of the two-point correlation with distance in the generated text. We also argue that the ability to suppress repetition of identical expressions can be characterized by the absence of a dominant period when the generated text is decomposed into waves of different periods. We then examine these claims experimentally through statistical analyses of generated texts.
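The two diagnostics described above can be illustrated with a minimal numerical sketch. This is not the authors' actual pipeline; it assumes one first maps generated tokens to a real-valued sequence (e.g. an indicator for a target token), then estimates the two-point correlation as a function of lag and the power spectrum, where a sharp spectral peak would signal a dominant repetition period:

```python
import numpy as np

def two_point_correlation(x, max_lag):
    """Estimate C(r) = <x_t x_{t+r}> - <x_t><x_{t+r}> for lags r = 1..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    corrs = []
    for r in range(1, max_lag + 1):
        a, b = x[:n - r], x[r:]
        corrs.append(np.mean(a * b) - np.mean(a) * np.mean(b))
    return np.array(corrs)

def power_spectrum(x):
    """Power spectrum of the mean-centered sequence.

    A sharp peak at frequency f indicates a dominant period 1/f,
    i.e. repetitive generation; a flat or broadly decaying spectrum does not.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    return freqs, spec

# Toy demo (stand-ins for tokenized text): a period-8 signal vs. iid noise.
rng = np.random.default_rng(0)
periodic = np.sin(2 * np.pi * np.arange(1024) / 8)
noisy = rng.random(1024)

f_p, s_p = power_spectrum(periodic)
print(f_p[np.argmax(s_p)])   # dominant frequency of the periodic signal: 1/8
print(two_point_correlation(noisy, 10))  # near zero: noise has no long-range structure
```

In this toy setting, a power-law decay of `two_point_correlation` with lag (rather than an exponential cutoff) would be the signature of long-range structure, and the absence of a single dominant peak in `power_spectrum` would indicate suppressed repetition.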