Presentation Information

[1Yin-B-35] Geometric Analysis of Japanese Embedding Representations in Large Language Models

〇Gento Torimaru1 (1. Waseda University)

Keywords:

LLMs, Contextualized Representations, Anisotropy, Layer-wise Geometric Metrics, Baseline Correction

This study analyzed layer-wise transitions of context specificity and anisotropy in the internal representations of a Japanese large language model (rinna/japanese-gpt-neox-3.6B) using geometric metrics. Using 24,554 sentences constructed from JGLUE JSTS, we measured Self-Similarity, Intra-Sentence Similarity, and MEV (maximum explainable variance), and corrected each metric against an intra-layer random baseline. The results show that corrected Self-Similarity decreases monotonically with depth, indicating that context specificity increases in deeper layers. In contrast, Intra-Sentence Similarity and the baseline metrics remain high across all layers, demonstrating that the representation space as a whole exhibits strong anisotropy (dominated by the first principal component). These findings reveal that, in Japanese models as well, contextual differences are formed on top of a strong common component, and they suggest the practical importance of layer selection and of anisotropy-mitigation techniques such as centering.
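
The following is a minimal Python sketch (not the authors' implementation) of the baseline-correction idea described above: compute a raw similarity metric per layer, estimate that layer's anisotropy from randomly sampled token representations, and subtract the two. Loading the model through Hugging Face transformers, the helper names, and the use_fast=False setting are illustrative assumptions.

    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_NAME = "rinna/japanese-gpt-neox-3.6b"  # checkpoint name as published on Hugging Face
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
    model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
    model.eval()

    def hidden_states(sentence: str) -> torch.Tensor:
        """All layer outputs for one sentence: (num_layers + 1, seq_len, hidden_dim)."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        return torch.stack(outputs.hidden_states).squeeze(1)

    def mean_pairwise_cosine(reprs: torch.Tensor) -> float:
        """Mean cosine similarity over all distinct pairs of rows in an (n, hidden_dim) matrix."""
        normed = torch.nn.functional.normalize(reprs, dim=-1)
        sims = normed @ normed.T
        n = sims.shape[0]
        return ((sims.sum() - sims.diagonal().sum()) / (n * (n - 1))).item()

    # Raw Self-Similarity: representations of the same word taken from different contexts (one layer).
    # Anisotropy baseline: randomly sampled token representations from the same layer.
    # Corrected Self-Similarity = raw value minus the baseline, computed layer by layer.
    def corrected_self_similarity(word_reprs: torch.Tensor, random_reprs: torch.Tensor) -> float:
        return mean_pairwise_cosine(word_reprs) - mean_pairwise_cosine(random_reprs)

Under the same assumptions, Intra-Sentence Similarity would be obtained analogously by applying mean_pairwise_cosine to the token representations of a single sentence, and centering (subtracting the per-layer mean vector before computing cosine similarities) is one way to mitigate the anisotropy that the baseline reveals.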