Presentation Information

[1Yin-B-35] Geometric Analysis of Japanese Embedding Representations in Large Language Models

〇Gento Torimaru1 (1. Waseda University)

Keywords:

LLMs, Contextualized Representations, Anisotropy, Layer-wise Geometric Metrics, Baseline Correction

This study analyzed layer-wise transitions of context specificity and anisotropy in the internal representations of a Japanese large language model (rinna/japanese-gpt-neox-3.6B) using geometric metrics. Using 24,554 sentences constructed from JGLUE JSTS, we measured Self-Similarity, Intra-Sentence Similarity, and MEV (maximum explainable variance), and corrected each metric against an intra-layer random baseline. The results show that corrected Self-Similarity decreases monotonically with depth, indicating that context specificity increases in deeper layers. In contrast, Intra-Sentence Similarity and the baseline metrics remain high across all layers, demonstrating that the representation space as a whole exhibits strong anisotropy (dominated by the first principal component). These findings reveal that, in Japanese models as well, contextual differences are formed on top of a strong common component, and they suggest the practical importance of layer selection and of anisotropy-mitigation techniques such as centering.
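
The following is a minimal Python sketch (not the authors' implementation) of the baseline-correction idea described above: compute a raw similarity metric per layer, estimate that layer's anisotropy from randomly sampled token representations, and subtract the two. Loading the model through Hugging Face transformers, the helper names, and the use_fast=False setting are illustrative assumptions.

    import torch
    from transformers import AutoModel, AutoTokenizer

    MODEL_NAME = "rinna/japanese-gpt-neox-3.6b"  # checkpoint name as published on Hugging Face
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
    model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
    model.eval()

    def hidden_states(sentence: str) -> torch.Tensor:
        """All layer outputs for one sentence: (num_layers + 1, seq_len, hidden_dim)."""
        inputs = tokenizer(sentence, return_tensors="pt")
        with torch.no_grad():
            outputs = model(**inputs)
        return torch.stack(outputs.hidden_states).squeeze(1)

    def mean_pairwise_cosine(reprs: torch.Tensor) -> float:
        """Mean cosine similarity over all distinct pairs of rows in an (n, hidden_dim) matrix."""
        normed = torch.nn.functional.normalize(reprs, dim=-1)
        sims = normed @ normed.T
        n = sims.shape[0]
        return ((sims.sum() - sims.diagonal().sum()) / (n * (n - 1))).item()

    # Raw Self-Similarity: representations of the same word taken from different contexts (one layer).
    # Anisotropy baseline: randomly sampled token representations from the same layer.
    # Corrected Self-Similarity = raw value minus the baseline, computed layer by layer.
    def corrected_self_similarity(word_reprs: torch.Tensor, random_reprs: torch.Tensor) -> float:
        return mean_pairwise_cosine(word_reprs) - mean_pairwise_cosine(random_reprs)

Under the same assumptions, Intra-Sentence Similarity would be obtained analogously by applying mean_pairwise_cosine to the token representations of a single sentence, and centering (subtracting the per-layer mean vector before computing cosine similarities) is one way to mitigate the anisotropy that the baseline reveals.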