Presentation Information
[2Yin-B-03] Structural Analysis of Emotional Drift in LLMs Caused by Japanese Polysemy: The "Arigato Hypothesis" and Vector Space Distortion ~ Voi che sapete: Quantitative Verification of the "Arigato Hypothesis" and Vector Space Distortion ~
〇Shiori Tatei
Keywords:
LLM, Alignment, HCI, Cultural Difference, Relation Drift
Recently, "Relationship Drift," a phenomenon in which users inadvertently drift into intimate relationships with LLMs, has emerged as an important issue. This study proposes the "Arigato Hypothesis," which attributes this drift to the structure of the Japanese language. Unlike English, where "Thank you" and "Love" are semantically separated, the Japanese "Arigato" is a high-entropy word that spans many concepts, from courtesy to deep affection, causing a "Semantic Leap" in English-centric embedding spaces. Verification using Sentence-BERT showed that the cosine similarity for the English pair (Thank you – Love) was 0.47, while that for the Japanese pair (Arigato – Daisuki) was 0.75. This suggests that ceremonial gratitude is structurally confused with love in LLMs. Such semantic overlap tends to shift the model's output, triggering unintentionally intimate responses to Japanese expressions of gratitude. We analyze this mechanism and propose an alignment method using context-aware vector steering. This approach provides design insight for developing more contextually appropriate conversational AI systems.
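
The similarity scores quoted above are cosine similarities between sentence embeddings. As a minimal sketch of that metric only, the following computes cosine similarity for two vectors in plain Python. The function name `cosine_similarity` and the toy vectors are assumptions made for illustration; they are not the study's embeddings and do not reproduce its 0.47 or 0.75 figures.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, NOT real Sentence-BERT embeddings.
parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])          # unrelated directions
print(parallel, orthogonal)
```

In practice the input vectors would come from a sentence encoder, for example the `encode` method of a `SentenceTransformer` model in the sentence-transformers library. The abstract does not state which model or exact input strings were used, so those details are left open here.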
