Presentation Information
[5Yin-A-49]Dataset Construction for Verifying Domain Separability in Internal Representations of Large Language Models
〇Daichi Yano1, Hirotoshi Taira1 (1. Osaka Institute of Technology)
Keywords:
Large Language Models (LLMs), Internal Representations, Inter-Domain Separability, Dataset Construction
Recently, Large Language Models (LLMs) have demonstrated exceptional performance not only in general language processing tasks but also in specialized domains such as finance, law, and medicine. However, the mechanisms by which their internal representations retain and process information remain largely unexplored. In particular, it is not yet clear whether LLMs distinguish or separate domains at the level of internal representations depending on the input prompt. This study investigates the domain separability of internal representations in LLMs by constructing a dataset covering four fields: finance, law, medicine, and computer science. Specifically, we develop a dataset that enables the training of classifiers for domain identification using hidden states obtained from domain-specific prompts. This work establishes a foundation for quantitatively evaluating whether specialized structures corresponding to different domains exist within LLM internal representations.
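The evaluation step described above, training a classifier on hidden states to predict the source domain, can be sketched as a linear probe. The snippet below is a minimal illustration, not the authors' actual pipeline: it uses synthetic 64-dimensional vectors as hypothetical stand-ins for LLM hidden states (one random centroid per domain), since real hidden states would be extracted from domain-specific prompts with a model such as one loaded via Hugging Face Transformers. The probe itself is multinomial logistic regression trained by gradient descent in NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for LLM hidden states: one random centroid per
# domain, with Gaussian noise around it. In the actual study these vectors
# would be the hidden states obtained from domain-specific prompts.
DOMAINS = ["finance", "law", "medicine", "computer science"]
DIM, N_PER_DOMAIN = 64, 50
centroids = rng.normal(size=(len(DOMAINS), DIM))
X = np.vstack([c + 0.3 * rng.normal(size=(N_PER_DOMAIN, DIM)) for c in centroids])
y = np.repeat(np.arange(len(DOMAINS)), N_PER_DOMAIN)

# Linear probe: multinomial logistic regression trained with full-batch
# gradient descent on the cross-entropy loss.
W = np.zeros((DIM, len(DOMAINS)))
b = np.zeros(len(DOMAINS))
for _ in range(300):
    logits = X @ W + b
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(len(y)), y] -= 1.0                # softmax cross-entropy gradient
    W -= 0.1 * (X.T @ p) / len(y)
    b -= 0.1 * p.mean(axis=0)

# If domains are linearly separable in the representation space,
# the probe's training accuracy should be high.
acc = (np.argmax(X @ W + b, axis=1) == y).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy on real hidden states would indicate that the model keeps the four domains linearly separable at that layer; chance-level accuracy (0.25 for four domains) would suggest no such separation.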
