The 39th Annual Conference of the Japanese Society for Artificial Intelligence, 2025

May 27-30, 2025, Osaka International Convention Center + Online
The Japanese Society for Artificial Intelligence

[3K4-IS-2a-02] Cross-Lingual Finetuning in Large Language Models

〇Jude McCutcheon¹ (1. Apprhythm Co., Ltd.)
Large Language Models (LLMs) have set new benchmarks in various fields, achieving higher task performance with smaller training datasets. This success is largely attributed to the pretrain-finetune paradigm: models are first pretrained on extensive unlabeled corpora to develop general language understanding and then fine-tuned on smaller labeled datasets for specific tasks. While effective for many languages and tasks, this approach remains challenging for lower-resource languages, where labeled task data is scarce. Even Japanese, a higher-resource language, is held back by the relative scarcity of task-specific datasets. Leveraging the wealth of English-language resources through cross-lingual training offers a promising solution. This study investigates the cross-lingual generalization capabilities of LLMs by fine-tuning a monolingual English model and its continually pretrained Japanese counterpart on English task datasets and evaluating both on comparable Japanese tasks. Our findings reveal that much of the task-specific knowledge imparted during fine-tuning transcends language boundaries, positioning cross-lingual fine-tuning as a powerful strategy for enhancing LLM performance in lower-resource languages.
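
The evaluation design described above can be read as a small grid: two models (the English model and its continually pretrained Japanese counterpart), each scored on a Japanese task before and after fine-tuning on English task data. The sketch below is only an illustration of how such a comparison might be tabulated. The abstract does not state the metric, the model names, or any scores, so the `Run` record, the `transfer_gain` helper, the model labels, and the use of accuracy are all assumptions, and the numbers are placeholders rather than results from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    model: str          # hypothetical label, e.g. "english-base"
    finetuned: bool     # True if fine-tuned on the English task dataset
    ja_accuracy: float  # assumed metric: accuracy on the Japanese task, in [0, 1]


def transfer_gain(runs, model):
    """Japanese-task accuracy after English fine-tuning minus accuracy before it."""
    before = next(r for r in runs if r.model == model and not r.finetuned)
    after = next(r for r in runs if r.model == model and r.finetuned)
    return after.ja_accuracy - before.ja_accuracy


# Placeholder numbers for illustration only; these are NOT results from the study.
runs = [
    Run("english-base", False, 0.40),
    Run("english-base", True, 0.55),
    Run("japanese-continued", False, 0.50),
    Run("japanese-continued", True, 0.70),
]

# One gain per model: how much English-only fine-tuning moved the Japanese score.
gains = {m: transfer_gain(runs, m) for m in ("english-base", "japanese-continued")}
```

Under this framing, a positive gain for a model would indicate that task knowledge learned from English data carried over to the Japanese task, and comparing the two gains would show how much continued Japanese pretraining affects that transfer.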