Presentation Information
[4Yin-B-61] Improving Output Diversity in Large Language Models with Model Merging
〇Yasuaki Sumita¹, Tomoharu Iwata², Toshiyuki Tanaka¹ (1. Kyoto University, 2. NTT, Inc.)
Keywords: Large Language Model, Model Merging, Diversity
While large language models demonstrate strong performance across various tasks, they struggle to generate text with sufficient diversity. Existing research has evaluated text diversity mainly from two perspectives: form diversity and semantic diversity. Some studies show that models aligned with human preferences exhibit higher form diversity but lower semantic diversity than pretrained models without finetuning. In this study, we propose a method for constructing a model that simultaneously enhances both the form and the semantic diversity of its outputs. Our approach merges the lower layers of an RLHF-trained model with the higher layers of its pre-RLHF counterpart to create a new model. The method is based on the hypothesis that the lower layers of a language model control form diversity while the higher layers govern semantic diversity. Experimental results demonstrate that models obtained by our method improve both form and semantic diversity without significantly compromising output quality. Furthermore, we compare our method with temperature sampling, a widely used baseline, and show that our method compares favorably in both output quality and diversity.
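To make the layer-level merging concrete, below is a minimal sketch assuming a Llama-style decoder-only checkpoint loaded with Hugging Face Transformers. The model identifiers and the split index `SPLIT_LAYER` are placeholders, and keeping the embeddings, final norm, and output head from the RLHF model is an assumption on our part; the abstract does not specify these details or the authors' exact procedure.

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder checkpoint ids (assumptions): an RLHF-tuned model and its
# pre-RLHF counterpart sharing the same architecture and tokenizer.
RLHF_MODEL = "org/model-rlhf"
BASE_MODEL = "org/model-pre-rlhf"
SPLIT_LAYER = 16  # assumed boundary between "lower" and "higher" layers

rlhf = AutoModelForCausalLM.from_pretrained(RLHF_MODEL, torch_dtype=torch.float16)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

merged_state = rlhf.state_dict()

for name, tensor in base.state_dict().items():
    # Llama-style checkpoints name transformer blocks "model.layers.<i>....";
    # adjust the pattern for other architectures.
    parts = name.split(".")
    if "layers" in parts:
        layer_idx = int(parts[parts.index("layers") + 1])
        if layer_idx >= SPLIT_LAYER:
            # Higher layers come from the pre-RLHF model (semantic diversity);
            # lower layers stay from the RLHF model (form diversity).
            merged_state[name] = tensor

rlhf.load_state_dict(merged_state)
rlhf.save_pretrained("merged-model")
```

In practice, the split index would presumably be chosen by sweeping candidate boundaries and measuring form diversity, semantic diversity, and output quality on held-out prompts.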
