Presentation Information

10:30 AM - 10:45 AM JST (1:30 AM - 1:45 AM UTC) Highlighted Presentation

[22a-52A-5] Benchmark for LLM in Materials Science and the evaluation of ChatGPT and Bard

〇Michiko Yoshitake¹, Yuta Suzuki², Ryo Igarashi¹, Yoshitaka Ushiku¹, Keisuke Nagato³ (1.OSX, 2.Osaka Univ., 3.Univ. Tokyo)


Keywords:

natural language model, materials science, model evaluation

We produced a benchmark data set in materials science for large language models (LLMs). The benchmark data set consists of question-answer problems based on university-level materials science textbooks. We will present the results of evaluating three LLMs, ChatGPT3.5, ChatGPT4, and Bard, on this benchmark data set.
