Presentation Information

[22a-52A-8] Development of semi-automatic AI system for large-scale data curation in Starrydata

〇Tomoya Mato1, Masaya Kumagai2,3, Yu Takada1, Yukari Katsura1,2,4 (1. NIMS, 2. RIKEN, 3. SAKURA Internet Inc., 4. University of Tsukuba)

Keywords: Materials Informatics, Generative AI, GPT

Starrydata collects data from plot images in academic papers and has amassed data on 74,000 samples and 180,000 curves from 12,400 papers to date. To make automatic data extraction more efficient, we added spline interpolation to StarryDigitizer and integrated ChatGPT into the Starrydata2 web system. We have also developed Starrydata Visualizer and a data analysis platform to make data cleansing more effective.