Presentation Information

[D-12A-09] Evaluation of image caption generation for human behavior using multimodal models with LLM

○Issei Fukunaga (1), Muhammad Farhan Maulana (1), Kohichi Ogata (2) (1. Graduate School of Science and Technology, Kumamoto University, 2. Faculty of Advanced Science and Technology, Kumamoto University)

Keywords:

Image Captioning, Large Language Models, Multimodal AI