The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

[1Yin-A-09] Hallucination Detection and Editing in MLLM

〇Yuiga Wada^1,2,3, Kazuki Matsuda^1, Graham Neubig^3, Komei Sugiura^1,2 (1. Keio University, 2. Keio AI Research Center, 3. Carnegie Mellon University)

Keywords:

Hallucination, Multimodal Large Language Model, Image Captioning

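Multimodal large language models (MLLMs) often generate text that contains hallucinations. Because hallucinations undermine a model's reliability in real-world applications, evaluating and analyzing them is essential to MLLM development. In this work, we propose a new task, "multimodal fine-grained hallucination detection and editing," which aims to detect and edit hallucinations in MLLM outputs. We also propose ZINA, a model that identifies hallucinated spans according to six error types and outputs appropriate corrected text. In addition, we constructed a new dataset, VisionHall, for training and evaluating the model. In experiments on VisionHall, the proposed method substantially outperformed baseline methods, including GPT-4o and Llama-3.2.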
