The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

[2Yin-A-50] FiT-QA: A VQA Benchmark for Crop Calendars - Dataset Construction and the Limitations of VLMs -

〇Kosuke Takahashi¹, Hayato Aida¹, Kazuki Miyawaki², Sumire Nakagawa², Yasutomo Kimura², Kazuma Kadowaki³, Akio Kobayashi⁴, Masahiro Otomo⁴, Junichi Ishihara⁴, Baba Kenta⁴ (1. Stockmak Inc., 2. Otaru University of Commerce, 3. The Japan Research Institute, 4. National Agriculture and Food Research Organization)

Keywords: Visual Question Answering, Agriculture, LLM

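In agriculture, the use of AI for question answering over practical working documents has not advanced sufficiently. Crop calendars, a core reference material, densely combine tables, figures, photographs, annotations, and time-ordered work schedules on a single sheet, which makes general-purpose methods difficult to apply. This study proposes FiT-QA (Figures and Tables Question Answering), a VQA benchmark targeting crop calendar images. FiT-QA consists of easy-QA, which was generated automatically and then manually edited and verified, and difficult-QA, which was written by hand from scratch so as to require integrated reasoning across multiple regions of the image; in total it contains 347 images and 1,152 QA pairs. In an evaluation with high-performing general-purpose VLMs, incorrect answers remained even on easy-QA, and correct answers on difficult-QA were limited. These results indicate the limits of directly applying existing techniques, and we release FiT-QA as a practical benchmark for future development and training.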
