Presentation Information
[5L3-OS-6b-05]Adaptive Feature Generation with Multimodal Large Language Models for Predictive Modeling
〇Kosuke Yoshimura1, Hisashi Kashima1 (1. Kyoto University)
Keywords:
Feature Generation, Multimodal Large Language Models
In predictive modeling with limited training data and computational resources, generating features that are both accurate and interpretable remains a critical challenge. In applications requiring high reliability in particular, explainable features are indispensable. Traditionally, human-in-the-loop feature engineering has been employed; however, its low throughput due to manual labor often becomes a bottleneck in practical operations.
In this study, we propose a method that adaptively performs feature definition and labeling using Multimodal Large Language Models (MLLMs). By replacing the human roles in the existing AdaFlock framework with MLLMs, our approach achieves significantly faster feature generation compared to manual processes. Specifically, the method dynamically generates a set of interpretable features through MLLM prompting and constructs an ensemble classifier based on these features.
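The adaptive loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the functions `mllm_propose_feature` and `mllm_label` are hypothetical placeholders standing in for real MLLM prompting calls, and the ensemble is reduced to a simple majority vote over MLLM-assigned binary features.

```python
# Hypothetical stand-ins for MLLM calls; real versions would prompt a
# multimodal LLM with the data and task description (names are
# illustrative, not from the paper).
def mllm_propose_feature(round_idx):
    """Ask the MLLM to define a new interpretable feature as a prompt."""
    return f"feature_{round_idx}: does the input exhibit property {round_idx}?"

def mllm_label(feature_def, example):
    """Ask the MLLM to answer the feature question for one example (0/1)."""
    # Placeholder: deterministic pseudo-label instead of a real model call.
    return hash((feature_def, example)) % 2

def adaptive_feature_generation(examples, n_rounds=5):
    """Adaptively build a binary feature matrix and a majority-vote ensemble."""
    feature_defs, columns = [], []
    for t in range(n_rounds):
        fdef = mllm_propose_feature(t)          # feature definition step
        col = [mllm_label(fdef, x) for x in examples]  # labeling step
        feature_defs.append(fdef)
        columns.append(col)

    # Each feature column acts as a weak classifier; combine by majority vote.
    preds = [
        int(sum(col[i] for col in columns) > len(columns) / 2)
        for i in range(len(examples))
    ]
    return feature_defs, preds

examples = ["img_a", "img_b", "img_c"]
feature_defs, preds = adaptive_feature_generation(examples)
```

In the actual method, a trained ensemble classifier over the generated features replaces the majority vote shown here, and the adaptivity comes from conditioning each new feature proposal on the ensemble's current performance.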
Experimental results across three different modalities demonstrate that our proposed method consistently outperforms direct inference by MLLMs, particularly when using Qwen3. Furthermore, training completes in at most approximately 37 minutes. This confirms that our method is highly practical for real-world deployment compared to conventional approaches reliant on human resources such as crowdsourcing.
