Presentation Information
[1Yin-A-14]Multimodal Fashion Retrieval Based on Palette Queries
〇Kanon Amemiya1, Daichi Yashima1, Kei Katsumata1, Komei Sugiura1 (1. Keio University)
Keywords:
Multimodal Retrieval, Multimodal Foundation Model, Nail Design
We focus on the task of retrieving fashion images based on a natural-language description and a palette query.
A palette query is an input in which a user specifies an arbitrary number of colors using a color picker, enabling continuous-valued expression of subtle color nuances.
Although such nuances are crucial in the fashion domain, incorporating continuous color inputs into retrieval has not been sufficiently explored.
To address this gap, we propose a multimodal retrieval method that retrieves fashion images consistent with both the description and the palette query.
Our method explicitly models the relationship between the user intent expressed in the description and the color preferences conveyed by the palette query.
Experiments show that our approach consistently improves standard information-retrieval metrics, demonstrating the effectiveness of leveraging continuous color inputs for fashion image retrieval.
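To make the input concrete, a palette query as described above can be sketched as an arbitrary-length list of continuous RGB triples, as might come from a color picker. This is a minimal illustrative sketch only; the class name, the [0, 1] normalization, and the validation helper are assumptions, not the paper's actual data format or API.

```python
# Hypothetical sketch of a palette query: an arbitrary number of colors,
# each a continuous RGB triple in [0.0, 1.0] (assumed normalization).
# Continuous values allow subtle nuances, e.g. 0.93 vs. 0.95 red.
from dataclasses import dataclass


@dataclass
class PaletteQuery:
    # Each entry is one picked color: (red, green, blue), channels in [0, 1].
    colors: list[tuple[float, float, float]]

    def is_valid(self) -> bool:
        # Check that every channel of every color lies in the continuous
        # range a color picker would produce.
        return all(0.0 <= c <= 1.0 for rgb in self.colors for c in rgb)


# Example: a two-color pastel palette specified with a color picker.
query = PaletteQuery(colors=[(0.96, 0.80, 0.86), (0.88, 0.92, 0.98)])
print(query.is_valid())  # → True
```

In a retrieval setting, such a query would be paired with the natural-language description and matched against fashion images; how the two inputs are fused is the contribution of the method itself and is not shown here.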
