Presentation Information
[1Yin-A-14]Multimodal Fashion Retrieval Based on Palette Queries
〇Kanon Amemiya1, Daichi Yashima1, Kei Katsumata1, Komei Sugiura1 (1. Keio University)
Keywords:
Multimodal Retrieval, Multimodal Foundation Model, Nail Design
We focus on the task of retrieving fashion images based on a natural-language description and a palette query.
A palette query is an input in which a user specifies an arbitrary number of colors using a color picker, enabling continuous-valued expression of subtle color nuances.
Although such nuances are crucial in the fashion domain, incorporating continuous color inputs into retrieval has not been sufficiently explored.
To address this gap, we propose a multimodal retrieval method that retrieves fashion images consistent with both the description and the palette query.
Our method explicitly models the relationship between the user intent expressed in the description and the color preferences conveyed by the palette query.
Experiments show that our approach consistently improves standard information-retrieval metrics, demonstrating the effectiveness of leveraging continuous color inputs for fashion image retrieval.
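To make the input concrete, a palette query as described above can be sketched as an arbitrary-length list of continuous RGB triples, as might come from a color picker. This is a minimal illustrative sketch only; the class name, the [0, 1] normalization, and the validation helper are assumptions, not the paper's actual data format or API.

```python
# Hypothetical sketch of a palette query: an arbitrary number of colors,
# each a continuous RGB triple in [0.0, 1.0] (assumed normalization).
# Continuous values allow subtle nuances, e.g. 0.93 vs. 0.95 red.
from dataclasses import dataclass


@dataclass
class PaletteQuery:
    # Each entry is one picked color: (red, green, blue), channels in [0, 1].
    colors: list[tuple[float, float, float]]

    def is_valid(self) -> bool:
        # Check that every channel of every color lies in the continuous
        # range a color picker would produce.
        return all(0.0 <= c <= 1.0 for rgb in self.colors for c in rgb)


# Example: a two-color pastel palette specified with a color picker.
query = PaletteQuery(colors=[(0.96, 0.80, 0.86), (0.88, 0.92, 0.98)])
print(query.is_valid())  # → True
```

In a retrieval setting, such a query would be paired with the natural-language description and matched against fashion images; how the two inputs are fused is the contribution of the method itself and is not shown here.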
