Presentation Information
[D-20-06] An Empirical Study of Pretraining to Enhance Embedding Alignment in Vision-and-Language Models
○Satoshi Kawamura1, Kohei Yamamoto1, Mariko Tomariguchi1, Hideaki Tamai1 (1. OKI)
Keywords:
Vision-and-Language Models, multimodal, alignment