FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval
Published in AAAI, 2024
TL;DR: This paper defines and analyzes “visual dominance”, a common phenomenon in composed image retrieval in which retrieval results are dominated by the reference image and overlook the modification text. To mitigate it, we propose the Fashion Enhance-and-Refine Network (FashionERN), which enhances text semantics and filters visual semantics.
Recommended citation: Yanzhe Chen, Huasong Zhong, Xiangteng He, Yuxin Peng, Jiahuan Zhou and Lele Cheng, "FashionERN: Enhance-and-Refine Network for Composed Fashion Image Retrieval", AAAI 2024.