ECCV 2024 "multimodal datasets" Papers
2 papers found
Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment
Brian Gordon, Yonatan Bitton, Yonatan Shafir et al.
ECCV 2024, poster, arXiv:2312.03766
17 citations
When Pedestrian Detection Meets Multi-Modal Learning: Generalist Model and Benchmark Dataset
Yi Zhang, Wang Zeng, Sheng Jin et al.
ECCV 2024, poster, arXiv:2407.10125
19 citations