"visual-textual alignment" Papers
3 papers found
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
Wenqi Zhang, Hang Zhang, Xin Li et al.
ICCV 2025, highlight, arXiv:2501.00958
5 citations
AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?
Shouwei Ruan, Hanqing Liu, Yao Huang et al.
ICCV 2025, highlight, arXiv:2412.03002
2 citations
Anomize: Better Open Vocabulary Video Anomaly Detection
Fei Li, Wenxuan Liu, Jingjing Chen et al.
CVPR 2025, poster, arXiv:2503.18094
4 citations