Xiaohan Wang
11
Papers
77
Total Citations
Papers (11)
Describing Differences in Image Sets with Natural Language
CVPR 2024
51
citations
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
CVPR 2025, arXiv
21
citations
Cross-Sentence Gloss Consistency for Continuous Sign Language Recognition
AAAI 2024
5
citations
Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration
ICCV 2025
0
citations
DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
AAAI 2024, arXiv
0
citations
Apollo: An Exploration of Video Understanding in Large Multimodal Models
CVPR 2025
0
citations
Interpretable3D: An Ad
AAAI 2024
0
citations
A Category Agnostic Model for Visual Rearrangement
CVPR 2024
0
citations
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation
CVPR 2024
0
citations
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature
CVPR 2025
0
citations
An Interactive Navigation Method with Effect-oriented Affordance
CVPR 2024
0
citations