ICLR 2025 "video captioning" Papers
3 papers found
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang, Jiayan Teng, Wendi Zheng et al.
ICLR 2025, oral, arXiv:2408.06072
1355 citations
Modeling dynamic social vision highlights gaps between deep learning and humans
Kathy Garcia, Emalie McMahon, Colin Conwell et al.
ICLR 2025, poster
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Mingfei Han, Linjie Yang, Xiaojun Chang et al.
ICLR 2025, poster, arXiv:2312.10300
46 citations