2025 "spatio-temporal grounding" Papers
2 papers found
Large-scale Pre-training for Grounded Video Caption Generation
Evangelos Kazakos, Cordelia Schmid, Josef Sivic
ICCV 2025 | poster | arXiv:2503.10781
3 citations
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi et al.
NeurIPS 2025 | oral | arXiv:2504.13180
40 citations