Shangzhe Di
5 Papers
105 Total Citations
Papers (5)
Grounded Question-Answering in Long Egocentric Videos
CVPR 2024
46 citations
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
AAAI 2025
25 citations
Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation
CVPR 2025
22 citations
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
NeurIPS 2025 (arXiv)
9 citations
Learning Streaming Video Representation via Multitask Training
ICCV 2025
3 citations