Weidi Xie
18
Papers
193
Total Citations
Papers (18)
Grounded Question-Answering in Long Egocentric Videos
CVPR 2024
46
citations
AutoAD III: The Prequel – Back to the Pixels
CVPR 2024
33
citations
Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos
AAAI 2025
25
citations
Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation
CVPR 2025
22
citations
Track-On: Transformer-based Online Point Tracking with Memory
ICLR 2025
16
citations
Towards Universal Soccer Video Understanding
CVPR 2025
14
citations
Multi-Sentence Grounding for Long-term Instructional Video
ECCV 2024
12
citations
Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning
ICLR 2025
11
citations
Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
ECCV 2024
8
citations
Learning Streaming Video Representation via Multitask Training
ICCV 2025
3
citations
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
ICCV 2025
3
citations
LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant
CVPR 2025
0
citations
Retrieval-Augmented Egocentric Video Captioning
CVPR 2024
0
citations
Amodal Ground Truth and Completion in the Wild
CVPR 2024
0
citations
Object-centric Video Question Answering with Visual Grounding and Referring
ICCV 2025
0
citations
Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models
CVPR 2024
0
citations
InstaGen: Enhancing Object Detection by Training on Synthetic Dataset
CVPR 2024
0
citations
MRGen: Segmentation Data Engine For Underrepresented MRI Modalities
ICCV 2025
0
citations