Arsha Nagrani
9 papers, 337 total citations
Papers (9)
On Scaling Up a Multilingual Vision and Language Model (CVPR 2024): 254 citations
VicTR: Video-conditioned Text Representations for Activity Recognition (CVPR 2024): 36 citations
AutoAD III: The Prequel – Back to the Pixels (CVPR 2024): 33 citations
Flexible Frame Selection for Efficient Video Reasoning (CVPR 2025): 10 citations
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation (ICCV 2025): 3 citations
Unbiasing through Textual Descriptions: Mitigating Representation Bias in Video Benchmarks (CVPR 2025): 1 citation
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering (CVPR 2024): 0 citations
Streaming Dense Video Captioning (CVPR 2024): 0 citations
MINERVA: Evaluating Complex Video Reasoning (ICCV 2025): 0 citations