Anurag Arnab
12
Papers
318
Total Citations
Papers (12)
On Scaling Up a Multilingual Vision and Language Model
CVPR 2024
254
citations
VicTR: Video-conditioned Text Representations for Activity Recognition
CVPR 2024
36
citations
Flexible Frame Selection for Efficient Video Reasoning
CVPR 2025
10
citations
Temporal Chain of Thought: Long-Video Understanding by Thinking in Frames
NeurIPS 2025 (arXiv)
8
citations
Dense Video Object Captioning from Disjoint Supervision
ICLR 2025 (arXiv)
7
citations
From Image to Video: An Empirical Study of Diffusion Representations
ICCV 2025
3
citations
End-to-End Spatio-Temporal Action Localisation with Video Transformers
CVPR 2024
0
citations
Pixel-Aligned Language Model
CVPR 2024
0
citations
Time-, Memory- and Parameter-Efficient Visual Adaptation
CVPR 2024
0
citations
Principles of Visual Tokens for Efficient Video Understanding
ICCV 2025
0
citations
CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation
CVPR 2024
0
citations
Streaming Dense Video Captioning
CVPR 2024
0
citations