Sayan Nag
6 Papers
63 Total Citations
Papers (6)
Jack of All Tasks, Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model
CVPR 2024
50 citations
AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs
ICCV 2025
6 citations
MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks
NeurIPS 2025
5 citations
EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception
ICCV 2025
2 citations
AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
ICCV 2025
0 citations
MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models
CVPR 2024
0 citations