Mubarak Shah

19
Papers
161
Total Citations

Papers (19)

Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors

CVPR 2024
55
citations

M-LLM Based Video Frame Selection for Efficient Video Understanding

CVPR 2025
46
citations

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

CVPR 2024
20
citations

X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs

ECCV 2024
9
citations

VidLA: Video-Language Alignment at Scale

CVPR 2024
8
citations

Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models

CVPR 2025
6
citations

FinePseudo: Improving Pseudo-Labelling through Temporal-Alignablity for Semi-Supervised Fine-Grained Action Recognition

ECCV 2024
5
citations

Open Vocabulary Multi-Label Video Classification

ECCV 2024
5
citations

ALBAR: Adversarial Learning approach to mitigate Biases in Action Recognition

ICLR 2025
3
citations

GT-Loc: Unifying When and Where in Images through a Joint Embedding Space

ICCV 2025
2
citations

Beyond Simple Edits: Composed Video Retrieval with Dense Modifications

ICCV 2025
1
citation

Möbius Transform for Mitigating Perspective Distortions in Representation Learning

ECCV 2024
1
citation

Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning

ICCV 2025
0
citations

CoLLM: A Large Language Model for Composed Image Retrieval

CVPR 2025
0
citations

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

AAAI 2024
0
citations

Test-Time Retrieval-Augmented Adaptation for Vision-Language Models

ICCV 2025
0
citations

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

CVPR 2025
0
citations

Multiview Aerial Visual RECognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

CVPR 2024
0
citations

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

CVPR 2025
0
citations