Fahad Shahbaz Khan
17
Papers
328
Total Citations
Papers (17)
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
CVPR 2024
90
citations
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery
CVPR 2024
78
citations
Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors
CVPR 2024
55
citations
Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning
CVPR 2024
34
citations
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
CVPR 2025
30
citations
Composed Video Retrieval via Enriched Context and Discriminative Embeddings
CVPR 2024
20
citations
Semi-supervised Open-World Object Detection
AAAI 2024
15
citations
GroupMamba: Efficient Group-Based Visual State Space Model
CVPR 2025
6
citations
VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning
CVPR 2024
0
citations
GeoChat: Grounded Large Vision-Language Model for Remote Sensing
CVPR 2024
0
citations
VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding
CVPR 2024
0
citations
Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning
CVPR 2024
0
citations
GLaMM: Pixel Grounding Large Multimodal Model
CVPR 2024
0
citations
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
CVPR 2025
0
citations
One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
CVPR 2025
0
citations
EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues
CVPR 2025
0
citations
S3A: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment
AAAI 2024
0
citations