Fahad Shahbaz Khan

17
Papers
328
Total Citations

Papers (17)

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

CVPR 2024
90
citations

Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery

CVPR 2024
78
citations

Self-Distilled Masked Auto-Encoders are Efficient Video Anomaly Detectors

CVPR 2024
55
citations

Progressive Semantic-Guided Vision Transformer for Zero-Shot Learning

CVPR 2024
34
citations

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

CVPR 2025
30
citations

Composed Video Retrieval via Enriched Context and Discriminative Embeddings

CVPR 2024
20
citations

Semi-supervised Open-World Object Detection

AAAI 2024
15
citations

GroupMamba: Efficient Group-Based Visual State Space Model

CVPR 2025
6
citations

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

CVPR 2024
0
citations

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

CVPR 2024
0
citations

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

CVPR 2024
0
citations

Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning

CVPR 2024
0
citations

GLaMM: Pixel Grounding Large Multimodal Model

CVPR 2024
0
citations

All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

CVPR 2025
0
citations

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

CVPR 2025
0
citations

EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues

CVPR 2025
0
citations

S3A: Towards Realistic Zero-Shot Classification via Self Structural Semantic Alignment

AAAI 2024
0
citations