Kai Han
27
Papers
101
Total Citations
Papers (27)
PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
ECCV 2024
18
citations
AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation
ICLR 2025
18
citations
Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts
AAAI 2025
13
citations
Mr. DETR: Instructive Multi-Route Training for Detection Transformers
CVPR 2025
12
citations
Data-efficient Large Vision Models through Sequential Autoregression
ICML 2024
12
citations
Hyperbolic Category Discovery
CVPR 2025
7
citations
Adapt without Forgetting: Distill Proximity from Dual Teachers in Vision-Language Models
ECCV 2024
6
citations
SpeCache: Speculative Key-Value Caching for Efficient Generation of LLMs
ICML 2025
4
citations
GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models
ICCV 2025
3
citations
Parallel Sequence Modeling via Generalized Spatial Propagation Network
CVPR 2025
3
citations
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
CVPR 2025
2
citations
VaMP: Variational Multi-Modal Prompt Learning for Vision-Language Models
NeurIPS 2025
1
citation
SEAL: Semantic-Aware Hierarchical Learning for Generalized Category Discovery
NeurIPS 2025
1
citation
LLM Data Selection and Utilization via Dynamic Bi-level Optimization
ICML 2025
1
citation
Rethinking Optimization and Architecture for Tiny Language Models
ICML 2024
0
citations
Detecting Open World Objects via Partial Attribute Assignment
CVPR 2025
0
citations
Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping
ICCV 2025
0
citations
L-Man: A Large Multi-modal Model Unifying Human-centric Tasks
AAAI 2025
0
citations
Deletion-Robust Submodular Maximization with Knapsack Constraints
AAAI 2024
0
citations
SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching
CVPR 2024
0
citations
IBD-SLAM: Learning Image-Based Depth Fusion for Generalizable SLAM
CVPR 2024
0
citations
An Empirical Study of Scaling Law for Scene Text Recognition
CVPR 2024
0
citations
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
CVPR 2024
0
citations
ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks
CVPR 2024
0
citations
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer
ICML 2024
0
citations
Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
ICML 2024
0
citations
ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models
CVPR 2025
0
citations