Song Han
13
Papers
952
Total Citations
Papers (13)
VILA: On Pre-training for Visual Language Models
CVPR 2024
685
citations
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
CVPR 2025
203
citations
WorldModelBench: Judging Video Generation Models As World Models
NeurIPS 2025
31
citations
Condition-Aware Neural Network for Controlled Image Generation
CVPR 2024
17
citations
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
NeurIPS 2025
11
citations
DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer
ICCV 2025
4
citations
SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference
ICCV 2025
1
citations
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
ICCV 2025
0
citations
DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space
ICCV 2025
0
citations
NVILA: Efficient Frontier Visual Language Models
CVPR 2025
0
citations
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
CVPR 2024
0
citations
Scaling Vision Pre-Training to 4K Resolution
CVPR 2025
0
citations
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
ICML 2024
0
citations