Zilong Huang
8 Papers
90 Total Citations
Papers (8)
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
CVPR 2025
38
citations
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
ICCV 2025
22
citations
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer
ICCV 2025
20
citations
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration
CVPR 2025
8
citations
BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception
NeurIPS 2025 (arXiv)
2
citations
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
CVPR 2024
0
citations
QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing
ICCV 2025
0
citations
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
CVPR 2025
0
citations