Zilong Huang
16 Papers
80 Total Citations
Papers (16)
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
CVPR 2025
38
citations
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
ICCV 2025
22
citations
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer
ICCV 2025
20
citations
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
CVPR 2024
0
citations
Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing
CVPR 2018
0
citations
Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis
CVPR 2020
0
citations
Human De-Occlusion: Invisible Perception and Recovery for Humans
CVPR 2021
0
citations
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
CVPR 2022
0
citations
Executing Your Commands via Motion Diffusion in Latent Space
CVPR 2023
0
citations
Object-Level Proposals
ICCV 2017
0
citations
CCNet: Criss-Cross Attention for Semantic Segmentation
ICCV 2019
0
citations
SPGNet: Semantic Prediction Guidance for Scene Parsing
ICCV 2019
0
citations
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos
CVPR 2025
0
citations
Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration
CVPR 2025
0
citations
QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing
ICCV 2025
0
citations
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations
NeurIPS 2022
0
citations