ECCV
2,387 papers tracked across 1 year
Top Papers in ECCV 2024
Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection
Shilong Liu, Zhaoyang Zeng, Tianhe Ren et al.
YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao
Adversarial Diffusion Distillation
Axel Sauer, Dominik Lorenz, Andreas Blattmann et al.
LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen et al.
Grounding Image Matching in 3D with MASt3R
Vincent Leroy, Yohann Cabon, Jerome Revaud
MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.
CoTracker: It is Better to Track Together
Nikita Karaev, Ignacio Rocco, Ben Graham et al.
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Nanye Ma, Mark Goldstein, Michael Albergo et al.
MobileNetV4: Universal Models for the Mobile Ecosystem
Danfeng Qin, Chas Leichner, Manolis Delakis et al.
VideoMamba: State Space Model for Efficient Video Understanding
Kunchang Li, Xinhao Li, Yi Wang et al.
MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images
Yuedong Chen, Haofei Xu, Chuanxia Zheng et al.
Evaluating Text-to-Visual Generation with Image-to-Text Generation
Zhiqiu Lin, Deepak Pathak, Baiqi Li et al.
An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Liang Chen, Haozhe Zhao, Tianyu Liu et al.
SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion
Vikram Voleti, Chun-Han Yao, Mark Boss et al.
BLINK: Multimodal Large Language Models Can See but Not Perceive
Xingyu Fu, Yushi Hu, Bangzheng Li et al.
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting
Zehao Zhu, Zhiwen Fan, Yifan Jiang et al.
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang et al.
DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior
Xinqi Lin, Jingwen He, Ziyan Chen et al.
Photorealistic Video Generation with Diffusion Models
Agrim Gupta, Lijun Yu, Kihyuk Sohn et al.
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
Yinghao Xu, Zifan Shi, Wang Yifan et al.