Most Cited ECCV Spotlight "pathology severity control" Papers

2,387 papers found • Page 1 of 12

Filters:Most Cited ECCV spotlight pathology severity control Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

Shilong Liu, Zhaoyang Zeng, Tianhe Ren et al.

ECCV 2024posterarXiv:2303.05499

3368

citations

YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao

ECCV 2024posterarXiv:2402.13616

2952

citations

Adversarial Diffusion Distillation

Axel Sauer, Dominik Lorenz, Andreas Blattmann et al.

ECCV 2024posterarXiv:2311.17042

617

citations

LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation

Jiaxiang Tang, Zhaoxi Chen, Xiaokang Chen et al.

ECCV 2024posterarXiv:2402.05054

616

citations

Grounding Image Matching in 3D with MASt3R

Vincent Leroy, Yohann Cabon, Jerome Revaud

ECCV 2024posterarXiv:2406.09756

512

citations

MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Renrui Zhang, Dongzhi Jiang, Yichi Zhang et al.

ECCV 2024posterarXiv:2403.14624

473

citations

CoTracker: It is Better to Track Together

Nikita Karaev, Ignacio Rocco, Ben Graham et al.

ECCV 2024posterarXiv:2307.07635

450

citations

SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers

Nanye Ma, Mark Goldstein, Michael Albergo et al.

ECCV 2024posterarXiv:2401.08740

428

citations

MobileNetV4: Universal Models for the Mobile Ecosystem

Danfeng Qin, Chas Leichner, Manolis Delakis et al.

ECCV 2024posterarXiv:2404.10518

407

citations

#10

VideoMamba: State Space Model for Efficient Video Understanding

Kunchang Li, Xinhao Li, Yi Wang et al.

ECCV 2024posterarXiv:2403.06977

401

citations

#11

MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images

Yuedong Chen, Haofei Xu, Chuanxia Zheng et al.

ECCV 2024posterarXiv:2403.14627

356

citations

#12

Evaluating Text-to-Visual Generation with Image-to-Text Generation

Zhiqiu Lin, Deepak Pathak, Baiqi Li et al.

ECCV 2024posterarXiv:2404.01291

347

citations

#13

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Liang Chen, Haozhe Zhao, Tianyu Liu et al.

ECCV 2024posterarXiv:2403.06764

343

citations

#14

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

Vikram Voleti, Chun-Han Yao, Mark Boss et al.

ECCV 2024posterarXiv:2403.12008

318

citations

#15

BLINK: Multimodal Large Language Models Can See but Not Perceive

Xingyu Fu, Yushi Hu, Bangzheng Li et al.

ECCV 2024posterarXiv:2404.12390

307

citations

#16

FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting

Zehao Zhu, Zhiwen Fan, Yifan Jiang et al.

ECCV 2024posterarXiv:2312.00451

293

citations

#17

PointLLM: Empowering Large Language Models to Understand Point Clouds

Runsen Xu, Xiaolong Wang, Tai Wang et al.

ECCV 2024posterarXiv:2308.16911

289

citations

#18

DiffBIR: Toward Blind Image Restoration with Generative Diffusion Prior

Xinqi Lin, Jingwen He, Ziyan Chen et al.

ECCV 2024posterarXiv:2308.15070

279

citations

#19

Photorealistic Video Generation with Diffusion Models

Agrim Gupta, Lijun Yu, Kihyuk Sohn et al.

ECCV 2024posterarXiv:2312.06662

270

citations

#20

GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

Yinghao Xu, Zifan Shi, Wang Yifan et al.

ECCV 2024posterarXiv:2403.14621

259

citations

#21

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

Kai Zhang, Sai Bi, Hao Tan et al.

ECCV 2024posterarXiv:2404.19702

246

citations

#22

MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier et al.

ECCV 2024posterarXiv:2403.09611

246

citations

#23

Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization

Tao Yang, Rongyuan Wu, Peiran Ren et al.

ECCV 2024posterarXiv:2308.14469

242

citations

#24

DriveDreamer: Towards Real-world-driven World Models for Autonomous Driving

Xiaofeng Wang, Zheng Zhu, Guan Huang et al.

ECCV 2024posterarXiv:2309.09777

234

citations

#25

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Shenhao Zhu, Junming Chen, Zuozhuo Dai et al.

ECCV 2024posterarXiv:2403.14781

233

citations

#26

Segment and Recognize Anything at Any Granularity

Feng Li, Hao Zhang, Peize Sun et al.

ECCV 2024posterarXiv:2307.04767

226

citations

#27

PixArt-Sigma: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Junsong Chen, Chongjian GE, Enze Xie et al.

GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image

Xiao Fu, Wei Yin, Mu Hu et al.

ECCV 2024posterarXiv:2403.12013

223

citations

#29

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Linrui Tian, Qi Wang, Bang Zhang et al.

ECCV 2024posterarXiv:2402.17485

218

citations

#30

InternVideo2: Scaling Foundation Models for Multimodal Video Understanding

Yi Wang, Kunchang Li, Xinhao Li et al.

ECCV 2024posterarXiv:2403.15377

214

citations

#31

CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model

Zhengyi Wang, Yikai Wang, Yifei Chen et al.

ECCV 2024posterarXiv:2403.05034

213

citations

#32

Agent Attention: On the Integration of Softmax and Linear Attention

Dongchen Han, Tianzhu Ye, Yizeng Han et al.

ECCV 2024posterarXiv:2312.08874

206

citations

#33

ZigMa: A DiT-style Zigzag Mamba Diffusion Model

Tao Hu, Stefan Andreas Baumann, Ming Gui et al.

ECCV 2024posterarXiv:2403.13802

188

citations

#34

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models

Xin Liu, Yichen Zhu, Jindong Gu et al.

ECCV 2024posterarXiv:2311.17600

183

citations

#35

CityGaussian: Real-time High-quality Large-Scale Scene Rendering with Gaussians

Yang Liu, Chuanchen Luo, Lue Fan et al.

ECCV 2024posterarXiv:2404.01133

180

citations

#36

HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression

Yihang Chen, Qianyi Wu, Weiyao Lin et al.

ECCV 2024posterarXiv:2403.14530

179

citations

#37

Mini-Splatting: Representing Scenes with a Constrained Number of Gaussians

Guangchi Fang, Bing Wang

ECCV 2024posterarXiv:2403.14166

175

citations

#38

SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models

Yuwei Guo, Ceyuan Yang, Anyi Rao et al.

ECCV 2024posterarXiv:2311.16933

171

citations

#39

Sapiens: Foundation for Human Vision Models

Rawal Khirodkar, Timur Bagautdinov, Julieta Martinez et al.

ECCV 2024posterarXiv:2408.12569

170

citations

#40

LLaVA-UHD: an LMM Perceiving any Aspect Ratio and High-Resolution Images

Zonghao Guo, Ruyi Xu, Yuan Yao et al.

ECCV 2024posterarXiv:2403.11703

170

citations

#41

ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs

Viraj Shah, Nataniel Ruiz, Forrester Cole et al.

ECCV 2024posterarXiv:2311.13600

167

citations

#42

OccWorld: Learning a 3D Occupancy World Model for Autonomous Driving

Wenzhao Zheng, Weiliang Chen, Yuanhui Huang et al.

ECCV 2024posterarXiv:2311.16038

167

citations

#43

BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion

Xuan JU, Xian Liu, Xintao Wang et al.

ECCV 2024posterarXiv:2403.06976

163

citations

#44

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

Keen You, Haotian Zhang, Eldon Schoop et al.

ECCV 2024posterarXiv:2404.05719

154

citations

#45

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

Junhao Zhuang, Yanhong Zeng, WENRAN LIU et al.

ECCV 2024posterarXiv:2312.03594

152

citations

#46

Generative End-to-End Autonomous Driving

Wenzhao Zheng, Ruiqi Song, Xianda Guo et al.

ECCV 2024posterarXiv:2402.11502

150

citations

#47

Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting

Yunzhi Yan, Haotong Lin, Chenxu Zhou et al.

ECCV 2024posterarXiv:2401.01339

149

citations

#48

ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback

Ming Li, Taojiannan Yang, Huafeng Kuang et al.

ECCV 2024posterarXiv:2404.07987

148

citations

#49

Physics-Based Interaction with 3D Objects via Video Generation

Tianyuan Zhang, Hong-Xing Yu, Rundi Wu et al.

ECCV 2024posterarXiv:2404.13026

137

citations

#50

Rotary Position Embedding for Vision Transformer

Byeongho Heo, Song Park, Dongyoon Han et al.

ECCV 2024posterarXiv:2403.13298

135

citations

#51

SGS-SLAM: Semantic Gaussian Splatting For Neural Dense SLAM

Mingrui Li, Shuhong Liu, Heng Zhou et al.

ECCV 2024posterarXiv:2402.03246

131

citations

#52

LongVLM: Efficient Long Video Understanding via Large Language Models

Yuetian Weng, Mingfei Han, Haoyu He et al.

ECCV 2024posterarXiv:2404.03384

128

citations

#53

LHRS-Bot: Empowering Remote Sensing with VGI-Enhanced Large Multimodal Language Model

Dilxat Muhtar, Zhenshi Li, Feng Gu et al.

ECCV 2024posterarXiv:2402.02544

127

citations

#54

UniIR: Training and Benchmarking Universal Multimodal Information Retrievers

Cong Wei, Yang Chen, Haonan Chen et al.

ECCV 2024posterarXiv:2311.17136

127

citations

#55

Dolphins: Multimodal Language Model for Driving

Yingzi Ma, Yulong Cao, Jiachen Sun et al.

ECCV 2024posterarXiv:2312.00438

126

citations

#56

ST-LLM: Large Language Models Are Effective Temporal Learners

Ruyang Liu, Chen Li, Haoran Tang et al.

ECCV 2024posterarXiv:2404.00308

125

citations

#57

Paying More Attention to Images: A Training-Free Method for Alleviating Hallucination in LVLMs

Shi Liu, Kecheng Zheng, Wei Chen

ECCV 2024posterarXiv:2407.21771

121

citations

#58

Drag Anything: Motion Control for Anything using Entity Representation

Weijia Wu, Zhuang Li, Yuchao Gu et al.

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

Feng Wang, Jieru Mei, Alan Yuille

ECCV 2024posterarXiv:2312.01597

120

citations

#60

Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models

Rohit Gandikota, Joanna Materzynska, Tingrui Zhou et al.

ECCV 2024posterarXiv:2311.12092

120

citations

#61

DiffiT: Diffusion Vision Transformers for Image Generation

Ali Hatamizadeh, Jiaming Song, Guilin Liu et al.

ECCV 2024posterarXiv:2312.02139

119

citations

#62

MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model

Wenxun Dai, Ling-Hao Chen, Jingbo Wang et al.

ECCV 2024posterarXiv:2404.19759

117

citations

#63

InstructIR: High-Quality Image Restoration Following Human Instructions

Marcos Conde, Gregor Geigle, Radu Timofte

ECCV 2024posterarXiv:2401.16468

114

citations

#64

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models

Hao Zhang, Hongyang Li, Feng Li et al.

ECCV 2024posterarXiv:2312.02949

114

citations

#65

SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow

Yihan Wang, Lahav Lipson, Jia Deng

ECCV 2024posterarXiv:2405.14793

113

citations

#66

Implicit Style-Content Separation using B-LoRA

Yarden Frenkel, Yael Vinker, Ariel Shamir et al.

ECCV 2024posterarXiv:2403.14572

113

citations

#67

DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting

Angelos Kratimenos, Jiahui Lei, Kostas Daniilidis

ECCV 2024posterarXiv:2312.00112

113

citations

#68

ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

Zekun Qi, Runpei Dong, Shaochen Zhang et al.

ECCV 2024posterarXiv:2402.17766

113

citations

#69

Reason2Drive: Towards Interpretable and Chain-based Reasoning for Autonomous Driving

Ming Nie, Renyuan Peng, Chunwei Wang et al.

ECCV 2024posterarXiv:2312.03661

112

citations

#70

IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection

Mingjin Zhang, Yuchun Wang, Jie Guo et al.

ECCV 2024posterarXiv:2407.07520

110

citations

#71

Gaussian in the wild: 3D Gaussian Splatting for Unconstrained Image Collections

Dongbin Zhang, Chuming Wang, Weitao Wang et al.

ECCV 2024posterarXiv:2403.15704

109

citations

#72

Motion Mamba: Efficient and Long Sequence Motion Generation

Zeyu Zhang, Akide Liu, Ian Reid et al.

ECCV 2024posterarXiv:2403.07487

108

citations

#73

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

Chuofan Ma, Yi Jiang, Jiannan Wu et al.

ECCV 2024posterarXiv:2404.13013

107

citations

#74

CompGS: Smaller and Faster Gaussian Splatting with Vector Quantization

K L Navaneet, Kossar Pourahmadi, Soroush Abbasi Koohpayegani et al.

ECCV 2024posterarXiv:2311.18159

106

citations

#75

ManiGaussian: Dynamic Gaussian Splatting for Multi-task Robotic Manipulation

Guanxing Lu, Shiyi Zhang, Ziwei Wang et al.

ECCV 2024posterarXiv:2403.08321

106

citations

#76

ReNoise: Real Image Inversion Through Iterative Noising

Daniel Garibi, Or Patashnik, Andrey Voynov et al.

ECCV 2024posterarXiv:2403.14602

105

citations

#77

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Shaowei Liu, Zhongzheng Ren, Saurabh Gupta et al.

ECCV 2024posterarXiv:2409.18964

104

citations

#78

TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering

Jingye Chen, Yupan Huang, Tengchao Lv et al.

ECCV 2024posterarXiv:2311.16465

104

citations

#79

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs

Haoqin Tu, Chenhang Cui, Zijun Wang et al.

ECCV 2024posterarXiv:2311.16101

103

citations

#80

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

Shufan Li, Aditya Grover, Harkanwar Singh

ECCV 2024posterarXiv:2402.05892

103

citations

#81

latentSplat: Autoencoding Variational Gaussians for Fast Generalizable 3D Reconstruction

Christopher Wewer, Kevin Raj, Eddy Ilg et al.

ECCV 2024posterarXiv:2403.16292

102

citations

#82

MVDiffHD: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction

Shitao Tang, Jiacheng Chen, Dilin Wang et al.

DiffusionDepth: Diffusion Denoising Approach for Monocular Depth Estimation

Yiqun Duan, Xianda Guo, Zheng Zhu

ECCV 2024posterarXiv:2303.05021

citations

#84

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks

MohammadReza Davari, Eugene Belilovsky

ECCV 2024posterarXiv:2312.06795

citations

#85

LightenDiffusion: Unsupervised Low-Light Image Enhancement with Latent-Retinex Diffusion Models

Hai Jiang, Ao Luo, Xiaohong Liu et al.

ECCV 2024posterarXiv:2407.08939

citations

#86

DOCCI: Descriptions of Connected and Contrasting Images

Yasumasa Onoe, Sunayana Rane, Zachary E Berger et al.

ECCV 2024posterarXiv:2404.19753

citations

#87

Pixel-GS Density Control with Pixel-aware Gradient for 3D Gaussian Splatting

Zheng Zhang, WENBO HU, Yixing Lao et al.

ECCV 2024posterarXiv:2403.15530

citations

#88

Revising Densification in Gaussian Splatting

Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder

ECCV 2024posterarXiv:2404.06109

citations

#89

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction

Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang et al.

ECCV 2024posterarXiv:2405.17429

citations

#90

Images are Achilles' Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimodal Large Language Models

Yifan Li, hangyu guo, Kun Zhou et al.

ECCV 2024posterarXiv:2403.09792

citations

#91

VISA: Reasoning Video Object Segmentation via Large Language Model

Cilin Yan, haochen wang, Shilin Yan et al.

ECCV 2024posterarXiv:2407.11325

citations

#92

Towards Open-ended Visual Quality Comparison

Haoning Wu, Hanwei Zhu, Zicheng Zhang et al.

ECCV 2024posterarXiv:2402.16641

citations

#93

STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians

Yifei Zeng, Yanqin Jiang, Siyu Zhu et al.

ECCV 2024posterarXiv:2403.14939

citations

#94

Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance

Liting Lin, Heng Fan, Zhipeng Zhang et al.

ECCV 2024posterarXiv:2403.05231

citations

#95

Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models

Zhiyuan You, Zheyuan Li, Jinjin Gu et al.

ECCV 2024posterarXiv:2312.08962

citations

#96

CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization

Jiawei Zhang, Jiahe Li, Xiaohan Yu et al.

ECCV 2024posterarXiv:2405.12110

citations

#97

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models

Haoran Wei, Lingyu Kong, Jinyue Chen et al.

ECCV 2024posterarXiv:2312.06109

citations

#98

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

Weiyun Wang Weiyun, yiming ren, Haowen Luo et al.

ECCV 2024posterarXiv:2402.19474

citations

#99

AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion

yitong jiang, Zhaoyang Zhang, Tianfan Xue et al.

ECCV 2024posterarXiv:2310.10123

citations

#100

T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Qing Jiang, Feng Li, Zhaoyang Zeng et al.

ECCV 2024posterarXiv:2403.14610

citations

#101

OpenIns3D: Snap and Lookup for 3D Open-vocabulary Instance Segmentation

Zhening Huang, Xiaoyang Wu, Xi Chen et al.

ECCV 2024posterarXiv:2309.00616

citations

#102

Robo-ABC: Affordance Generalization Beyond Categories via Semantic Correspondence for Robot Manipulation

Yuanchen Ju, Kaizhe Hu, Guowei Zhang et al.

ECCV 2024posterarXiv:2401.07487

citations

#103

PSALM: Pixelwise Segmentation with Large Multi-modal Model

Zheng Zhang, YeYao Ma, Enming Zhang et al.

ECCV 2024posterarXiv:2403.14598

citations

#104

Octopus: Embodied Vision-Language Programmer from Environmental Feedback

Jingkang Yang, Yuhao Dong, Shuai Liu et al.

ECCV 2024posterarXiv:2310.08588

citations

#105

Arc2Face: A Foundation Model for ID-Consistent Human Faces

Foivos Paraperas Papantoniou, Alexandros Lattas, Stylianos Moschoglou et al.

ECCV 2024posterarXiv:2403.11641

citations

#106

CG-SLAM: Efficient Dense RGB-D SLAM in a Consistent Uncertainty-aware 3D Gaussian Field

Jiarui Hu, Xianhao Chen, Boyin Feng et al.

ECCV 2024posterarXiv:2403.16095

citations

#107

NavGPT-2: Unleashing Navigational Reasoning Capability for Large Vision-Language Models

Gengze Zhou, Yicong Hong, Zun Wang et al.

ECCV 2024posterarXiv:2407.12366

citations

#108

Deblurring 3D Gaussian Splatting

Byeonghyeon Lee, Howoong Lee, Xiangyu Sun et al.

ECCV 2024posterarXiv:2401.00834

citations

#109

TLControl: Trajectory and Language Control for Human Motion Synthesis

WEILIN WAN, Zhiyang Dou, Taku Komura et al.

ECCV 2024posterarXiv:2311.17135

citations

#110

EMDM: Efficient Motion Diffusion Model for Fast, High-Quality Human Motion Generation

Wenyang Zhou, Zhiyang Dou, Zeyu Cao et al.

ECCV 2024poster

citations

#111

Distilling Diffusion Models into Conditional GANs

Minguk Kang, Richard Zhang, Connelly Barnes et al.

ECCV 2024posterarXiv:2405.05967

citations

#112

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation

Yushi Lan, Fangzhou Hong, Shuai Yang et al.

ECCV 2024posterarXiv:2403.12019

citations

#113

A Unified Anomaly Synthesis Strategy with Gradient Ascent for Industrial Anomaly Detection and Localization

Qiyu Chen, Huiyuan Luo, Chengkan Lv et al.

ECCV 2024posterarXiv:2407.09359

citations

#114

Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization

Renjie Pi, Tianyang Han, Wei Xiong et al.

ECCV 2024posterarXiv:2403.08730

citations

#115

BAD-Gaussians: Bundle Adjusted Deblur Gaussian Splatting

Lingzhe Zhao, Peng Wang, Peidong Liu

ECCV 2024posterarXiv:2403.11831

citations

#116

V2X-Real: a Largs-Scale Dataset for Vehicle-to-Everything Cooperative Perception

Hao Xiang, Xin Xia, Zhaoliang Zheng et al.

ECCV 2024posterarXiv:2403.16034

citations

#117

Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting

Jeongmin Bae, Seoha Kim, Youngsik Yun et al.

ECCV 2024posterarXiv:2404.03613

citations

#118

Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Choi Yisol, Sangkyung Kwak, Kyungmin Lee et al.

ECCV 2024posterarXiv:2403.05139

citations

#119

Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation

Homanga Bharadhwaj, Roozbeh Mottaghi, Abhinav Gupta et al.

ECCV 2024posterarXiv:2405.01527

citations

#120

SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution

mingjun zheng, Long Sun, Jiangxin Dong et al.

ECCV 2024poster

citations

#121

Model Stock: All we need is just a few fine-tuned models

Dong-Hwan Jang, Sangdoo Yun, Dongyoon Han

ECCV 2024posterarXiv:2403.19522

citations

#122

Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation

Zhenliang Ni, Xinghao Chen, Yingjie Zhai et al.

ECCV 2024posterarXiv:2405.06228

citations

#123

When Do We Not Need Larger Vision Models?

Baifeng Shi, Ziyang Wu, Maolin Mao et al.

ECCV 2024posterarXiv:2403.13043

citations

#124

RGBD GS-ICP SLAM

Seongbo Ha, Jiung Yeon, Hyeonwoo Yu

ECCV 2024posterarXiv:2403.12550

citations

#125

Large-scale Reinforcement Learning for Diffusion Models

Yinan Zhang, Eric Tzeng, Yilun Du et al.

ECCV 2024posterarXiv:2401.12244

citations

#126

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

Yanguang Sun, Chunyan Xu, Jian Yang et al.

ECCV 2024posterarXiv:2409.01686

citations

#127

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

XINJIE ZHANG, Xingtong Ge, Tongda Xu et al.

ECCV 2024posterarXiv:2403.08551

citations

#128

ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference

Mengcheng Lan, Chaofeng Chen, Yiping Ke et al.

ECCV 2024posterarXiv:2407.12442

citations

#129

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting

Shijie Zhou, Zhiwen Fan, Dejia Xu et al.

ECCV 2024posterarXiv:2404.06903

citations

#130

Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer

Eric Brachmann, Jamie Wynn, Shuai Chen et al.

ECCV 2024posterarXiv:2404.14351

citations

#131

End-to-End Rate-Distortion Optimized 3D Gaussian Representation

Henan Wang, Hanxin Zhu, Tianyu He et al.

ECCV 2024posterarXiv:2406.01597

citations

#132

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos

Yufu Wang, Ziyun Wang, Lingjie Liu et al.

ECCV 2024posterarXiv:2403.17346

citations

#133

OneRestore: A Universal Restoration Framework for Composite Degradation

Yu Guo, Yuan Gao, Yuxu Lu et al.

ECCV 2024posterarXiv:2407.04621

citations

#134

Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification

Yunlong Zhang, Honglin Li, YUXUAN SUN et al.

ECCV 2024posterarXiv:2311.07125

citations

#135

TC4D: Trajectory-Conditioned Text-to-4D Generation

Sherwin Bahmani, Xian Liu, Wang Yifan et al.

ECCV 2024posterarXiv:2403.17920

citations

#136

Unifying 3D Vision-Language Understanding via Promptable Queries

ziyu zhu, Zhuofan Zhang, Xiaojian Ma et al.

ECCV 2024posterarXiv:2405.11442

citations

#137

Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot

Fabien Baradel, Thomas Lucas, Matthieu Armando et al.

ECCV 2024posterarXiv:2402.14654

citations

#138

GIVT: Generative Infinite-Vocabulary Transformers

Michael Tschannen, Cian Eastwood, Fabian Mentzer

ECCV 2024posterarXiv:2312.02116

citations

#139

Language-Image Pre-training with Long Captions

Kecheng Zheng, Yifei Zhang, Wei Wu et al.

ECCV 2024posterarXiv:2403.17007

citations

#140

GS2Mesh: Surface Reconstruction from Gaussian Splatting via Novel Stereo Views

Yaniv Wolf, Amit Bracha, Ron Kimmel

ECCV 2024posterarXiv:2404.01810

citations

#141

Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery

Sukrut Rao, Sweta Mahajan, Moritz Böhle et al.

ECCV 2024posterarXiv:2407.14499

citations

#142

SkateFormer: Skeletal-Temporal Transformer for Human Action Recognition

Jeonghyeok Do, Munchurl Kim

ECCV 2024posterarXiv:2403.09508

citations

#143

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024posterarXiv:2407.11699

citations

#144

DINO-Tracker: Taming DINO for Self-Supervised Point Tracking in a Single Video

Narek Tumanyan, Assaf Singer, Shai Bagon et al.

ECCV 2024posterarXiv:2403.14548

citations

#145

Large Motion Model for Unified Multi-Modal Motion Generation

Mingyuan Zhang, Daisheng Jin, Chenyang Gu et al.

ECCV 2024posterarXiv:2404.01284

citations

#146

GraspXL: Generating Grasping Motions for Diverse Objects at Scale

Hui Zhang, Sammy Christen, Zicong Fan et al.

ECCV 2024posterarXiv:2403.19649

citations

#147

TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Jiahe Li, Jiawei Zhang, Xiao Bai et al.

ECCV 2024posterarXiv:2404.15264

citations

#148

VeCLIP: Improving CLIP Training via Visual-enriched Captions

Zhengfeng Lai, Haotian Zhang, Bowen Zhang et al.

ECCV 2024posterarXiv:2310.07699

citations

#149

Diffusion Models for Open-Vocabulary Segmentation

Laurynas Karazija, Iro Laina, Andrea Vedaldi et al.

ECCV 2024posterarXiv:2306.09316

citations

#150

SILC: Improving Vision Language Pretraining with Self-Distillation

Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai et al.

ECCV 2024posterarXiv:2310.13355

citations

#151

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao et al.

ECCV 2024posterarXiv:2310.17796

citations

#152

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

Yi-Chia Chen, WeiHua Li, Cheng Sun et al.

ECCV 2024posterarXiv:2409.10542

citations

#153

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

hang yao, Ming LIU, Zhicun Yin et al.

ECCV 2024posterarXiv:2406.07487

citations

#154

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Daniel Winter, Matan Cohen, Shlomi Fruchter et al.

ECCV 2024posterarXiv:2403.18818

citations

#155

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai et al.

ECCV 2024posterarXiv:2404.03507

citations

#156

Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction

Bencheng Liao, Shaoyu Chen, Bo Jiang et al.

ECCV 2024posterarXiv:2303.08815

citations

#157

Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Yuanwen Yue, Anurag Das, Francis Engelmann et al.

ECCV 2024posterarXiv:2407.20229

citations

#158

OmniSat: Self-Supervised Modality Fusion for Earth Observation

Guillaume Astruc, Nicolas Gonthier, Clement Mallet et al.

ECCV 2024posterarXiv:2404.08351

citations

#159

VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

Zhen Qu, Xian Tao, Mukesh Prasad et al.

ECCV 2024posterarXiv:2407.12276

citations

#160

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Kangle Deng, Timothy Omernick, Alexander B Weiss et al.

ECCV 2024posterarXiv:2402.13251

citations

#161

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Xi Chen, Sida Peng, Dongchen Yang et al.

ECCV 2024posterarXiv:2404.11593

citations

#162

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping

Jiacheng Chen, Yuefan Wu, Tan Jiaqi et al.

ECCV 2024posterarXiv:2403.15951

citations

#163

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Yue Han, Junwei Zhu, Keke He et al.

ECCV 2024posterarXiv:2405.12970

citations

#164

A Comparative Study of Image Restoration Networks for General Backbone Network Design

Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.

ECCV 2024posterarXiv:2310.11881

citations

#165

Latent Guard: a Safety Framework for Text-to-image Generation

Runtao Liu, Ashkan Khakzar, Jindong Gu et al.

ECCV 2024posterarXiv:2404.08031

citations

#166

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Zeyu Liu, Weicong Liang, Zhanhao Liang et al.

ECCV 2024posterarXiv:2403.09622

citations

#167

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024posterarXiv:2403.12034

citations

#168

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.

ECCV 2024posterarXiv:2305.16037

citations

#169

LaRa: Efficient Large-Baseline Radiance Fields

Anpei Chen, Haofei Xu, Stefano Esposito et al.

ECCV 2024posterarXiv:2407.04699

citations

#170

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Xiang Zhang, Yulun Zhang, Fisher Yu

ECCV 2024posterarXiv:2407.05878

citations

#171

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

Yanyan Li, Chenyu Lyu, Yan Di et al.

ECCV 2024posterarXiv:2403.11324

citations

#172

ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions

Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik et al.

ECCV 2024posterarXiv:2311.17057

citations

#173

GVGEN: Text-to-3D Generation with Volumetric Representation

Xianglong He, Junyi Chen, Sida Peng et al.

ECCV 2024posterarXiv:2403.12957

citations

#174

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024posterarXiv:2403.12963

citations

#175

HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting

Helisa Dhamo, Yinyu Nie, Arthur Moreau et al.

ECCV 2024posterarXiv:2312.02902

citations

#176

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Qilang Ye, Zitong Yu, Rui Shao et al.

ECCV 2024posterarXiv:2403.04640

citations

#177

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin et al.

ECCV 2024posterarXiv:2403.02325

citations

#178

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Chongyu Fan, Jiancheng Liu, Alfred Hero et al.

ECCV 2024posterarXiv:2403.07362

citations

#179

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

Lanqing Guo, Yingqing He, Haoxin Chen et al.

ECCV 2024posterarXiv:2402.10491

citations

#180

Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction

Zihao Liu, Xiaoyu Zhang, Guangwei Liu et al.

ECCV 2024posterarXiv:2402.17430

citations

#181

TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models

Aditya Aravind Chinchure, Pushkar Shukla, Gaurav Bhatt et al.

ECCV 2024posterarXiv:2312.01261

citations

#182

ReMamber: Referring Image Segmentation with Mamba Twister

Yuhuan Yang, Chaofan Ma, Jiangchao Yao et al.

ECCV 2024posterarXiv:2403.17839

citations

#183

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Shicheng Li, Lei Li, Yi Liu et al.

ECCV 2024posterarXiv:2311.17404

citations

#184

LCM-Lookahead for Encoder-based Text-to-Image Personalization

Rinon Gal, Or Lichter, Elad Richardson et al.

ECCV 2024posterarXiv:2404.03620

citations

#185

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

Zexiang Liu, Yangguang Li, Youtian Lin et al.

ECCV 2024posterarXiv:2312.08754

citations

#186

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang, WEILIN WAN, Yue Yang et al.

ECCV 2024posterarXiv:2403.13900

citations

#187

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan et al.

ECCV 2024posterarXiv:2402.03094

citations

#188

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

Yuru Jia, Lukas Hoyer, Shengyu Huang et al.

ECCV 2024posterarXiv:2312.03048

citations

#189

UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction

Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud et al.

ECCV 2024posterarXiv:2403.15098

citations

#190

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Yiming Zhao, Zhouhui Lian

ECCV 2024posterarXiv:2312.04884

citations

#191

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

Yixuan Ren, Yang Zhou, Jimei Yang et al.

ECCV 2024posterarXiv:2402.14780

citations

#192

Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer

Yu Deng, Duomin Wang, Baoyuan Wang

ECCV 2024posterarXiv:2403.13570

citations

#193

DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM

Yixuan Wu, Yizhou Wang, Shixiang Tang et al.

ECCV 2024posterarXiv:2403.12488

citations

#194

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

Jie Ren, Yaxin Li, Shenglai Zeng et al.

ECCV 2024posterarXiv:2403.11052

citations

#195

When Fast Fourier Transform Meets Transformer for Image Restoration

xingyu jiang, Xiuhui Zhang, Ning Gao et al.

ECCV 2024poster

citations

#196

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

Yang Zheng, Qingqing Zhao, Guandao Yang et al.

ECCV 2024posterarXiv:2404.04421

citations

#197

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Qiuhong Shen, Xingyi Yang, Xinchao Wang

ECCV 2024posterarXiv:2409.08270

citations

#198

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance

Yongwei Chen, Tengfei Wang, Tong Wu et al.

ECCV 2024posterarXiv:2403.12409

citations

#199

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Tong Shao, Zhuotao Tian, Hang Zhao et al.

ECCV 2024posterarXiv:2407.08268

citations

#200

BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

Rizhao Cai, Zirui Song, DAYAN GUAN et al.

ECCV 2024posterarXiv:2312.02896

citations

← Previous

1 2 3...12