Most Cited ECCV "top-k selection" Papers

2,387 papers found • Page 2 of 12

#201

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Tong Shao, Zhuotao Tian, Hang Zhao et al.

ECCV 2024 • poster • arXiv:2407.08268

#202

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

William Ljungbergh, Adam Tonderski, Joakim Johnander et al.

ECCV 2024 • poster • arXiv:2404.07762

#203

DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model

Li Xiaofan, Zhang Yifu, Xiaoqing Ye

ECCV 2024 • poster

#204

RangeLDM: Fast Realistic LiDAR Point Cloud Generation

Qianjiang Hu, Zhimin Zhang, Wei Hu

ECCV 2024 • poster • arXiv:2403.10094

#205

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang et al.

ECCV 2024 • poster

#206

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model

Yulin Luo, Ruichuan An, Bocheng Zou et al.

ECCV 2024 • poster • arXiv:2405.02363

#207

ParCo: Part-Coordinating Text-to-Motion Synthesis

Qiran Zou, Shangyuan Yuan, Shian Du et al.

ECCV 2024 • poster • arXiv:2403.18512

#208

Diffusion Reward: Learning Rewards via Conditional Video Diffusion

Tao Huang, Guangqi Jiang, Yanjie Ze et al.

ECCV 2024 • poster • arXiv:2312.14134

#209

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024 • poster • arXiv:2403.10893

#210

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024 • poster • arXiv:2403.19435

#211

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024 • poster • arXiv:2407.14143

#212

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Letian Huang, Jiayang Bai, Jie Guo et al.

ECCV 2024 • poster • arXiv:2402.00752

#213

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024 • poster • arXiv:2311.12897

#214

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024 • poster • arXiv:2401.06825

#215

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024 • poster • arXiv:2401.09112

#216

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024 • poster • arXiv:2403.19588

#217

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024 • poster • arXiv:2312.07381

#218

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024 • poster • arXiv:2403.19589

#219

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024 • poster • arXiv:2403.18118

#220

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai et al.

ECCV 2024 • poster • arXiv:2403.10050

#221

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ECCV 2024 • poster • arXiv:2311.09999

#222

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

Chao Xu, Ang Li, Linghao Chen et al.

ECCV 2024 • poster • arXiv:2408.10195

#223

GalLop: Learning global and local prompts for vision-language models

Marc Lafon, Elias Ramzi, Clément Rambour et al.

ECCV 2024 • poster • arXiv:2407.01400

#224

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mengting Chen, Xi Chen, Zhonghua Zhai et al.

ECCV 2024 • poster • arXiv:2403.12965

#225

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024 • poster • arXiv:2403.13129

#226

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

Matteo Bortolon, Theodoros Tsesmelis, Stuart James et al.

ECCV 2024 • poster • arXiv:2407.15484

#227

SMooDi: Stylized Motion Diffusion Model

Lei Zhong, Yiming Xie, Varun Jampani et al.

ECCV 2024 • poster • arXiv:2407.12783

#228

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024 • poster • arXiv:2312.00937

#229

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024 • poster • arXiv:2407.13761

#230

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Zhao Tianchen, Xuefei Ning, Tongcheng Fang et al.

ECCV 2024 • poster • arXiv:2405.17873

#231

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024 • poster • arXiv:2409.09756

#232

Pyramid Diffusion for Fine 3D Large Scene Generation

Yuheng Liu, Xinke Li, Xueting Li et al.

ECCV 2024 • poster • arXiv:2311.12085

#233

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024 • poster • arXiv:2403.18274

#234

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024 • poster • arXiv:2311.13627

#235

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

ECCV 2024 • poster • arXiv:2407.10862

#236

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Tan, Muhammad Ferjad Naeem et al.

ECCV 2024 • poster • arXiv:2311.16241

#237

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Kirolos Ataallah, Xiaoqian Shen, Eslam Mohamed Abdelrahman et al.

ECCV 2024 • poster • arXiv:2407.12679

#238

DragVideo: Interactive Drag-style Video Editing

Yufan Deng, Ruida Wang, Yuhao Zhang et al.

ECCV 2024 • poster • arXiv:2312.02216

#239

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

Junjie Guo, Chenqiang Gao, Fangcen Liu et al.

ECCV 2024 • poster • arXiv:2403.00326

#240

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024 • poster • arXiv:2407.13519

#241

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024 • poster • arXiv:2403.05023

#242

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024 • poster • arXiv:2311.16917

#243

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

Ke Fan, Junshu Tang, Weijian Cao et al.

ECCV 2024 • poster • arXiv:2405.15763

#244

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Zhijian Huang, Tao Tang, Shaoxiang Chen et al.

ECCV 2024 • poster • arXiv:2408.13890

#245

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan Yang, Runyu Ding, Ellis L Brown et al.

ECCV 2024 • poster • arXiv:2402.03310

#246

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024 • poster • arXiv:2312.09128

#247

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

Zhen Zhao, Zicheng Wang, Dian Yu et al.

ECCV 2024 • poster • arXiv:2311.17325

#248

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024 • poster • arXiv:2311.16567

#249

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024 • poster • arXiv:2310.06744

#250

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024 • poster • arXiv:2312.05295

#251

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024 • poster

#252

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024 • poster • arXiv:2409.14444

#253

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024 • poster • arXiv:2305.14428

#254

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024 • poster • arXiv:2407.01702

#255

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024 • poster • arXiv:2403.11781

#256

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024 • poster • arXiv:2402.03634

#257

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024 • poster • arXiv:2407.12777

#258

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024 • poster • arXiv:2311.07152

#259

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024 • poster • arXiv:2311.11261

#260

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024 • poster • arXiv:2404.03590

#261

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024 • poster • arXiv:2403.07508

#262

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024 • poster • arXiv:2408.14176

#263

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024 • poster • arXiv:2403.05049

#264

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024 • poster • arXiv:2406.14556

#265

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024 • poster • arXiv:2403.18211

#266

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024 • poster • arXiv:2403.05659

#267

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Xinzhou Wang, Yikai Wang, Junliang Ye et al.

ECCV 2024 • poster • arXiv:2312.03795

#268

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

Chenxin Li, Xinyu Liu, Cheng Wang et al.

ECCV 2024 • poster • arXiv:2407.05540

#269

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

Yuxin Wang, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024 • poster • arXiv:2404.13679

#270

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024 • poster • arXiv:2403.04697

#271

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Young Kyun Jang, Dat B Huynh, Ashish Shah et al.

ECCV 2024 • poster • arXiv:2405.00571

#272

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024 • poster • arXiv:2312.08372

#273

Exact Diffusion Inversion via Bidirectional Integration Approximation

Guoqiang Zhang, J.P. Lewis, W. Bastiaan Kleijn

ECCV 2024 • poster

#274

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Siyu Jiao, Hongguang Zhu, Yunchao Wei et al.

ECCV 2024 • poster • arXiv:2408.00744

#275

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

Ye Junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024 • poster • arXiv:2408.05475

#276

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Ye Liu, Jixuan He, Wanhua Li et al.

ECCV 2024 • poster • arXiv:2404.00801

#277

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.

ECCV 2024 • poster • arXiv:2404.03010

#278

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024 • poster • arXiv:2408.05868

#279

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

Santiago Pascual, Chunghsin Yeh, Ioannis Tsiamas et al.

ECCV 2024 • poster • arXiv:2407.10387

#280

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024 • poster • arXiv:2403.12445

#281

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.

ECCV 2024 • poster • arXiv:2407.20337

#282

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024 • poster • arXiv:2407.02077

#283

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024 • poster • arXiv:2310.04900

#284

Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.

ECCV 2024 • poster • arXiv:2404.19286

#285

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

Xincheng Yao, Ruoqi Li, Zefeng Qian et al.

ECCV 2024 • poster • arXiv:2403.13349

#286

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.

ECCV 2024 • poster • arXiv:2403.13327

#287

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024 • poster • arXiv:2407.09983

#288

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.

ECCV 2024 • poster • arXiv:2311.14552

#289

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jingyi Lu, Xinghui Li, Kai Han

ECCV 2024 • poster • arXiv:2407.18247

#290

MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views

Wangze Xu, Huachen Gao, Shihe Shen et al.

ECCV 2024 • poster • arXiv:2409.14316

#291

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.

ECCV 2024 • poster • arXiv:2403.11755

#292

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

Xinyuan Gao, Songlin Dong, Yuhang He et al.

ECCV 2024 • poster • arXiv:2407.10281

#293

Denoising Vision Transformers

Jiawei Yang, Katie Luo, Jiefeng Li et al.

ECCV 2024 • poster • arXiv:2401.02957

#294

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross et al.

ECCV 2024 • poster • arXiv:2404.08580

#295

PosterLlama: Bridging Design Ability of Language Model to Content-Aware Layout Generation

Jaejung Seol, Seojun Kim, Jaejun Yoo

ECCV 2024 • poster • arXiv:2404.00995

#296

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

Yash Bhalgat, Iro Laina, Joao F Henriques et al.

ECCV 2024 • poster • arXiv:2403.10997

#297

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

Ozan Unal, Christos Sakaridis, Suman Saha et al.

ECCV 2024 • poster • arXiv:2309.04561

#298

Dataset Distillation by Automatic Training Trajectories

Dai Liu, Jindong Gu, Hu Cao et al.

ECCV 2024 • poster • arXiv:2407.14245

#299

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.

ECCV 2024 • poster • arXiv:2311.14109

#300

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024 • poster • arXiv:2407.16698

#301

View Selection for 3D Captioning via Diffusion Ranking

Tiange Luo, Justin Johnson, Honglak Lee

ECCV 2024 • poster • arXiv:2404.07984

#302

Nuvo: Neural UV Mapping for Unruly 3D Representations

Pratul Srinivasan, Stephan J Garbin, Dor Verbin et al.

ECCV 2024 • poster • arXiv:2312.05283

#303

Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting

Ri-Zhao Qiu, Ge Yang, Weijia Zeng et al.

ECCV 2024 • poster

#304

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024 • poster • arXiv:2403.09334

#305

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors

Sungwon Hwang, Min-Jung Kim, Taewoong Kang et al.

ECCV 2024 • poster • arXiv:2407.02945

#306

WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing

Shuokang Huang, Kaihan Li, Di You et al.

ECCV 2024 • poster • arXiv:2402.09430

#307

EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding

Jiazhou Zhou, Xu Zheng, Yuanhuiyi Lyu et al.

ECCV 2024 • poster • arXiv:2308.03135

#308

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Zhecan Wang, Garrett Bingham, Adams Wei Yu et al.

ECCV 2024 • poster • arXiv:2407.15680

#309

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

Zhihang Lin, Mingbao Lin, Meng Zhao et al.

ECCV 2024 • poster • arXiv:2407.10738

#310

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

Yabo Chen, Jiemin Fang, Yuyang Huang et al.

ECCV 2024 • poster • arXiv:2312.04424

#311

Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities

Xu Zheng, Yuanhuiyi Lyu, Lin Wang

ECCV 2024 • poster • arXiv:2407.11351

#312

CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model

Aoran Xiao, Weihao Xuan, Heli Qi et al.

ECCV 2024 • poster • arXiv:2402.03631

#313

Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching

Yichen Li, Wenchao Xu, Haozhao Wang et al.

ECCV 2024 • poster • arXiv:2407.05005

#314

MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos

Yushuo Chen, Zerong Zheng, Zhe Li et al.

ECCV 2024 • poster • arXiv:2407.08414

#315

WHAC: World-grounded Humans and Cameras

Wanqi Yin, Zhongang Cai, Chen Wei et al.

ECCV 2024 • poster • arXiv:2403.12959

#316

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving

Zhili Chen, Maosheng Ye, Shuangjie Xu et al.

ECCV 2024 • poster • arXiv:2311.08100

#317

Attention Prompting on Image for Large Vision-Language Models

Runpeng Yu, Weihao Yu, Xinchao Wang

ECCV 2024 • poster • arXiv:2409.17143

#318

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

Ming Hu, Peng Xia, Lin Wang et al.

ECCV 2024 • poster • arXiv:2406.07471

#319

FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification

Yu Tian, Congcong Wen, Min Shi et al.

ECCV 2024 • poster • arXiv:2407.08813

#320

Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation

Han Li, Shaohui Li, Shuangrui Ding et al.

ECCV 2024 • poster • arXiv:2407.09853

#321

Zero-shot Object Counting with Good Exemplars

Huilin Zhu, Jingling Yuan, Zhengwei Yang et al.

ECCV 2024 • poster • arXiv:2407.04948

#322

Multistain Pretraining for Slide Representation Learning in Pathology

Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.

ECCV 2024 • poster • arXiv:2408.02859

#323

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

Liren He, Zhengkai Jiang, Jinlong Peng et al.

ECCV 2024 • poster • arXiv:2403.11561

#324

GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features

Luc Sträter, Mohammadreza Salehi, Efstratios Gavves et al.

ECCV 2024 • poster • arXiv:2407.12427

#325

Progressive Pretext Task Learning for Human Trajectory Prediction

Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.

ECCV 2024 • poster • arXiv:2407.11588

#326

The Nerfect Match: Exploring NeRF Features for Visual Localization

Qunjie Zhou, Maxim Maximov, Or Litany et al.

ECCV 2024 • poster • arXiv:2403.09577

#327

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

Heyuan Li, Ce Chen, Tianhao Shi et al.

ECCV 2024 • poster • arXiv:2404.05680

#328

Do text-free diffusion models learn discriminative visual representations?

Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.

ECCV 2024 • poster • arXiv:2311.17921

#329

Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries

Wei-Jer Chang, Francesco Pittaluga, Masayoshi Tomizuka et al.

ECCV 2024 • poster • arXiv:2401.00391

#330

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Baijiong Lin, Weisen Jiang, Pengguang Chen et al.

ECCV 2024 • poster • arXiv:2407.02228

#331

Trackastra: Transformer-based cell tracking for live-cell microscopy

Benjamin Gallusser, Martin Weigert

ECCV 2024 • poster • arXiv:2405.15700

#332

UMBRAE: Unified Multimodal Brain Decoding

Weihao Xia, Raoul de Charette, Cengiz Oztireli et al.

ECCV 2024 • poster • arXiv:2404.07202

#333

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Zhiyuan Ma, Yuxiang Wei, Yabin Zhang et al.

ECCV 2024 • poster • arXiv:2407.02040

#334

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Yizhe Xiong, Hui Chen, Tianxiang Hao et al.

ECCV 2024 • poster • arXiv:2403.09192

#335

Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

Chaofeng Chen, Annan Wang, Haoning Wu et al.

ECCV 2024 • poster • arXiv:2311.15657

#336

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Jinke Li, Xiao He, Chonghua Zhou et al.

ECCV 2024 • poster • arXiv:2405.04299

#337

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything

Xiaobao Wei, Jiajun Cao, Yizhu Jin et al.

ECCV 2024 • poster • arXiv:2311.17081

#338

Dolfin: Diffusion Layout Transformers without Autoencoder

Yilin Wang, Zeyuan Chen, Liangjun Zhong et al.

ECCV 2024 • poster • arXiv:2310.16305

#339

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models

Zhongqi Wang, Jie Zhang, Shiguang Shan et al.

ECCV 2024 • poster • arXiv:2407.04215

#340

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.

ECCV 2024 • poster • arXiv:2401.12978

#341

SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting

Richard Shaw, Michal Nazarczuk, Song Jifei et al.

ECCV 2024 • poster • arXiv:2312.13308

#342

Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection

Ting Lei, Shaofeng Yin, Yuxin Peng et al.

ECCV 2024 • poster • arXiv:2408.02484

#343

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Sehwan Choi, Jun Won Choi, Jungho Kim et al.

ECCV 2024 • poster • arXiv:2407.13517

#344

LLMGA: Multimodal Large Language Model based Generation Assistant

Bin Xia, Shiyin Wang, Yingfan Tao et al.

ECCV 2024 • poster • arXiv:2311.16500

#345

Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding

Talfan Evans, Shreya Pathak, Hamza Merzic et al.

ECCV 2024 • poster • arXiv:2312.05328

#346

SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection

Huafeng Chen, Pengxu Wei, Guangqian Guo et al.

ECCV 2024 • poster • arXiv:2408.10760

#347

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

Zuyan Liu, Benlin Liu, Jiahui Wang et al.

ECCV 2024 • poster • arXiv:2407.18121

#348

Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching

Meng Chu, Zhedong Zheng, Wei Ji et al.

ECCV 2024 • poster • arXiv:2311.12751

#349

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

Zhengdi Yu, Shaoli Huang, Yongkang Cheng et al.

ECCV 2024 • poster • arXiv:2310.20436

#350

Cascade Prompt Learning for Visual-Language Model Adaptation

Ge Wu, Xin Zhang, Zheng Li et al.

ECCV 2024 • poster

#351

LISO: Lidar-only Self-Supervised 3D Object Detection

Stefan Baur, Frank Moosmann, Andreas Geiger

ECCV 2024 • poster • arXiv:2403.07071

#352

WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering

Pingyi Chen, Chenglu Zhu, Sunyi Zheng et al.

ECCV 2024 • poster • arXiv:2407.05603

#353

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li et al.

ECCV 2024 • poster • arXiv:2403.12574

#354

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.

ECCV 2024 • poster • arXiv:2407.03575

#355

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

Yuanpeng Tu, Boshen Zhang, Liang Liu et al.

ECCV 2024 • poster • arXiv:2401.03145

#356

Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations

Tomáš Chobola, Yu Liu, Hanyi Zhang et al.

ECCV 2024 • poster • arXiv:2407.12511

#357

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.

ECCV 2024 • poster • arXiv:2408.09702

#358

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ECCV 2024 • poster • arXiv:2312.11897

#359

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.

ECCV 2024 • poster • arXiv:2407.17850

#360

Semantic Residual Prompts for Continual Learning

Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.

ECCV 2024 • poster • arXiv:2403.06870

#361

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

Tim Broedermann, David Brüggemann, Christos Sakaridis et al.

ECCV 2024 • poster • arXiv:2401.12761

#362

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Hu Zhang, Xu Jianhua, Tao Tang et al.

ECCV 2024 • poster • arXiv:2312.08876

#363

Enhancing Vectorized Map Perception with Historical Rasterized Maps

Xiaoyu Zhang, Guangwei Liu, Zihao Liu et al.

ECCV 2024 • poster • arXiv:2409.00620

#364

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.

ECCV 2024 • poster • arXiv:2407.10957

#365

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024 • poster • arXiv:2407.07582

#366

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024 • poster • arXiv:2407.08476

#367

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen et al.

ECCV 2024 • poster • arXiv:2312.03849

#368

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving

Zhenghao Peng, Wenjie Luo, Yiren Lu et al.

ECCV 2024 • poster • arXiv:2409.18343

#369

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ECCV 2024 • poster • arXiv:2409.02543

#370

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

Guohao Sun, Can Qin, Jiaminan Wang et al.

ECCV 2024 • poster • arXiv:2403.11299

#371

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

ECCV 2024 • poster • arXiv:2403.10153

#372

TrojVLM: Backdoor Attack Against Vision Language Models

Weimin Lyu, Lu Pang, Tengfei Ma et al.

ECCV 2024 • poster • arXiv:2409.19232

#373

DataDream: Few-shot Guided Dataset Generation

Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.

ECCV 2024 • poster • arXiv:2407.10910

#374

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

Sixiang Chen, Tian Ye, Kai Zhang et al.

ECCV 2024 • poster • arXiv:2409.15739

#375

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

ECCV 2024 • poster • arXiv:2409.18049

#376

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning

Fucai Ke, Zhixi Cai, Simindokht Jahangard et al.

ECCV 2024 • poster • arXiv:2403.12884

#377

GeoCalib: Learning Single-image Calibration with Geometric Optimization

Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger et al.

ECCV 2024 • poster • arXiv:2409.06704

#378

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Fangqiang Ding, Zhen Luo, Peijun Zhao et al.

ECCV 2024 • poster • arXiv:2306.17010

#379

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao et al.

ECCV 2024 • poster • arXiv:2404.05052

#380

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

Xiang Fan, Anand Bhattad, Ranjay Krishna

ECCV 2024 • poster • arXiv:2403.14617

#381

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024 • poster • arXiv:2403.09176

#382

SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection

Hongcheng Zhang, Liu Liang, Pengxin Zeng et al.

ECCV 2024 • poster • arXiv:2403.07284

#383

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Jianwei Zhao, Xin Li, Fan Yang et al.

ECCV 2024 • poster • arXiv:2407.13133

#384

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain et al.

ECCV 2024 • poster • arXiv:2403.18819

#385

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.

ECCV 2024 • poster • arXiv:2404.01300

#386

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang et al.

ECCV 2024 • poster • arXiv:2404.01700

#387

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.

ECCV 2024 • poster • arXiv:2312.09231

#388

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions

Jie Yang, Xuesong Niu, Nan Jiang et al.

ECCV 2024 • poster • arXiv:2407.12435

#389

RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez et al.

ECCV 2024 • poster • arXiv:2312.12865

#390

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

ECCV 2024 • poster • arXiv:2312.06573

#391

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning

Yan Li, Weiwei Guo, Xue Yang et al.

ECCV 2024 • poster • arXiv:2311.11646

#392

DIM: Dyadic Interaction Modeling for Social Behavior Generation

Minh Tran, Di Chang, Maksim Siniukov et al.

ECCV 2024 • poster

#393

ViLA: Efficient Video-Language Alignment for Video Question Answering

Xijun Wang, Junbang Liang, Chun-Kai Wang et al.

ECCV 2024 • poster • arXiv:2312.08367

#394

Prioritized Semantic Learning for Zero-shot Instance Navigation

Xinyu Sun, Lizhao Liu, Hongyan Zhi et al.

ECCV 2024 • poster • arXiv:2403.11650

#395

Object-Centric Diffusion for Efficient Video Editing

Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.

ECCV 2024 • poster • arXiv:2401.05735

#396

OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations

Yiming Zuo, Jia Deng

ECCV 2024 • poster • arXiv:2406.11711

#397

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ECCV 2024 • poster • arXiv:2311.17944

#398

Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer

Yang Wu, Kaihua Zhang, Jianjun Qian et al.

ECCV 2024 • poster • arXiv:2407.19628

#399

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding

Quan Kong, Yuki Kawana, Rajat Saini et al.

ECCV 2024 • poster • arXiv:2407.15350

#400

Learning to Adapt SAM for Segmenting Cross-domain Point Clouds

Xidong Peng, Runnan Chen, Feng Qiao et al.

ECCV 2024 • poster • arXiv:2310.08820
