Most Cited ECCV "attention" Papers

2,387 papers found • Page 3 of 12

Filters:Most Cited ECCV attention Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#401

Denoising Vision Transformers

Jiawei Yang, Katie Luo, Jiefeng Li et al.

ECCV 2024arXiv:2401.02957

citations

#402

An Economic Framework for 6-DoF Grasp Detection

Xiao-Ming Wu, Jia-Feng Cai, Jian-Jian Jiang et al.

ECCV 2024arXiv:2407.08366

citations

#403

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.

ECCV 2024arXiv:2403.13327

citations

#404

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.

ECCV 2024arXiv:2311.14552

citations

#405

MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views

Wangze Xu, Huachen Gao, Shihe Shen et al.

ECCV 2024arXiv:2409.14316

citations

#406

Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos

Mi Luo, Zihui Xue, Alex Dimakis et al.

ECCV 2024arXiv:2403.06351

citations

#407

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

Santiago Pascual, Chunghsin YEH, Ioannis Tsiamas et al.

ECCV 2024arXiv:2407.10387

citations

#408

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

Xincheng Yao, Ruoqi Li, Zefeng Qian et al.

ECCV 2024arXiv:2403.13349

citations

#409

View Selection for 3D Captioning via Diffusion Ranking

Tiange Luo, Justin Johnson, Honglak Lee

ECCV 2024arXiv:2404.07984

citations

#410

VisionLLaMA: A Unified LLaMA Backbone for Vision Tasks

Xiangxiang Chu, Jianlin Su, Bo Zhang et al.

ECCV 2024arXiv:2403.00522

citations

#411

Agent3D-Zero: An Agent for Zero-shot 3D Understanding

Sha Zhang, Di Huang, Jiajun Deng et al.

ECCV 2024arXiv:2403.11835

citations

#412

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

Yash Bhalgat, Iro Laina, Joao F Henriques et al.

ECCV 2024arXiv:2403.10997

citations

#413

MapDistill: Boosting Efficient Camera-based HD Map Construction via Camera-LiDAR Fusion Model Distillation

Xiaoshuai Hao, Ruikai Li, Hui Zhang et al.

ECCV 2024arXiv:2407.11682

citations

#414

Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction

Xinhang Liu, Jiaben Chen, Shiu-Hong Kao et al.

ECCV 2024arXiv:2305.15171

citations

#415

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.

ECCV 2024arXiv:2403.11755

citations

#416

UMBRAE: Unified Multimodal Brain Decoding

Weihao Xia, Raoul de Charette, Cengiz Oztireli et al.

ECCV 2024arXiv:2404.07202

citations

#417

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang et al.

ECCV 2024arXiv:2403.13745

citations

#418

PosterLlama: Bridging Design Ability of Langauge Model to Content-Aware Layout Generation

Jaejung Seol, Seojun Kim, Jaejun Yoo

ECCV 2024arXiv:2404.00995

citations

#419

EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding

jiazhou zhou, Xu Zheng, Yuanhuiyi Lyu et al.

ECCV 2024arXiv:2308.03135

citations

#420

CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model

Aoran Xiao, Weihao Xuan, Heli Qi et al.

ECCV 2024arXiv:2402.03631

citations

#421

Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.

ECCV 2024arXiv:2404.19286

citations

#422

DragAPart: Learning a Part-Level Motion Prior for Articulated Objects

Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.

ECCV 2024arXiv:2403.15382

citations

#423

BAM-DETR: Boundary-Aligned Moment Detection Transformer for Temporal Sentence Grounding in Videos

Pilhyeon Lee, Hyeran Byun

ECCV 2024arXiv:2312.00083

citations

#424

WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing

Shuokang Huang, Kaihan Li, Di You et al.

ECCV 2024arXiv:2402.09430

citations

#425

DiffClass: Diffusion-Based Class Incremental Learning

Zichong Meng, Jie Zhang, Changdi Yang et al.

ECCV 2024arXiv:2403.05016

citations

#426

Learning Natural Consistency Representation for Face Forgery Video Detection

Daichi Zhang, Zihao Xiao, Shikun Li et al.

ECCV 2024arXiv:2407.10550

citations

#427

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Zhecan Wang, Garrett Bingham, Adams Wei Yu et al.

ECCV 2024arXiv:2407.15680

citations

#428

Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection

Hu Cao, Zehua Zhang, Yan Xia et al.

ECCV 2024arXiv:2407.12582

citations

#429

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024arXiv:2403.09334

citations

#430

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything

Xiaobao Wei, Jiajun Cao, Yizhu Jin et al.

ECCV 2024arXiv:2311.17081

citations

#431

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024arXiv:2407.16698

citations

#432

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

Ozan Unal, Christos Sakaridis, Suman Saha et al.

ECCV 2024arXiv:2309.04561

citations

#433

Dataset Distillation by Automatic Training Trajectories

Dai Liu, Jindong Gu, Hu Cao et al.

ECCV 2024arXiv:2407.14245

citations

#434

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.

ECCV 2024arXiv:2311.14109

citations

#435

Nuvo: Neural UV Mapping for Unruly 3D Representations

Pratul Srinivasan, Stephan J Garbin, Dor Verbin et al.

ECCV 2024arXiv:2312.05283

citations

#436

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Baijiong Lin, Weisen Jiang, Pengguang Chen et al.

ECCV 2024arXiv:2407.02228

citations

#437

Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching

Yichen Li, Wenchao Xu, Haozhao Wang et al.

ECCV 2024arXiv:2407.05005

citations

#438

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving

Zhili Chen, Maosheng Ye, Shuangjie Xu et al.

ECCV 2024arXiv:2311.08100

citations

#439

WHAC: World-grounded Humans and Cameras

Wanqi Yin, Zhongang Cai, Chen Wei et al.

ECCV 2024arXiv:2403.12959

citations

#440

HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects

Xintao Lv, Liang Xu, Yichao Yan et al.

ECCV 2024arXiv:2407.12371

citations

#441

Human Hair Reconstruction with Strand-Aligned 3D Gaussians

Egor Zakharov, Vanessa Sklyarova, Michael J. Black et al.

ECCV 2024arXiv:2409.14778

citations

#442

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

Liren He, Zhengkai Jiang, Jinlong Peng et al.

ECCV 2024arXiv:2403.11561

citations

#443

The Nerfect Match: Exploring NeRF Features for Visual Localization

Qunjie Zhou, Maxim Maximov, Or Litany et al.

ECCV 2024arXiv:2403.09577

citations

#444

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models

Zhongqi Wang, Jie Zhang, Shiguang Shan et al.

ECCV 2024arXiv:2407.04215

citations

#445

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors

Sungwon Hwang, Min-Jung Kim, Taewoong Kang et al.

ECCV 2024arXiv:2407.02945

citations

#446

Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation

han li, Shaohui Li, Shuangrui Ding et al.

ECCV 2024arXiv:2407.09853

citations

#447

4D Contrastive Superflows are Dense 3D Representation Learners

Xiang Xu, Lingdong Kong, Hui Shuai et al.

ECCV 2024arXiv:2407.06190

citations

#448

Trackastra: Transformer-based cell tracking for live-cell microscopy

Benjamin Gallusser, Weigert Martin

ECCV 2024arXiv:2405.15700

citations

#449

Occupancy as Set of Points

Yiang Shi, Tianheng Cheng, Qian Zhang et al.

ECCV 2024arXiv:2407.04049

citations

#450

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

Yabo Chen, Jiemin Fang, Yuyang Huang et al.

ECCV 2024arXiv:2312.04424

citations

#451

Zero-shot Object Counting with Good Exemplars

Huilin Zhu, Jingling Yuan, Zhengwei Yang et al.

ECCV 2024arXiv:2407.04948

citations

#452

Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities

Xu Zheng, Yuanhuiyi Lyu, LIN WANG

ECCV 2024arXiv:2407.11351

citations

#453

MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos

Yushuo Chen, Zerong Zheng, Zhe Li et al.

ECCV 2024arXiv:2407.08414

citations

#454

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

Zhihang Lin, Mingbao Lin, Meng Zhao et al.

ECCV 2024arXiv:2407.10738

citations

#455

Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting

Ri-Zhao Qiu, Ge Yang, Weijia Zeng et al.

ECCV 2024

citations

#456

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

Heyuan Li, Ce Chen, Tianhao Shi et al.

ECCV 2024arXiv:2404.05680

citations

#457

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.

ECCV 2024arXiv:2407.03575

citations

#458

Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations

Tomáš Chobola, Yu Liu, Hanyi Zhang et al.

ECCV 2024arXiv:2407.12511

citations

#459

Attention Prompting on Image for Large Vision-Language Models

Runpeng Yu, Weihao Yu, Xinchao Wang

ECCV 2024arXiv:2409.17143

citations

#460

Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection

Ting Lei, Shaofeng Yin, Yuxin Peng et al.

ECCV 2024arXiv:2408.02484

citations

#461

GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features

Luc Sträter, Mohammadreza Salehi, Efstratios Gavves et al.

ECCV 2024arXiv:2407.12427

citations

#462

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

Ming Hu, Peng Xia, Lin Wang et al.

ECCV 2024arXiv:2406.07471

citations

#463

Masked Angle-Aware Autoencoder for Remote Sensing Images

Zhihao Li, Biao Hou, Siteng Ma et al.

ECCV 2024arXiv:2408.01946

citations

#464

DreamMotion: Space-Time Self-Similar Score Distillation for Zero-Shot Video Editing

Hyeonho Jeong, Jinho Chang, GEON YEONG PARK et al.

ECCV 2024arXiv:2403.12002

citations

#465

Do text-free diffusion models learn discriminative visual representations?

Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.

ECCV 2024arXiv:2311.17921

citations

#466

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Zhiyuan MA, Yuxiang WEI, Yabin Zhang et al.

ECCV 2024arXiv:2407.02040

citations

#467

MART: MultiscAle Relational Transformer Networks for Multi-agent Trajectory Prediction

Seongju Lee, Junseok Lee, Yeonguk Yu et al.

ECCV 2024arXiv:2407.21635

citations

#468

Tuning-Free Image Customization with Image and Text Guidance

Pengzhi Li, Qiang Nie, Ying Chen et al.

ECCV 2024arXiv:2403.12658

citations

#469

Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries

WEI-JER Chang, Francesco Pittaluga, Masayoshi TOMIZUKA et al.

ECCV 2024arXiv:2401.00391

citations

#470

Few-shot Class Incremental Learning with Attention-Aware Self-Adaptive Prompt

Chenxi Liu, Zhenyi Wang, Tianyi Xiong et al.

ECCV 2024arXiv:2403.09857

citations

#471

FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification

Yu Tian, Congcong Wen, Min Shi et al.

ECCV 2024arXiv:2407.08813

citations

#472

StableDrag: Stable Dragging for Point-based Image Editing

Yutao Cui, Xiaotong Zhao, Guozhen Zhang et al.

ECCV 2024arXiv:2403.04437

citations

#473

SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection

Huafeng Chen, Pengxu Wei, Guangqian Guo et al.

ECCV 2024arXiv:2408.10760

citations

#474

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Jinke Li, Xiao He, Chonghua Zhou et al.

ECCV 2024arXiv:2405.04299

citations

#475

Multistain Pretraining for Slide Representation Learning in Pathology

Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.

ECCV 2024arXiv:2408.02859

citations

#476

Progressive Pretext Task Learning for Human Trajectory Prediction

Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.

ECCV 2024arXiv:2407.11588

citations

#477

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain et al.

ECCV 2024arXiv:2403.18819

citations

#478

Dolfin: Diffusion Layout Transformers without Autoencoder

Yilin Wang, Zeyuan Chen, Liangjun Zhong et al.

ECCV 2024arXiv:2310.16305

citations

#479

Make-Your-3D: Fast and Consistent Subject-Driven 3D Content Generation

Fangfu Liu, Hanyang Wang, Weiliang Chen et al.

ECCV 2024arXiv:2403.09625

citations

#480

SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting

Richard Shaw, Michal Nazarczuk, Song Jifei et al.

ECCV 2024arXiv:2312.13308

citations

#481

Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators

Yifan Pu, Xia Zhuofan, Jiayi Guo et al.

ECCV 2024arXiv:2408.05710

citations

#482

Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching

Meng Chu, Zhedong Zheng, Wei Ji et al.

ECCV 2024arXiv:2311.12751

citations

#483

Seeing the Unseen: A Frequency Prompt Guided Transformer for Image Restoration

shihao zhou, Jinshan Pan, Jinglei Shi et al.

ECCV 2024arXiv:2404.00288

citations

#484

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

Zhengdi Yu, Shaoli Huang, yongkang cheng et al.

ECCV 2024arXiv:2310.20436

citations

#485

Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Agneet Chatterjee, Gabriela Ben Melech Stan, Estelle Guez Aflalo et al.

ECCV 2024arXiv:2404.01197

citations

#486

Prioritized Semantic Learning for Zero-shot Instance Navigation

Xinyu Sun, Lizhao Liu, Hongyan Zhi et al.

ECCV 2024arXiv:2403.11650

citations

#487

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning

Fucai Ke, Zhixi Cai, Simindokht Jahangard et al.

ECCV 2024arXiv:2403.12884

citations

#488

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Yizhe Xiong, Hui Chen, Tianxiang Hao et al.

ECCV 2024arXiv:2403.09192

citations

#489

Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

Chaofeng Chen, Annan Wang, Haoning Wu et al.

ECCV 2024arXiv:2311.15657

citations

#490

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen et al.

ECCV 2024arXiv:2312.03849

citations

#491

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

ZUYAN LIU, Benlin Liu, Jiahui Wang et al.

ECCV 2024arXiv:2407.18121

citations

#492

In Defense of Lazy Visual Grounding for Open-Vocabulary Semantic Segmentation

Dahyun Kang, Minsu Cho

ECCV 2024arXiv:2408.04961

citations

#493

GeoCalib: Learning Single-image Calibration with Geometric Optimization

Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger et al.

ECCV 2024arXiv:2409.06704

citations

#494

MTKD: Multi-Teacher Knowledge Distillation for Image Super-Resolution

Yuxuan Jiang, Chen Feng, Fan Zhang et al.

ECCV 2024arXiv:2404.09571

citations

#495

WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering

Pingyi Chen, Chenglu Zhu, Sunyi Zheng et al.

ECCV 2024arXiv:2407.05603

citations

#496

Enhancing Vectorized Map Perception with Historical Rasterized Maps

Xiaoyu Zhang, Guangwei Liu, Zihao Liu et al.

ECCV 2024arXiv:2409.00620

citations

#497

A Closer Look at GAN Priors: Exploiting Intermediate Features for Enhanced Model Inversion Attacks

Yixiang Qiu, Hao Fang, Hongyao Yu et al.

ECCV 2024arXiv:2407.13863

citations

#498

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Hu Zhang, xu jianhua, Tao Tang et al.

ECCV 2024arXiv:2312.08876

citations

#499

LLMGA: Multimodal Large Language Model based Generation Assistant

Bin Xia, Shiyin Wang, Yingfan Tao et al.

ECCV 2024arXiv:2311.16500

citations

#500

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving

Zhenghao Peng, Wenjie Luo, Yiren Lu et al.

ECCV 2024arXiv:2409.18343

citations

#501

SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow

Yuanzhi Zhu, Xingchao Liu, Qiang Liu

ECCV 2024arXiv:2407.12718

citations

#502

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning

Yan Li, Weiwei Guo, Xue Yang et al.

ECCV 2024arXiv:2311.11646

citations

#503

TrojVLM: Backdoor Attack Against Vision Language Models

Weimin Lyu, Lu Pang, Tengfei Ma et al.

ECCV 2024arXiv:2409.19232

citations

#504

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

Tim Broedermann, David Brüggemann, Christos Sakaridis et al.

ECCV 2024arXiv:2401.12761

citations

#505

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.

ECCV 2024arXiv:2407.10957

citations

#506

SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection

Hongcheng Zhang, Liu Liang, Pengxin Zeng et al.

ECCV 2024arXiv:2403.07284

citations

#507

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Sehwan Choi, Jun Won Choi, JUNGHO KIM et al.

ECCV 2024arXiv:2407.13517

citations

#508

Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding

Talfan Evans, Shreya Pathak, Hamza Merzic et al.

ECCV 2024arXiv:2312.05328

citations

#509

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

Yuanpeng Tu, Boshen Zhang, Liang Liu et al.

ECCV 2024arXiv:2401.03145

citations

#510

Isomorphic Pruning for Vision Models

Gongfan Fang, Xinyin Ma, Michael Bi Mi et al.

ECCV 2024arXiv:2407.04616

citations

#511

Unleashing the Power of Prompt-driven Nucleus Instance Segmentation

Zhongyi Shui, Yunlong Zhang, Kai Yao et al.

ECCV 2024arXiv:2311.15939

citations

#512

Causality-inspired Discriminative Feature Learning in Triple Domains for Gait Recognition

Haijun Xiong, Bin Feng, Xinggang Wang et al.

ECCV 2024arXiv:2407.12519

citations

#513

EgoPoser: Robust Real-Time Egocentric Pose Estimation from Sparse and Intermittent Observations Everywhere

Jiaxi Jiang, Paul Streli, Manuel Meier et al.

ECCV 2024arXiv:2308.06493

citations

#514

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.

ECCV 2024arXiv:2401.12978

citations

#515

AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer

Zhuguanyu Wu, Jiaxin Chen, Hanwen Zhong et al.

ECCV 2024arXiv:2407.12951

citations

#516

Cascade Prompt Learning for Visual-Language Model Adaptation

Ge Wu, Xin Zhang, Zheng Li et al.

ECCV 2024

citations

#517

Score Distillation Sampling with Learned Manifold Corrective

Thiemo Alldieck, Nikos Kolotouros, Cristian Sminchisescu

ECCV 2024arXiv:2401.05293

citations

#518

DyFADet: Dynamic Feature Aggregation for Temporal Action Detection

Le Yang, Ziwei Zheng, Yizeng Han et al.

ECCV 2024arXiv:2407.03197

citations

#519

LISO: Lidar-only Self-Supervised 3D Object Detection

Stefan Baur, Frank Moosmann, Andreas Geiger

ECCV 2024arXiv:2403.07071

citations

#520

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024arXiv:2407.08476

citations

#521

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

Guohao Sun, Can Qin, JIAMINAN WANG et al.

ECCV 2024arXiv:2403.11299

citations

#522

Semantic Residual Prompts for Continual Learning

Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.

ECCV 2024arXiv:2403.06870

citations

#523

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.

ECCV 2024arXiv:2407.17850

citations

#524

Diffusion Soup: Model Merging for Text-to-Image Diffusion Models

Benjamin J Biggs, Arjun Seshadri, Yang Zou et al.

ECCV 2024arXiv:2406.08431

citations

#525

OccGen: Generative Multi-modal 3D Occupancy Prediction for Autonomous Driving

Guoqing Wang, Zhongdao Wang, Pin Tang et al.

ECCV 2024arXiv:2404.15014

citations

#526

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024arXiv:2407.07582

citations

#527

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.

ECCV 2024arXiv:2408.09702

citations

#528

PreciseControl: Enhancing Text-To-Image Diffusion Models with Fine-Grained Attribute Control

Rishubh Parihar, Sachidanand VS, Sabariswaran Mani et al.

ECCV 2024arXiv:2408.05083

citations

#529

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao et al.

ECCV 2024arXiv:2404.05052

citations

#530

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ECCV 2024arXiv:2312.11897

citations

#531

Fast Diffusion-Based Counterfactuals for Shortcut Removal and Generation

Nina Weng, Paraskevas Pegios, Eike Petersen et al.

ECCV 2024arXiv:2312.14223

citations

#532

G3R: Gradient Guided Generalizable Reconstruction

Yun Chen, Jingkang Wang, Ze Yang et al.

ECCV 2024arXiv:2409.19405

citations

#533

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li et al.

ECCV 2024arXiv:2403.12574

citations

#534

RealViformer: Investigating Attention for Real-World Video Super-Resolution

Yuehan Zhang, Angela Yao

ECCV 2024arXiv:2407.13987

citations

#535

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions

Jie Yang, Xuesong Niu, Nan Jiang et al.

ECCV 2024arXiv:2407.12435

citations

#536

Object-Centric Diffusion for Efficient Video Editing

Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.

ECCV 2024arXiv:2401.05735

citations

#537

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ECCV 2024arXiv:2409.02543

citations

#538

Region-Adaptive Transform with Segmentation Prior for Image Compression

Yuxi Liu, Wenhan Yang, Huihui Bai et al.

ECCV 2024arXiv:2403.00628

citations

#539

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Jianwei Zhao, Xin Li, Fan Yang et al.

ECCV 2024arXiv:2407.13133

citations

#540

Defect Spectrum: A Granular Look of Large-scale Defect Datasets with Rich Semantics

Shuai Yang, ZhiFei Chen, Pengguang Chen et al.

ECCV 2024arXiv:2310.17316

citations

#541

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

Sixiang Chen, Tian Ye, Kai Zhang et al.

ECCV 2024arXiv:2409.15739

citations

#542

HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models

Shen Zhang, Zhaowei CHEN, Zhenyu Zhao et al.

ECCV 2024arXiv:2311.17528

citations

#543

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

ECCV 2024arXiv:2312.06573

citations

#544

EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

Yuanming Li, Wei-Jin Huang, An-Lan Wang et al.

ECCV 2024arXiv:2406.08877

citations

#545

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

ECCV 2024arXiv:2403.10153

citations

#546

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Sanjoy Chowdhury, Sayan Nag, Subhrajyoti Dasgupta et al.

ECCV 2024arXiv:2407.01851

citations

#547

DataDream: Few-shot Guided Dataset Generation

Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.

ECCV 2024arXiv:2407.10910

citations

#548

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024arXiv:2403.09176

citations

#549

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Fangqiang Ding, Zhen Luo, Peijun Zhao et al.

ECCV 2024arXiv:2306.17010

citations

#550

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ECCV 2024arXiv:2311.17944

citations

#551

Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation

Juncheng Ma, Peiwen Sun, Yaoting Wang et al.

ECCV 2024arXiv:2407.11820

citations

#552

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

ECCV 2024arXiv:2409.18049

citations

#553

GiT: Towards Generalist Vision Transformer through Universal Language Interface

Haiyang Wang, Hao Tang, Li Jiang et al.

ECCV 2024arXiv:2403.09394

citations

#554

An Incremental Unified Framework for Small Defect Inspection

Jiaqi Tang, Hao Lu, Xiaogang Xu et al.

ECCV 2024arXiv:2312.08917

citations

#555

Online Zero-Shot Classification with CLIP

Qi Qian, JUHUA HU

ECCV 2024arXiv:2408.13320

citations

#556

Factorized Diffusion: Perceptual Illusions by Noise Decomposition

Daniel Geng, Inbum Park, Andrew Owens

ECCV 2024arXiv:2404.11615

citations

#557

PromptFusion: Decoupling Stability and Plasticity for Continual Learning

Haoran Chen, Zuxuan Wu, Xintong Han et al.

ECCV 2024arXiv:2303.07223

citations

#558

ViLA: Efficient Video-Language Alignment for Video Question Answering

Xijun Wang, Junbang Liang, Chun-Kai Wang et al.

ECCV 2024arXiv:2312.08367

citations

#559

PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors

Tianyuan Yuan, Mao Yucheng, Jiawei Yang et al.

ECCV 2024arXiv:2403.09079

citations

#560

IRGen: Generative Modeling for Image Retrieval

Yidan Zhang, Ting Zhang, DONG CHEN et al.

ECCV 2024arXiv:2303.10126

citations

#561

DIFFender: Diffusion-Based Adversarial Defense against Patch Attacks

Caixin Kang, Yinpeng Dong, Zhengyi Wang et al.

ECCV 2024arXiv:2306.09124

citations

#562

Robust Calibration of Large Vision-Language Adapters

Balamurali Murugesan, Julio Silva-Rodríguez, Ismail Ben Ayed et al.

ECCV 2024arXiv:2407.13588

citations

#563

SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic

Kashyap Chitta, Daniel Dauner, Andreas Geiger

ECCV 2024arXiv:2403.17933

citations

#564

Collaborative Control for Geometry-Conditioned PBR Image Generation

Shimon Vainer, Mark Boss, Mathias Parger et al.

ECCV 2024arXiv:2402.05919

citations

#565

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.

ECCV 2024arXiv:2312.09231

citations

#566

Continuous Memory Representation for Anomaly Detection

Joo Chan Lee, Taejune Kim, Eunbyung Park et al.

ECCV 2024arXiv:2402.18293

citations

#567

Eliminating Feature Ambiguity for Few-Shot Segmentation

Qianxiong Xu, Guosheng Lin, Chen Change Loy et al.

ECCV 2024arXiv:2407.09842

citations

#568

OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations

Yiming Zuo, Jia Deng

ECCV 2024arXiv:2406.11711

citations

#569

RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez et al.

ECCV 2024arXiv:2312.12865

citations

#570

VisionTrap: Vision-Augmented Trajectory Prediction Guided by Textual Descriptions

Seokha Moon, Hyun Woo, Hongbeen Park et al.

ECCV 2024arXiv:2407.12345

citations

#571

AddressCLIP: Empowering Vision-Language Models for City-wide Image Address Localization

Shixiong Xu, Chenghao Zhang, Lubin Fan et al.

ECCV 2024arXiv:2407.08156

citations

#572

SEED: A Simple and Effective 3D DETR in Point Clouds

Zhe Liu, Jinghua Hou, Xiaoqing Ye et al.

ECCV 2024arXiv:2407.10749

citations

#573

SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery

Sarah Rastegar, Mohammadreza Salehi, Yuki M Asano et al.

ECCV 2024arXiv:2408.14371

citations

#574

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

Xiang Fan, Anand Bhattad, Ranjay Krishna

ECCV 2024arXiv:2403.14617

citations

#575

Robust-Wide: Robust Watermarking against Instruction-driven Image Editing

Runyi Hu, Jie Zhang, Ting Xu et al.

ECCV 2024arXiv:2402.12688

citations

#576

3D Hand Pose Estimation in Everyday Egocentric Images

Aditya Prakash, Ruisen Tu, Matthew Chang et al.

ECCV 2024arXiv:2312.06583

citations

#577

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.

ECCV 2024arXiv:2404.01300

citations

#578

DIM: Dyadic Interaction Modeling for Social Behavior Generation

Minh Tran, Di Chang, Maksim Siniukov et al.

ECCV 2024

citations

#579

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang et al.

ECCV 2024arXiv:2404.01700

citations

#580

Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer

Yang Wu, Kaihua Zhang, Jianjun Qian et al.

ECCV 2024arXiv:2407.19628

citations

#581

Visible and Clear: Finding Tiny Objects in Difference Map

Bing Cao, Haiyu Yao, Pengfei Zhu et al.

ECCV 2024arXiv:2405.11276

citations

#582

One-Shot Diffusion Mimicker for Handwritten Text Generation

Gang Dai, Yifan Zhang, Quhui Ke et al.

ECCV 2024arXiv:2409.04004

citations

#583

FedRA: A Random Allocation Strategy for Federated Tuning to Unleash the Power of Heterogeneous Clients

Shangchao Su, Bin Li, Xiangyang Xue

ECCV 2024arXiv:2311.11227

citations

#584

Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration

Chu Jie Qin, Ruiqi Wu, Zikun Liu et al.

ECCV 2024arXiv:2409.19403

citations

#585

Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images

Jacopo Bonato, Marco Cotogni, Luigi Sabetta

ECCV 2024arXiv:2404.12922

citations

#586

A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars

Ronglai Zuo, Fangyun Wei, Zenggui Chen et al.

ECCV 2024arXiv:2401.04730

citations

#587

Match-Stereo-Videos: Bidirectional Alignment for Consistent Dynamic Stereo Matching

Junpeng Jing, Ye Mao, Krystian Mikolajczyk

ECCV 2024arXiv:2403.10755

citations

#588

Textual Query-Driven Mask Transformer for Domain Generalized Segmentation

Byeonghyun Pak, Byeongju Woo, Sunghwan Kim et al.

ECCV 2024arXiv:2407.09033

citations

#589

AMEGO: Active Memory from long EGOcentric videos

Gabriele Goletto, Tushar Nagarajan, Giuseppe Averta et al.

ECCV 2024arXiv:2409.10917

citations

#590

ZeST: Zero-Shot Material Transfer from a Single Image

Ta-Ying Cheng, Prafull Sharma, Andrew Markham et al.

ECCV 2024arXiv:2404.06425

citations

#591

Navigation Instruction Generation with BEV Perception and Large Language Models

Sheng Fan, Rui Liu, Wenguan Wang et al.

ECCV 2024arXiv:2407.15087

citations

#592

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding

Quan Kong, Yuki Kawana, Rajat Saini et al.

ECCV 2024arXiv:2407.15350

citations

#593

Few-Shot Anomaly-Driven Generation for Anomaly Classification and Segmentation

Guan Gui, Bin-Bin Gao, Jun Liu et al.

ECCV 2024arXiv:2505.09263

citations

#594

Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model

Qi Song, Ziyuan Luo, Ka Chun Cheung et al.

ECCV 2024arXiv:2407.07735

citations

#595

Training-free Video Temporal Grounding using Large-scale Pre-trained Models

Minghang Zheng, Xinhao Cai, Qingchao Chen et al.

ECCV 2024arXiv:2408.16219

citations

#596

LayoutFlow: Flow Matching for Layout Generation

Julian Jorge Andrade Guerreiro, Naoto Inoue, Kento Masui et al.

ECCV 2024arXiv:2403.18187

citations

#597

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

Ruoxi Chen, Haibo Jin, Yixin Liu et al.

ECCV 2024arXiv:2311.12066

citations

#598

CadVLM: Bridging Language and Vision in the Generation of Parametric CAD Sketches

Sifan Wu, Amir Hosein Khasahmadi, Mor Katz et al.

ECCV 2024arXiv:2409.17457

citations

#599

SAGS: Structure-Aware 3D Gaussian Splatting

Evangelos Ververas, Rolandos Alexandros Potamias, Song Jifei et al.

ECCV 2024arXiv:2404.19149

citations

#600

HERGen: Elevating Radiology Report Generation with Longitudinal Data

Fuying Wang, Shenghui Du, Lequan Yu

ECCV 2024arXiv:2407.15158

citations

← Previous

1 2 3 4 5...12