Most Cited ECCV "linear time-invariant systems" Papers

2,387 papers found • Page 2 of 12

#201

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Tong Shao, Zhuotao Tian, Hang Zhao et al.

ECCV 2024posterarXiv:2407.08268
44
citations
#202

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

William Ljungbergh, Adam Tonderski, Joakim Johnander et al.

ECCV 2024posterarXiv:2404.07762
44
citations
#203

DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model

Li Xiaofan, Zhang Yifu, Xiaoqing Ye

ECCV 2024poster
44
citations
#204

RangeLDM: Fast Realistic LiDAR Point Cloud Generation

Qianjiang Hu, Zhimin Zhang, Wei Hu

ECCV 2024posterarXiv:2403.10094
44
citations
#205

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang et al.

ECCV 2024poster
43
citations
#206

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model

Yulin Luo, Ruichuan An, Bocheng Zou et al.

ECCV 2024posterarXiv:2405.02363
43
citations
#207

ParCo: Part-Coordinating Text-to-Motion Synthesis

Qiran Zou, Shangyuan Yuan, Shian Du et al.

ECCV 2024posterarXiv:2403.18512
43
citations
#208

Diffusion Reward: Learning Rewards via Conditional Video Diffusion

Tao Huang, Guangqi Jiang, Yanjie Ze et al.

ECCV 2024posterarXiv:2312.14134
43
citations
#209

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024posterarXiv:2403.10893
42
citations
#210

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024posterarXiv:2403.19435
42
citations
#211

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024posterarXiv:2407.14143
41
citations
#212

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Letian Huang, Jiayang Bai, Jie Guo et al.

ECCV 2024posterarXiv:2402.00752
41
citations
#213

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024posterarXiv:2311.12897
41
citations
#214

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024posterarXiv:2401.06825
40
citations
#215

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024posterarXiv:2401.09112
40
citations
#216

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024posterarXiv:2403.19588
40
citations
#217

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024posterarXiv:2312.07381
40
citations
#218

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024posterarXiv:2403.19589
40
citations
#219

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024posterarXiv:2403.18118
40
citations
#220

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, WENBO HU, Yu-Kun Lai et al.

ECCV 2024posterarXiv:2403.10050
40
citations
#221

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ECCV 2024posterarXiv:2311.09999
40
citations
#222

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

Chao Xu, Ang Li, Linghao Chen et al.

ECCV 2024posterarXiv:2408.10195
39
citations
#223

GalLop: Learning global and local prompts for vision-language models

Marc Lafon, Elias Ramzi, Clément Rambour et al.

ECCV 2024posterarXiv:2407.01400
39
citations
#224

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mengting Chen, Xi Chen, Zhonghua Zhai et al.

ECCV 2024posterarXiv:2403.12965
38
citations
#225

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024posterarXiv:2403.13129
38
citations
#226

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

Matteo Bortolon, Theodoros Tsesmelis, Stuart James et al.

ECCV 2024posterarXiv:2407.15484
37
citations
#227

SMooDi: Stylized Motion Diffusion Model

Lei Zhong, Yiming Xie, Varun Jampani et al.

ECCV 2024posterarXiv:2407.12783
37
citations
#228

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024posterarXiv:2312.00937
37
citations
#229

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024posterarXiv:2407.13761
37
citations
#230

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Zhao Tianchen, Xuefei Ning, Tongcheng Fang et al.

ECCV 2024posterarXiv:2405.17873
36
citations
#231

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024posterarXiv:2409.09756
36
citations
#232

Pyramid Diffusion for Fine 3D Large Scene Generation

Yuheng Liu, Xinke Li, Xueting Li et al.

ECCV 2024posterarXiv:2311.12085
36
citations
#233

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024posterarXiv:2403.18274
36
citations
#234

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024posterarXiv:2311.13627
36
citations
#235

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

ECCV 2024posterarXiv:2407.10862
36
citations
#236

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Tan, Muhammad Ferjad Naeem et al.

ECCV 2024posterarXiv:2311.16241
36
citations
#237

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Kirolos Ataallah, Xiaoqian Shen, Eslam mohamed abdelrahman et al.

ECCV 2024posterarXiv:2407.12679
36
citations
#238

DragVideo: Interactive Drag-style Video Editing

Yufan Deng, Ruida Wang, Yuhao ZHANG et al.

ECCV 2024posterarXiv:2312.02216
36
citations
#239

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

Junjie Guo, Chenqiang Gao, Fangcen liu et al.

ECCV 2024posterarXiv:2403.00326
36
citations
#240

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024posterarXiv:2407.13519
35
citations
#241

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024posterarXiv:2403.05023
35
citations
#242

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024posterarXiv:2311.16917
35
citations
#243

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

Ke Fan, Junshu Tang, Weijian Cao et al.

ECCV 2024posterarXiv:2405.15763
35
citations
#244

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Zhijian Huang, Tao Tang, Shaoxiang Chen et al.

ECCV 2024posterarXiv:2408.13890
35
citations
#245

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan YANG, Runyu Ding, Ellis L Brown et al.

ECCV 2024posterarXiv:2402.03310
35
citations
#246

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024posterarXiv:2312.09128
35
citations
#247

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

Zhen Zhao, Zicheng Wang, Dian Yu et al.

ECCV 2024posterarXiv:2311.17325
35
citations
#248

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024posterarXiv:2311.16567
35
citations
#249

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024posterarXiv:2310.06744
34
citations
#250

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024posterarXiv:2312.05295
34
citations
#251

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024poster
34
citations
#252

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024posterarXiv:2409.14444
34
citations
#253

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024posterarXiv:2305.14428
34
citations
#254

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024posterarXiv:2407.01702
34
citations
#255

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024posterarXiv:2403.11781
34
citations
#256

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024posterarXiv:2402.03634
34
citations
#257

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024posterarXiv:2407.12777
34
citations
#258

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024posterarXiv:2311.07152
34
citations
#259

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024posterarXiv:2311.11261
34
citations
#260

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024posterarXiv:2404.03590
34
citations
#261

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024posterarXiv:2403.07508
33
citations
#262

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024posterarXiv:2408.14176
33
citations
#263

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024posterarXiv:2403.05049
33
citations
#264

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024posterarXiv:2406.14556
33
citations
#265

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024posterarXiv:2403.18211
33
citations
#266

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024posterarXiv:2403.05659
33
citations
#267

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Xinzhou Wang, Yikai Wang, junliang ye et al.

ECCV 2024posterarXiv:2312.03795
33
citations
#268

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

Chenxin Li, Xinyu Liu, Cheng Wang et al.

ECCV 2024posterarXiv:2407.05540
33
citations
#269

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

YUXIN WANG, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024posterarXiv:2404.13679
33
citations
#270

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024posterarXiv:2403.04697
33
citations
#271

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Young Kyun Jang, Dat B Huynh, Ashish Shah et al.

ECCV 2024posterarXiv:2405.00571
32
citations
#272

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024posterarXiv:2312.08372
32
citations
#273

Exact Diffusion Inversion via Bidirectional Integration Approximation

Guoqiang Zhang, j.p. lewis, W. Bastiaan Kleijn

ECCV 2024poster
32
citations
#274

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Siyu Jiao, hongguang Zhu, Yunchao Wei et al.

ECCV 2024posterarXiv:2408.00744
32
citations
#275

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

ye junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024posterarXiv:2408.05475
31
citations
#276

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Ye Liu, Jixuan He, Wanhua Li et al.

ECCV 2024posterarXiv:2404.00801
31
citations
#277

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.

ECCV 2024posterarXiv:2404.03010
31
citations
#278

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024posterarXiv:2408.05868
31
citations
#279

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

Santiago Pascual, Chunghsin YEH, Ioannis Tsiamas et al.

ECCV 2024posterarXiv:2407.10387
31
citations
#280

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024posterarXiv:2403.12445
31
citations
#281

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.

ECCV 2024posterarXiv:2407.20337
31
citations
#282

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024posterarXiv:2407.02077
31
citations
#283

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024posterarXiv:2310.04900
31
citations
#284

Soft Prompt Generation for Domain Generalization

Shuanghao Bai, Yuedi Zhang, Wanqi Zhou et al.

ECCV 2024posterarXiv:2404.19286
30
citations
#285

Hierarchical Gaussian Mixture Normalizing Flow Modeling for Unified Anomaly Detection

Xincheng Yao, Ruoqi Li, Zefeng Qian et al.

ECCV 2024posterarXiv:2403.13349
30
citations
#286

Gaussian Splatting on the Move: Blur and Rolling Shutter Compensation for Natural Camera Motion

Otto Seiskari, Jerry Ylilammi, Valtteri Kaatrasalo et al.

ECCV 2024posterarXiv:2403.13327
30
citations
#287

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024posterarXiv:2407.09983
30
citations
#288

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Yufei Zhan, Yousong Zhu, Zhiyang Chen et al.

ECCV 2024posterarXiv:2311.14552
30
citations
#289

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jingyi Lu, Xinghui Li, Kai Han

ECCV 2024posterarXiv:2407.18247
30
citations
#290

MVPGS: Excavating Multi-view Priors for Gaussian Splatting from Sparse Input Views

Wangze Xu, Huachen Gao, Shihe Shen et al.

ECCV 2024posterarXiv:2409.14316
30
citations
#291

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

Muhammad Jehanzeb Mirza, Leonid Karlinsky, Wei Lin et al.

ECCV 2024posterarXiv:2403.11755
30
citations
#292

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

XINYUAN GAO, Songlin Dong, Yuhang He et al.

ECCV 2024posterarXiv:2407.10281
30
citations
#293

Denoising Vision Transformers

Jiawei Yang, Katie Luo, Jiefeng Li et al.

ECCV 2024posterarXiv:2401.02957
30
citations
#294

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross et al.

ECCV 2024posterarXiv:2404.08580
30
citations
#295

PosterLlama: Bridging Design Ability of Langauge Model to Content-Aware Layout Generation

Jaejung Seol, Seojun Kim, Jaejun Yoo

ECCV 2024posterarXiv:2404.00995
30
citations
#296

N2F2: Hierarchical Scene Understanding with Nested Neural Feature Fields

Yash Bhalgat, Iro Laina, Joao F Henriques et al.

ECCV 2024posterarXiv:2403.10997
30
citations
#297

Four Ways to Improve Verbo-visual Fusion for Dense 3D Visual Grounding

Ozan Unal, Christos Sakaridis, Suman Saha et al.

ECCV 2024posterarXiv:2309.04561
29
citations
#298

Dataset Distillation by Automatic Training Trajectories

Dai Liu, Jindong Gu, Hu Cao et al.

ECCV 2024posterarXiv:2407.14245
29
citations
#299

Boosting the Power of Small Multimodal Reasoning Models to Match Larger Models with Self-Consistency Training

Cheng Tan, Jingxuan Wei, Zhangyang Gao et al.

ECCV 2024posterarXiv:2311.14109
29
citations
#300

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024posterarXiv:2407.16698
29
citations
#301

View Selection for 3D Captioning via Diffusion Ranking

Tiange Luo, Justin Johnson, Honglak Lee

ECCV 2024posterarXiv:2404.07984
29
citations
#302

Nuvo: Neural UV Mapping for Unruly 3D Representations

Pratul Srinivasan, Stephan J Garbin, Dor Verbin et al.

ECCV 2024posterarXiv:2312.05283
29
citations
#303

Language-Driven Physics-Based Scene Synthesis and Editing via Feature Splatting

Ri-Zhao Qiu, Ge Yang, Weijia Zeng et al.

ECCV 2024poster
28
citations
#304

Video Editing via Factorized Diffusion Distillation

Uriel Singer, Amit Zohar, Yuval Kirstain et al.

ECCV 2024posterarXiv:2403.09334
28
citations
#305

VEGS: View Extrapolation of Urban Scenes in 3D Gaussian Splatting using Learned Priors

Sungwon Hwang, Min-Jung Kim, Taewoong Kang et al.

ECCV 2024posterarXiv:2407.02945
28
citations
#306

WiMANS: A Benchmark Dataset for WiFi-based Multi-user Activity Sensing

Shuokang Huang, Kaihan Li, Di You et al.

ECCV 2024posterarXiv:2402.09430
28
citations
#307

EventBind: Learning a Unified Representation to Bind Them All for Event-based Open-world Understanding

jiazhou zhou, Xu Zheng, Yuanhuiyi Lyu et al.

ECCV 2024posterarXiv:2308.03135
28
citations
#308

HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning

Zhecan Wang, Garrett Bingham, Adams Wei Yu et al.

ECCV 2024posterarXiv:2407.15680
28
citations
#309

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation

Zhihang Lin, Mingbao Lin, Meng Zhao et al.

ECCV 2024posterarXiv:2407.10738
28
citations
#310

Cascade-Zero123: One Image to Highly Consistent 3D with Self-Prompted Nearby Views

Yabo Chen, Jiemin Fang, Yuyang Huang et al.

ECCV 2024posterarXiv:2312.04424
28
citations
#311

Learning Modality-agnostic Representation for Semantic Segmentation from Any Modalities

Xu Zheng, Yuanhuiyi Lyu, LIN WANG

ECCV 2024posterarXiv:2407.11351
28
citations
#312

CAT-SAM: Conditional Tuning for Few-Shot Adaptation of Segment Anything Model

Aoran Xiao, Weihao Xuan, Heli Qi et al.

ECCV 2024posterarXiv:2402.03631
28
citations
#313

Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching

Yichen Li, Wenchao Xu, Haozhao Wang et al.

ECCV 2024posterarXiv:2407.05005
28
citations
#314

MeshAvatar: Learning High-quality Triangular Human Avatars from Multi-view Videos

Yushuo Chen, Zerong Zheng, Zhe Li et al.

ECCV 2024posterarXiv:2407.08414
28
citations
#315

WHAC: World-grounded Humans and Cameras

Wanqi Yin, Zhongang Cai, Chen Wei et al.

ECCV 2024posterarXiv:2403.12959
28
citations
#316

PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving

Zhili Chen, Maosheng Ye, Shuangjie Xu et al.

ECCV 2024posterarXiv:2311.08100
28
citations
#317

Attention Prompting on Image for Large Vision-Language Models

Runpeng Yu, Weihao Yu, Xinchao Wang

ECCV 2024posterarXiv:2409.17143
28
citations
#318

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding

Ming Hu, Peng Xia, Lin Wang et al.

ECCV 2024posterarXiv:2406.07471
27
citations
#319

FairDomain: Achieving Fairness in Cross-Domain Medical Image Segmentation and Classification

Yu Tian, Congcong Wen, Min Shi et al.

ECCV 2024posterarXiv:2407.08813
27
citations
#320

Image Compression for Machine and Human Vision With Spatial-Frequency Adaptation

han li, Shaohui Li, Shuangrui Ding et al.

ECCV 2024posterarXiv:2407.09853
27
citations
#321

Zero-shot Object Counting with Good Exemplars

Huilin Zhu, Jingling Yuan, Zhengwei Yang et al.

ECCV 2024posterarXiv:2407.04948
27
citations
#322

Multistain Pretraining for Slide Representation Learning in Pathology

Guillaume Jaume, Anurag J Vaidya, Andrew Zhang et al.

ECCV 2024posterarXiv:2408.02859
27
citations
#323

Learning Unified Reference Representation for Unsupervised Multi-class Anomaly Detection

Liren He, Zhengkai Jiang, Jinlong Peng et al.

ECCV 2024posterarXiv:2403.11561
27
citations
#324

GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features

Luc Sträter, Mohammadreza Salehi, Efstratios Gavves et al.

ECCV 2024posterarXiv:2407.12427
27
citations
#325

Progressive Pretext Task Learning for Human Trajectory Prediction

Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.

ECCV 2024posterarXiv:2407.11588
27
citations
#326

The Nerfect Match: Exploring NeRF Features for Visual Localization

Qunjie Zhou, Maxim Maximov, Or Litany et al.

ECCV 2024posterarXiv:2403.09577
27
citations
#327

SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation

Heyuan Li, Ce Chen, Tianhao Shi et al.

ECCV 2024posterarXiv:2404.05680
26
citations
#328

Do text-free diffusion models learn discriminative visual representations?

Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.

ECCV 2024posterarXiv:2311.17921
26
citations
#329

Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries

WEI-JER Chang, Francesco Pittaluga, Masayoshi TOMIZUKA et al.

ECCV 2024posterarXiv:2401.00391
26
citations
#330

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Baijiong Lin, Weisen Jiang, Pengguang Chen et al.

ECCV 2024posterarXiv:2407.02228
26
citations
#331

Trackastra: Transformer-based cell tracking for live-cell microscopy

Benjamin Gallusser, Weigert Martin

ECCV 2024posterarXiv:2405.15700
26
citations
#332

UMBRAE: Unified Multimodal Brain Decoding

Weihao Xia, Raoul de Charette, Cengiz Oztireli et al.

ECCV 2024posterarXiv:2404.07202
26
citations
#333

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Zhiyuan MA, Yuxiang WEI, Yabin Zhang et al.

ECCV 2024posterarXiv:2407.02040
26
citations
#334

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Yizhe Xiong, Hui Chen, Tianxiang Hao et al.

ECCV 2024posterarXiv:2403.09192
26
citations
#335

Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

Chaofeng Chen, Annan Wang, Haoning Wu et al.

ECCV 2024posterarXiv:2311.15657
26
citations
#336

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Jinke Li, Xiao He, Chonghua Zhou et al.

ECCV 2024posterarXiv:2405.04299
26
citations
#337

I-MedSAM: Implicit Medical Image Segmentation with Segment Anything

Xiaobao Wei, Jiajun Cao, Yizhu Jin et al.

ECCV 2024posterarXiv:2311.17081
26
citations
#338

Dolfin: Diffusion Layout Transformers without Autoencoder

Yilin Wang, Zeyuan Chen, Liangjun Zhong et al.

ECCV 2024posterarXiv:2310.16305
26
citations
#339

T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models

Zhongqi Wang, Jie Zhang, Shiguang Shan et al.

ECCV 2024posterarXiv:2407.04215
26
citations
#340

Beyond the Contact: Discovering Comprehensive Affordance for 3D Objects from Pre-trained 2D Diffusion Models

Hyeonwoo Kim, Sookwan Han, Patrick Kwon et al.

ECCV 2024posterarXiv:2401.12978
25
citations
#341

SWinGS: Sliding Windows for Dynamic 3D Gaussian Splatting

Richard Shaw, Michal Nazarczuk, Song Jifei et al.

ECCV 2024posterarXiv:2312.13308
25
citations
#342

Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection

Ting Lei, Shaofeng Yin, Yuxin Peng et al.

ECCV 2024posterarXiv:2408.02484
25
citations
#343

Mask2Map: Vectorized HD Map Construction Using Bird's Eye View Segmentation Masks

Sehwan Choi, Jun Won Choi, JUNGHO KIM et al.

ECCV 2024posterarXiv:2407.13517
25
citations
#344

LLMGA: Multimodal Large Language Model based Generation Assistant

Bin Xia, Shiyin Wang, Yingfan Tao et al.

ECCV 2024posterarXiv:2311.16500
25
citations
#345

Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding

Talfan Evans, Shreya Pathak, Hamza Merzic et al.

ECCV 2024posterarXiv:2312.05328
25
citations
#346

SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection

Huafeng Chen, Pengxu Wei, Guangqian Guo et al.

ECCV 2024posterarXiv:2408.10760
25
citations
#347

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

ZUYAN LIU, Benlin Liu, Jiahui Wang et al.

ECCV 2024posterarXiv:2407.18121
25
citations
#348

Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching

Meng Chu, Zhedong Zheng, Wei Ji et al.

ECCV 2024posterarXiv:2311.12751
25
citations
#349

SignAvatars: A Large-scale 3D Sign Language Holistic Motion Dataset and Benchmark

Zhengdi Yu, Shaoli Huang, yongkang cheng et al.

ECCV 2024posterarXiv:2310.20436
25
citations
#350

Cascade Prompt Learning for Visual-Language Model Adaptation

Ge Wu, Xin Zhang, Zheng Li et al.

ECCV 2024poster
24
citations
#351

LISO: Lidar-only Self-Supervised 3D Object Detection

Stefan Baur, Frank Moosmann, Andreas Geiger

ECCV 2024posterarXiv:2403.07071
24
citations
#352

WSI-VQA: Interpreting Whole Slide Images by Generative Visual Question Answering

Pingyi Chen, Chenglu Zhu, Sunyi Zheng et al.

ECCV 2024posterarXiv:2407.05603
24
citations
#353

EAS-SNN: End-to-End Adaptive Sampling and Representation for Event-based Detection with Recurrent Spiking Neural Networks

Ziming Wang, Ziling Wang, Huaning Li et al.

ECCV 2024posterarXiv:2403.12574
24
citations
#354

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.

ECCV 2024posterarXiv:2407.03575
24
citations
#355

Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

Yuanpeng Tu, Boshen Zhang, Liang Liu et al.

ECCV 2024posterarXiv:2401.03145
24
citations
#356

Fast Context-Based Low-Light Image Enhancement via Neural Implicit Representations

Tomáš Chobola, Yu Liu, Hanyi Zhang et al.

ECCV 2024posterarXiv:2407.12511
24
citations
#357

Photorealistic Object Insertion with Diffusion-Guided Inverse Rendering

Ruofan Liang, Zan Gojcic, Merlin Nimier-David et al.

ECCV 2024posterarXiv:2408.09702
24
citations
#358

Text-Conditioned Resampler For Long Form Video Understanding

Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.

ECCV 2024posterarXiv:2312.11897
24
citations
#359

FlexiEdit: Frequency-Aware Latent Refinement for Enhanced Non-Rigid Editing

Gwanhyeong Koo, Sunjae Yoon, Ji Woo Hong et al.

ECCV 2024posterarXiv:2407.17850
24
citations
#360

Semantic Residual Prompts for Continual Learning

Martin Menabue, Emanuele Frascaroli, Matteo Boschini et al.

ECCV 2024posterarXiv:2403.06870
24
citations
#361

MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty

Tim Broedermann, David Brüggemann, Christos Sakaridis et al.

ECCV 2024posterarXiv:2401.12761
24
citations
#362

OpenSight: A Simple Open-Vocabulary Framework for LiDAR-Based Object Detection

Hu Zhang, xu jianhua, Tao Tang et al.

ECCV 2024posterarXiv:2312.08876
24
citations
#363

Enhancing Vectorized Map Perception with Historical Rasterized Maps

Xiaoyu Zhang, Guangwei Liu, Zihao Liu et al.

ECCV 2024posterarXiv:2409.00620
24
citations
#364

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Yaoting Wang, Peiwen Sun, Dongzhan Zhou et al.

ECCV 2024posterarXiv:2407.10957
23
citations
#365

TIP: Tabular-Image Pre-training for Multimodal Classification with Incomplete Data

Siyi Du, Shaoming Zheng, Yinsong Wang et al.

ECCV 2024posterarXiv:2407.07582
23
citations
#366

VideoMamba: Spatio-Temporal Selective State Space Model

Jinyoung Park, Hee-Seon Kim, Kangwook Ko et al.

ECCV 2024posterarXiv:2407.08476
23
citations
#367

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

Bolin Lai, Xiaoliang Dai, Lawrence Chen et al.

ECCV 2024posterarXiv:2312.03849
23
citations
#368

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving

Zhenghao Peng, Wenjie Luo, Yiren Lu et al.

ECCV 2024posterarXiv:2409.18343
23
citations
#369

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Wen Li, Muyuan Fang, Cheng Zou et al.

ECCV 2024posterarXiv:2409.02543
23
citations
#370

SQ-LLaVA: Self-Questioning for Large Vision-Language Assistant

Guohao Sun, Can Qin, JIAMINAN WANG et al.

ECCV 2024posterarXiv:2403.11299
23
citations
#371

Improving Medical Multi-modal Contrastive Learning with Expert Annotations

Yogesh Kumar, Pekka Marttinen

ECCV 2024posterarXiv:2403.10153
23
citations
#372

TrojVLM: Backdoor Attack Against Vision Language Models

Weimin Lyu, Lu Pang, Tengfei Ma et al.

ECCV 2024posterarXiv:2409.19232
23
citations
#373

DataDream: Few-shot Guided Dataset Generation

Jae Myung Kim, Jessica Bader, Stephan Alaniz et al.

ECCV 2024posterarXiv:2407.10910
23
citations
#374

Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint

Sixiang Chen, Tian Ye, Kai Zhang et al.

ECCV 2024posterarXiv:2409.15739
23
citations
#375

Revisit Anything: Visual Place Recognition via Image Segment Retrieval

Kartik Garg, Sai Shubodh Puligilla, Shishir N Y Kolathaya et al.

ECCV 2024posterarXiv:2409.18049
23
citations
#376

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning

Fucai Ke, Zhixi Cai, Simindokht Jahangard et al.

ECCV 2024posterarXiv:2403.12884
23
citations
#377

GeoCalib: Learning Single-image Calibration with Geometric Optimization

Alexander Veicht, Paul-Edouard Sarlin, Philipp Lindenberger et al.

ECCV 2024posterarXiv:2409.06704
23
citations
#378

milliFlow: Scene Flow Estimation on mmWave Radar Point Cloud for Human Motion Sensing

Fangqiang Ding, Zhen Luo, Peijun Zhao et al.

ECCV 2024posterarXiv:2306.17010
23
citations
#379

Facial Affective Behavior Analysis with Instruction Tuning

Yifan Li, Anh Dao, Wentao Bao et al.

ECCV 2024posterarXiv:2404.05052
23
citations
#380

Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion

Xiang Fan, Anand Bhattad, Ranjay Krishna

ECCV 2024posterarXiv:2403.14617
23
citations
#381

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts

Byeongjun Park, Hyojun Go, Jin-Young Kim et al.

ECCV 2024posterarXiv:2403.09176
23
citations
#382

SparseLIF: High-Performance Sparse LiDAR-Camera Fusion for 3D Object Detection

Hongcheng Zhang, Liu Liang, Pengxin Zeng et al.

ECCV 2024posterarXiv:2403.07284
23
citations
#383

FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

Jianwei Zhao, Xin Li, Fan Yang et al.

ECCV 2024posterarXiv:2407.13133
23
citations
#384

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh, Aayan Yadav, Jitesh Jain et al.

ECCV 2024posterarXiv:2403.18819
23
citations
#385

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.

ECCV 2024posterarXiv:2404.01300
22
citations
#386

MotionChain: Conversational Motion Controllers via Multimodal Prompts

Biao Jiang, Xin Chen, Chi Zhang et al.

ECCV 2024posterarXiv:2404.01700
22
citations
#387

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.

ECCV 2024posterarXiv:2312.09231
22
citations
#388

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions

Jie Yang, Xuesong Niu, Nan Jiang et al.

ECCV 2024posterarXiv:2407.12435
22
citations
#389

RadEdit: stress-testing biomedical vision models via diffusion image editing

Fernando Pérez-García, Sam Bond-Taylor, Pedro Sanchez et al.

ECCV 2024posterarXiv:2312.12865
22
citations
#390

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

ECCV 2024posterarXiv:2312.06573
22
citations
#391

Toward Open Vocabulary Aerial Object Detection with CLIP-Activated Student-Teacher Learning

Yan Li, Weiwei Guo, Xue Yang et al.

ECCV 2024posterarXiv:2311.11646
22
citations
#392

DIM: Dyadic Interaction Modeling for Social Behavior Generation

Minh Tran, Di Chang, Maksim Siniukov et al.

ECCV 2024poster
22
citations
#393

ViLA: Efficient Video-Language Alignment for Video Question Answering

Xijun Wang, Junbang Liang, Chun-Kai Wang et al.

ECCV 2024posterarXiv:2312.08367
22
citations
#394

Prioritized Semantic Learning for Zero-shot Instance Navigation

Xinyu Sun, Lizhao Liu, Hongyan Zhi et al.

ECCV 2024posterarXiv:2403.11650
22
citations
#395

Object-Centric Diffusion for Efficient Video Editing

Kumara Kahatapitiya, Adil Karjauv, Davide Abati et al.

ECCV 2024posterarXiv:2401.05735
22
citations
#396

OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations

Yiming Zuo, Jia Deng

ECCV 2024posterarXiv:2406.11711
22
citations
#397

PALM: Predicting Actions through Language Models

Sanghwan Kim, Daoji Huang, Yongqin Xian et al.

ECCV 2024posterarXiv:2311.17944
22
citations
#398

Text2LiDAR: Text-guided LiDAR Point Clouds Generation via Equirectangular Transformer

Yang Wu, Kaihua Zhang, Jianjun Qian et al.

ECCV 2024posterarXiv:2407.19628
22
citations
#399

WTS: A Pedestrian-Centric Traffic Video Dataset for Fine-grained Spatial-Temporal Understanding

Quan Kong, Yuki Kawana, Rajat Saini et al.

ECCV 2024posterarXiv:2407.15350
21
citations
#400

Learning to Adapt SAM for Segmenting Cross-domain Point Clouds

Xidong Peng, Runnan Chen, Feng Qiao et al.

ECCV 2024posterarXiv:2310.08820
21
citations