Most Cited ECCV "unlearning safety" Papers

2,387 papers found • Page 2 of 12

#201

Local All-Pair Correspondence for Point Tracking

Seokju Cho, Jiahui Huang, Jisu Nam et al.

ECCV 2024 arXiv:2407.15420
62
citations
#202

QUAR-VLA: Vision-Language-Action Model for Quadruped Robots

Pengxiang Ding, Han Zhao, Wenjie Zhang et al.

ECCV 2024 arXiv:2312.14457
61
citations
#203

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

Yuchen Yang, Kwonjoon Lee, Behzad Dariush et al.

ECCV 2024 arXiv:2407.10299
61
citations
#204

TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Jiahe Li, Jiawei Zhang, Xiao Bai et al.

ECCV 2024 arXiv:2404.15264
61
citations
#205

VeCLIP: Improving CLIP Training via Visual-enriched Captions

Zhengfeng Lai, Haotian Zhang, Bowen Zhang et al.

ECCV 2024 arXiv:2310.07699
61
citations
#206

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai et al.

ECCV 2024 arXiv:2404.03507
60
citations
#207

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Yue Han, Junwei Zhu, Keke He et al.

ECCV 2024 arXiv:2405.12970
60
citations
#208

Diffusion Models for Open-Vocabulary Segmentation

Laurynas Karazija, Iro Laina, Andrea Vedaldi et al.

ECCV 2024 arXiv:2306.09316
60
citations
#209

AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models

Xuelong Dai, Kaisheng Liang, Bin Xiao

ECCV 2024 arXiv:2307.12499
59
citations
#210

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Daniel Winter, Matan Cohen, Shlomi Fruchter et al.

ECCV 2024 arXiv:2403.18818
59
citations
#211

SILC: Improving Vision Language Pretraining with Self-Distillation

Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai et al.

ECCV 2024 arXiv:2310.13355
59
citations
#212

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

Yi-Chia Chen, WeiHua Li, Cheng Sun et al.

ECCV 2024 arXiv:2409.10542
59
citations
#213

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

Hang Yao, Ming Liu, Zhicun Yin et al.

ECCV 2024 arXiv:2406.07487
58
citations
#214

BRAVE: Broadening the visual encoding of vision-language models

Oguzhan Fatih Kar, Alessio Tonioni, Petra Poklukar et al.

ECCV 2024 arXiv:2404.07204
58
citations
#215

VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

Zhen Qu, Xian Tao, Mukesh Prasad et al.

ECCV 2024 arXiv:2407.12276
58
citations
#216

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Omer Dahary, Or Patashnik, Kfir Aberman et al.

ECCV 2024 arXiv:2403.16990
57
citations
#217

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Xi Chen, Sida Peng, Dongchen Yang et al.

ECCV 2024 arXiv:2404.11593
57
citations
#218

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao et al.

ECCV 2024 arXiv:2310.17796
57
citations
#219

Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction

Bencheng Liao, Shaoyu Chen, Bo Jiang et al.

ECCV 2024 arXiv:2303.08815
56
citations
#220

Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Yuanwen Yue, Anurag Das, Francis Engelmann et al.

ECCV 2024 arXiv:2407.20229
56
citations
#221

View-Consistent 3D Editing with Gaussian Splatting

Yuxuan Wang, Xuanyu Yi, Zike Wu et al.

ECCV 2024 arXiv:2403.11868
56
citations
#222

OmniSat: Self-Supervised Modality Fusion for Earth Observation

Guillaume Astruc, Nicolas Gonthier, Clement Mallet et al.

ECCV 2024 arXiv:2404.08351
56
citations
#223

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration

Zhihao Liang, Qi Zhang, Wenbo Hu et al.

ECCV 2024 arXiv:2403.11056
56
citations
#224

Latent Guard: a Safety Framework for Text-to-image Generation

Runtao Liu, Ashkan Khakzar, Jindong Gu et al.

ECCV 2024 arXiv:2404.08031
56
citations
#225

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Xiang Zhang, Yulun Zhang, Fisher Yu

ECCV 2024 arXiv:2407.05878
56
citations
#226

SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians

Hiba Dahmani, Moussab Bennehar, Nathan Piasco et al.

ECCV 2024 arXiv:2403.10427
55
citations
#227

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

Yuanhao Cai, Yixun Liang, Jiahao Wang et al.

ECCV 2024 arXiv:2403.04116
55
citations
#228

HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting

Helisa Dhamo, Yinyu Nie, Arthur Moreau et al.

ECCV 2024 arXiv:2312.02902
55
citations
#229

LaRa: Efficient Large-Baseline Radiance Fields

Anpei Chen, Haofei Xu, Stefano Esposito et al.

ECCV 2024 arXiv:2407.04699
55
citations
#230

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

Yanyan Li, Chenyu Lyu, Yan Di et al.

ECCV 2024 arXiv:2403.11324
55
citations
#231

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Kangle Deng, Timothy Omernick, Alexander B Weiss et al.

ECCV 2024 arXiv:2402.13251
55
citations
#232

See and Think: Embodied Agent in Virtual Environment

Zhonghan Zhao, Xuan Wang, Wenhao Chai et al.

ECCV 2024 arXiv:2311.15209
54
citations
#233

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Zeyu Liu, Weicong Liang, Zhanhao Liang et al.

ECCV 2024 arXiv:2403.09622
54
citations
#234

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping

Jiacheng Chen, Yuefan Wu, Tan Jiaqi et al.

ECCV 2024 arXiv:2403.15951
54
citations
#235

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024 arXiv:2403.12034
54
citations
#236

VideoStudio: Generating Consistent-Content and Multi-Scene Videos

Fuchen Long, Zhaofan Qiu, Ting Yao et al.

ECCV 2024 arXiv:2401.01256
54
citations
#237

Embodied Understanding of Driving Scenarios

Yunsong Zhou, Linyan Huang, Qingwen Bu et al.

ECCV 2024 arXiv:2403.04593
54
citations
#238

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan et al.

ECCV 2024 arXiv:2402.03094
53
citations
#239

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.

ECCV 2024 arXiv:2305.16037
53
citations
#240

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen et al.

ECCV 2024 arXiv:2405.00915
53
citations
#241

A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

Tianhe Wu, Kede Ma, Jie Liang et al.

ECCV 2024 arXiv:2403.10854
53
citations
#242

A Comparative Study of Image Restoration Networks for General Backbone Network Design

Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.

ECCV 2024 arXiv:2310.11881
53
citations
#243

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin et al.

ECCV 2024 arXiv:2403.02325
53
citations
#244

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities

Chenming Zhu, Tai Wang, Wenwei Zhang et al.

ECCV 2024 arXiv:2407.01525
52
citations
#245

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

Jie Ren, Yaxin Li, Shenglai Zeng et al.

ECCV 2024 arXiv:2403.11052
52
citations
#246

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

Ziying Song, Lei Yang, Shaoqing Xu et al.

ECCV 2024 arXiv:2403.11848
51
citations
#247

LCM-Lookahead for Encoder-based Text-to-Image Personalization

Rinon Gal, Or Lichter, Elad Richardson et al.

ECCV 2024 arXiv:2404.03620
51
citations
#248

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

Lanqing Guo, Yingqing He, Haoxin Chen et al.

ECCV 2024 arXiv:2402.10491
51
citations
#249

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Chongyu Fan, Jiancheng Liu, Alfred Hero et al.

ECCV 2024 arXiv:2403.07362
51
citations
#250

Deep Patch Visual SLAM

Lahav Lipson, Zachary Teed, Jia Deng

ECCV 2024 arXiv:2408.01654
51
citations
#251

ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions

Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik et al.

ECCV 2024 arXiv:2311.17057
51
citations
#252

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

Zexiang Liu, Yangguang Li, Youtian Lin et al.

ECCV 2024 arXiv:2312.08754
51
citations
#253

GVGEN: Text-to-3D Generation with Volumetric Representation

Xianglong He, Junyi Chen, Sida Peng et al.

ECCV 2024 arXiv:2403.12957
51
citations
#254

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024 arXiv:2403.12963
51
citations
#255

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Qilang Ye, Zitong Yu, Rui Shao et al.

ECCV 2024 arXiv:2403.04640
50
citations
#256

Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction

Zihao Liu, Xiaoyu Zhang, Guangwei Liu et al.

ECCV 2024 arXiv:2402.17430
50
citations
#257

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection

Christos Koutlis, Symeon Papadopoulos

ECCV 2024 arXiv:2402.19091
50
citations
#258

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang, Weilin Wan, Yue Yang et al.

ECCV 2024 arXiv:2403.13900
50
citations
#259

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Zijie Wu, Chaohui Yu, Yanqin Jiang et al.

ECCV 2024 arXiv:2404.03736
50
citations
#260

ReMamber: Referring Image Segmentation with Mamba Twister

Yuhuan Yang, Chaofan Ma, Jiangchao Yao et al.

ECCV 2024 arXiv:2403.17839
50
citations
#261

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis

Kevin Xie, Tianshi Cao, Jonathan P Lorraine et al.

ECCV 2024 arXiv:2403.15385
49
citations
#262

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Shicheng Li, Lei Li, Yi Liu et al.

ECCV 2024 arXiv:2311.17404
49
citations
#263

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

Yuru Jia, Lukas Hoyer, Shengyu Huang et al.

ECCV 2024 arXiv:2312.03048
49
citations
#264

Fully Sparse 3D Occupancy Prediction

Haisong Liu, Yang Chen, Haiguang Wang et al.

ECCV 2024 arXiv:2312.17118
49
citations
#265

TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models

Aditya Aravind Chinchure, Pushkar Shukla, Gaurav Bhatt et al.

ECCV 2024 arXiv:2312.01261
49
citations
#266

Watch Your Steps: Local Image and Scene Editing by Text Instructions

Ashkan Mirzaei, Tristan T Aumentado-Armstrong, Marcus A Brubaker et al.

ECCV 2024 arXiv:2308.08947
49
citations
#267

Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer

Yu Deng, Duomin Wang, Baoyuan Wang

ECCV 2024 arXiv:2403.13570
49
citations
#268

DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM

Yixuan Wu, Yizhou Wang, Shixiang Tang et al.

ECCV 2024 arXiv:2403.12488
48
citations
#269

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

Yixuan Ren, Yang Zhou, Jimei Yang et al.

ECCV 2024 arXiv:2402.14780
48
citations
#270

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Qiuhong Shen, Xingyi Yang, Xinchao Wang

ECCV 2024 arXiv:2409.08270
48
citations
#271

UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction

Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud et al.

ECCV 2024 arXiv:2403.15098
48
citations
#272

When Fast Fourier Transform Meets Transformer for Image Restoration

Xingyu Jiang, Xiuhui Zhang, Ning Gao et al.

ECCV 2024
48
citations
#273

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Yiming Zhao, Zhouhui Lian

ECCV 2024 arXiv:2312.04884
48
citations
#274

FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models

Andrea Caraffa, Davide Boscaini, Amir Hamza et al.

ECCV 2024 arXiv:2312.00947
47
citations
#275

Online Vectorized HD Map Construction using Geometry

Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding et al.

ECCV 2024 arXiv:2312.03341
47
citations
#276

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Tong Shao, Zhuotao Tian, Hang Zhao et al.

ECCV 2024 arXiv:2407.08268
47
citations
#277

RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

Wenhao Ding, Yulong Cao, Ding Zhao et al.

ECCV 2024 arXiv:2312.13303
47
citations
#278

GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval

Han Zhou, Wei Dong, Xiaohong Liu et al.

ECCV 2024 arXiv:2407.12431
47
citations
#279

CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs

Yassine Ouali, Adrian Bulat, Brais Martinez et al.

ECCV 2024 arXiv:2408.10433
47
citations
#280

RangeLDM: Fast Realistic LiDAR Point Cloud Generation

Qianjiang Hu, Zhimin Zhang, Wei Hu

ECCV 2024 arXiv:2403.10094
47
citations
#281

R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model

Changhoon Kim, Kyle Min, Yezhou Yang

ECCV 2024 arXiv:2405.16341
46
citations
#282

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

Xiaoshi Wu, Yiming Hao, Manyuan Zhang et al.

ECCV 2024 arXiv:2405.00760
46
citations
#283

Click-Gaussian: Interactive Segmentation to Any 3D Gaussians

Seokhun Choi, Hyeonseop Song, Jaechul Kim et al.

ECCV 2024 arXiv:2407.11793
46
citations
#284

MyVLM: Personalizing VLMs for User-Specific Queries

Yuval Alaluf, Elad Richardson, Sergey Tulyakov et al.

ECCV 2024 arXiv:2403.14599
46
citations
#285

LivePhoto: Real Image Animation with Text-guided Motion Control

Xi Chen, Zhiheng Liu, Mengting Chen et al.

ECCV 2024 arXiv:2312.02928
46
citations
#286

MagMax: Leveraging Model Merging for Seamless Continual Learning

Daniel Marczak, Bartlomiej Twardowski, Tomasz Trzcinski et al.

ECCV 2024 arXiv:2407.06322
46
citations
#287

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

Yang Zheng, Qingqing Zhao, Guandao Yang et al.

ECCV 2024 arXiv:2404.04421
46
citations
#288

BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

Rizhao Cai, Zirui Song, Dayan Guan et al.

ECCV 2024 arXiv:2312.02896
45
citations
#289

BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling

Cheng Peng, Yutao Tang, Yifan Zhou et al.

ECCV 2024 arXiv:2403.04926
45
citations
#290

DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model

Li Xiaofan, Zhang Yifu, Xiaoqing Ye

ECCV 2024
45
citations
#291

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

William Ljungbergh, Adam Tonderski, Joakim Johnander et al.

ECCV 2024 arXiv:2404.07762
45
citations
#292

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Letian Huang, Jiayang Bai, Jie Guo et al.

ECCV 2024 arXiv:2402.00752
45
citations
#293

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance

Yongwei Chen, Tengfei Wang, Tong Wu et al.

ECCV 2024 arXiv:2403.12409
45
citations
#294

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model

Yulin Luo, Ruichuan An, Bocheng Zou et al.

ECCV 2024 arXiv:2405.02363
45
citations
#295

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024 arXiv:2403.19435
44
citations
#296

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024 arXiv:2403.19588
44
citations
#297

Diffusion Reward: Learning Rewards via Conditional Video Diffusion

Tao Huang, Guangqi Jiang, Yanjie Ze et al.

ECCV 2024 arXiv:2312.14134
44
citations
#298

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024 arXiv:2403.18118
44
citations
#299

ParCo: Part-Coordinating Text-to-Motion Synthesis

Qiran Zou, Shangyuan Yuan, Shian Du et al.

ECCV 2024 arXiv:2403.18512
44
citations
#300

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024 arXiv:2312.07381
44
citations
#301

Zero-Shot Detection of AI-Generated Images

Davide Cozzolino, Giovanni Poggi, Matthias Niessner et al.

ECCV 2024 arXiv:2409.15875
44
citations
#302

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang et al.

ECCV 2024
43
citations
#303

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024 arXiv:2401.06825
43
citations
#304

TAPTR: Tracking Any Point with Transformers as Detection

Hongyang Li, Hao Zhang, Shilong Liu et al.

ECCV 2024 arXiv:2403.13042
42
citations
#305

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024 arXiv:2401.09112
42
citations
#306

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024 arXiv:2403.10893
42
citations
#307

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024 arXiv:2311.12897
42
citations
#308

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024 arXiv:2407.14143
41
citations
#309

Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution

Xi Yang, Chenhang He, Jianqi Ma et al.

ECCV 2024 arXiv:2312.00853
41
citations
#310

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024 arXiv:2409.09756
41
citations
#311

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Kunpeng Song, Yizhe Zhu, Bingchen Liu et al.

ECCV 2024 arXiv:2404.05674
41
citations
#312

MegaScenes: Scene-Level View Synthesis at Scale

Joseph Tung, Gene Chou, Ruojin Cai et al.

ECCV 2024 arXiv:2406.11819
40
citations
#313

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Zhijian Huang, Tao Tang, Shaoxiang Chen et al.

ECCV 2024 arXiv:2408.13890
40
citations
#314

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

Longxiang Tang, Zhuotao Tian, Kai Li et al.

ECCV 2024 arXiv:2407.05342
40
citations
#315

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai et al.

ECCV 2024 arXiv:2403.10050
40
citations
#316

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ECCV 2024 arXiv:2311.09999
40
citations
#317

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024 arXiv:2403.13129
40
citations
#318

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024 arXiv:2403.19589
40
citations
#319

Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks

Jing Wu, Mehrtash Harandi

ECCV 2024 arXiv:2401.06187
39
citations
#320

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Young Kyun Jang, Dat B Huynh, Ashish Shah et al.

ECCV 2024 arXiv:2405.00571
39
citations
#321

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Tan, Muhammad Ferjad Naeem et al.

ECCV 2024 arXiv:2311.16241
39
citations
#322

AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering

Xiuyuan Chen, Yuan Lin, Yuchen Zhang et al.

ECCV 2024 arXiv:2311.14906
39
citations
#323

GalLop: Learning global and local prompts for vision-language models

Marc Lafon, Elias Ramzi, Clément Rambour et al.

ECCV 2024 arXiv:2407.01400
39
citations
#324

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

Matteo Bortolon, Theodoros Tsesmelis, Stuart James et al.

ECCV 2024 arXiv:2407.15484
39
citations
#325

Decoupling Common and Unique Representations for Multimodal Self-supervised Learning

Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham et al.

ECCV 2024 arXiv:2309.05300
39
citations
#326

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

Chao Xu, Ang Li, Linghao Chen et al.

ECCV 2024 arXiv:2408.10195
39
citations
#327

SMooDi: Stylized Motion Diffusion Model

Lei Zhong, Yiming Xie, Varun Jampani et al.

ECCV 2024 arXiv:2407.12783
38
citations
#328

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

Junjie Guo, Chenqiang Gao, Fangcen Liu et al.

ECCV 2024 arXiv:2403.00326
38
citations
#329

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

ECCV 2024 arXiv:2407.10862
38
citations
#330

Diagnosing and Re-learning for Balanced Multimodal Learning

Yake Wei, Siwei Li, Ruoxuan Feng et al.

ECCV 2024 arXiv:2407.09705
38
citations
#331

Weighted Ensemble Models Are Strong Continual Learners

Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione et al.

ECCV 2024 arXiv:2312.08977
38
citations
#332

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mengting Chen, Xi Chen, Zhonghua Zhai et al.

ECCV 2024 arXiv:2403.12965
38
citations
#333

Solving Motion Planning Tasks with a Scalable Generative Model

Yihan Hu, Siqi Chai, Zhening Yang et al.

ECCV 2024 arXiv:2407.02797
38
citations
#334

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

Ke Fan, Junshu Tang, Weijian Cao et al.

ECCV 2024 arXiv:2405.15763
37
citations
#335

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting

Zhenglin Zhou, Fan Ma, Hehe Fan et al.

ECCV 2024 arXiv:2402.06149
37
citations
#336

MEVG: Multi-event Video Generation with Text-to-Video Models

Gyeongrok Oh, Jaehwan Jeong, Sieun Kim et al.

ECCV 2024 arXiv:2312.04086
37
citations
#337

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Zhao Tianchen, Xuefei Ning, Tongcheng Fang et al.

ECCV 2024 arXiv:2405.17873
37
citations
#338

Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

Jiyao Zhang, Weiyao Huang, Bo Peng et al.

ECCV 2024 arXiv:2406.04316
37
citations
#339

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

Zhen Zhao, Zicheng Wang, Dian Yu et al.

ECCV 2024 arXiv:2311.17325
37
citations
#340

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024 arXiv:2407.13761
37
citations
#341

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024 arXiv:2312.00937
37
citations
#342

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Kirolos Ataallah, Xiaoqian Shen, Eslam Mohamed Abdelrahman et al.

ECCV 2024 arXiv:2407.12679
37
citations
#343

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024 arXiv:2311.16567
36
citations
#344

DragVideo: Interactive Drag-style Video Editing

Yufan Deng, Ruida Wang, Yuhao Zhang et al.

ECCV 2024 arXiv:2312.02216
36
citations
#345

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Siyu Jiao, Hongguang Zhu, Yunchao Wei et al.

ECCV 2024 arXiv:2408.00744
36
citations
#346

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024 arXiv:2311.13627
36
citations
#347

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024 arXiv:2312.09128
36
citations
#348

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024 arXiv:2409.14444
36
citations
#349

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

Yuxuan Sun, Hao Wu, Chenglu Zhu et al.

ECCV 2024 arXiv:2401.16355
36
citations
#350

Pyramid Diffusion for Fine 3D Large Scene Generation

Yuheng Liu, Xinke Li, Xueting Li et al.

ECCV 2024 arXiv:2311.12085
36
citations
#351

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

Seung Hyun Lee, Yinxiao Li, Junjie Ke et al.

ECCV 2024 arXiv:2401.05675
36
citations
#352

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Marko Mihajlovic, Sergey Prokudin, Siyu Tang et al.

ECCV 2024 arXiv:2409.11211
36
citations
#353

m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Zixian Ma, Weikai Huang, Jieyu Zhang et al.

ECCV 2024 arXiv:2403.11085
36
citations
#354

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan Yang, Runyu Ding, Ellis L Brown et al.

ECCV 2024 arXiv:2402.03310
36
citations
#355

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024 arXiv:2403.18274
36
citations
#356

FunQA: Towards Surprising Video Comprehension

Binzhu Xie, Sicheng Zhang, Zitang Zhou et al.

ECCV 2024 arXiv:2306.14899
36
citations
#357

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024 arXiv:2407.13519
36
citations
#358

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024 arXiv:2407.01702
35
citations
#359

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024 arXiv:2305.14428
35
citations
#360

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024 arXiv:2310.06744
35
citations
#361

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024 arXiv:2408.05868
35
citations
#362

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024 arXiv:2403.05023
35
citations
#363

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024 arXiv:2408.14176
35
citations
#364

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024 arXiv:2311.16917
35
citations
#365

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024 arXiv:2406.14556
35
citations
#366

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024 arXiv:2404.03590
34
citations
#367

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024 arXiv:2403.12445
34
citations
#368

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024 arXiv:2407.12777
34
citations
#369

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024 arXiv:2311.11261
34
citations
#370

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024 arXiv:2402.03634
34
citations
#371

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024 arXiv:2403.18211
34
citations
#372

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024 arXiv:2403.07508
34
citations
#373

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024
34
citations
#374

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024 arXiv:2403.11781
34
citations
#375

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024 arXiv:2403.04697
34
citations
#376

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024 arXiv:2312.05295
34
citations
#377

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024 arXiv:2403.05659
34
citations
#378

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024 arXiv:2403.05049
34
citations
#379

Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

ECCV 2024 arXiv:2405.10589
34
citations
#380

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024 arXiv:2311.07152
34
citations
#381

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024 arXiv:2407.09983
34
citations
#382

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024 arXiv:2407.02077
33
citations
#383

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

Chenxin Li, Xinyu Liu, Cheng Wang et al.

ECCV 2024 arXiv:2407.05540
33
citations
#384

Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Jinxing Zhou, Dan Guo, Yuxin Mao et al.

ECCV 2024 arXiv:2407.08126
33
citations
#385

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.

ECCV 2024 arXiv:2404.03010
33
citations
#386

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

Zicong Fan, Takehiko Ohkawa, Linlin Yang et al.

ECCV 2024 arXiv:2403.16428
33
citations
#387

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Xinzhou Wang, Yikai Wang, Junliang Ye et al.

ECCV 2024 arXiv:2312.03795
33
citations
#388

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

Yuxin Wang, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024 arXiv:2404.13679
33
citations
#389

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024 arXiv:2310.04900
33
citations
#390

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

Bowen Zhang, Yiji Cheng, Chunyu Wang et al.

ECCV 2024 arXiv:2407.06938
32
citations
#391

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Ye Liu, Jixuan He, Wanhua Li et al.

ECCV 2024 arXiv:2404.00801
32
citations
#392

Exact Diffusion Inversion via Bidirectional Integration Approximation

Guoqiang Zhang, J.P. Lewis, W. Bastiaan Kleijn

ECCV 2024
32
citations
#393

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross et al.

ECCV 2024 arXiv:2404.08580
32
citations
#394

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jingyi Lu, Xinghui Li, Kai Han

ECCV 2024 arXiv:2407.18247
32
citations
#395

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024 arXiv:2312.08372
32
citations
#396

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.

ECCV 2024 arXiv:2407.20337
32
citations
#397

ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling

Siming Yan, Min Bai, Weifeng Chen et al.

ECCV 2024 arXiv:2402.06118
32
citations
#398

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

Xinyuan Gao, Songlin Dong, Yuhang He et al.

ECCV 2024 arXiv:2407.10281
32
citations
#399

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

Ye Junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024 arXiv:2408.05475
32
citations
#400

Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention

Zuyao Chen, Jinlin Wu, Zhen Lei et al.

ECCV 2024 arXiv:2311.10988
31
citations