Most Cited ECCV "unsafe image generation" Papers

2,387 papers found • Page 2 of 12

#201

Local All-Pair Correspondence for Point Tracking

Seokju Cho, Jiahui Huang, Jisu Nam et al.

ECCV 2024arXiv:2407.15420

citations

#202

QUAR-VLA: Vision-Language-Action Model for Quadruped Robots

Pengxiang Ding, Han Zhao, Wenjie Zhang et al.

ECCV 2024arXiv:2312.14457

citations

#203

Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Models

Yuchen Yang, Kwonjoon Lee, Behzad Dariush et al.

ECCV 2024arXiv:2407.10299

citations

#204

TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting

Jiahe Li, Jiawei Zhang, Xiao Bai et al.

ECCV 2024arXiv:2404.15264

citations

#205

VeCLIP: Improving CLIP Training via Visual-enriched Captions

Zhengfeng Lai, Haotian Zhang, Bowen Zhang et al.

ECCV 2024arXiv:2310.07699

citations

#206

DQ-DETR: DETR with Dynamic Query for Tiny Object Detection

Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai et al.

ECCV 2024arXiv:2404.03507

citations

#207

Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control

Yue Han, Junwei Zhu, Keke He et al.

ECCV 2024arXiv:2405.12970

citations

#208

Diffusion Models for Open-Vocabulary Segmentation

Laurynas Karazija, Iro Laina, Andrea Vedaldi et al.

ECCV 2024arXiv:2306.09316

citations

#209

AdvDiff: Generating Unrestricted Adversarial Examples using Diffusion Models

Xuelong Dai, Kaisheng Liang, Bin Xiao

ECCV 2024arXiv:2307.12499

citations

#210

ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion

Daniel Winter, Matan Cohen, Shlomi Fruchter et al.

ECCV 2024arXiv:2403.18818

citations

#211

SILC: Improving Vision Language Pretraining with Self-Distillation

Muhammad Ferjad Naeem, Yongqin Xian, Xiaohua Zhai et al.

ECCV 2024arXiv:2310.13355

citations

#212

SAM4MLLM: Enhance Multi-Modal Large Language Model for Referring Expression Segmentation

Yi-Chia Chen, WeiHua Li, Cheng Sun et al.

ECCV 2024arXiv:2409.10542

citations

#213

GLAD: Towards Better Reconstruction with Global and Local Adaptive Diffusion Models for Unsupervised Anomaly Detection

Hang Yao, Ming Liu, Zhicun Yin et al.

ECCV 2024arXiv:2406.07487

citations

#214

BRAVE: Broadening the visual encoding of vision-language models

Oguzhan Fatih Kar, Alessio Tonioni, Petra Poklukar et al.

ECCV 2024arXiv:2404.07204

citations

#215

VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

Zhen Qu, Xian Tao, Mukesh Prasad et al.

ECCV 2024arXiv:2407.12276

citations

#216

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Omer Dahary, Or Patashnik, Kfir Aberman et al.

ECCV 2024arXiv:2403.16990

citations

#217

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Xi Chen, Sida Peng, Dongchen Yang et al.

ECCV 2024arXiv:2404.11593

citations

#218

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Zhaoyang Liu, Zeqiang Lai, Zhangwei Gao et al.

ECCV 2024arXiv:2310.17796

citations

#219

Lane Graph as Path: Continuity-preserving Path-wise Modeling for Online Lane Graph Construction

Bencheng Liao, Shaoyu Chen, Bo Jiang et al.

ECCV 2024arXiv:2303.08815

citations

#220

Improving 2D Feature Representations by 3D-Aware Fine-Tuning

Yuanwen Yue, Anurag Das, Francis Engelmann et al.

ECCV 2024arXiv:2407.20229

citations

#221

View-Consistent 3D Editing with Gaussian Splatting

Yuxuan Wang, Xuanyu Yi, Zike Wu et al.

ECCV 2024arXiv:2403.11868

citations

#222

OmniSat: Self-Supervised Modality Fusion for Earth Observation

Guillaume Astruc, Nicolas Gonthier, Clement Mallet et al.

ECCV 2024arXiv:2404.08351

citations

#223

Analytic-Splatting: Anti-Aliased 3D Gaussian Splatting via Analytic Integration

Zhihao Liang, Qi Zhang, Wenbo Hu et al.

ECCV 2024arXiv:2403.11056

citations

#224

Latent Guard: a Safety Framework for Text-to-image Generation

Runtao Liu, Ashkan Khakzar, Jindong Gu et al.

ECCV 2024arXiv:2404.08031

citations

#225

HiT-SR: Hierarchical Transformer for Efficient Image Super-Resolution

Xiang Zhang, Yulun Zhang, Fisher Yu

ECCV 2024arXiv:2407.05878

citations

#226

SWAG: Splatting in the Wild images with Appearance-conditioned Gaussians

Hiba Dahmani, Moussab Bennehar, Nathan Piasco et al.

ECCV 2024arXiv:2403.10427

citations

#227

Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis

Yuanhao Cai, Yixun Liang, Jiahao Wang et al.

ECCV 2024arXiv:2403.04116

citations

#228

HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting

Helisa Dhamo, Yinyu Nie, Arthur Moreau et al.

ECCV 2024arXiv:2312.02902

citations

#229

LaRa: Efficient Large-Baseline Radiance Fields

Anpei Chen, Haofei Xu, Stefano Esposito et al.

ECCV 2024arXiv:2407.04699

citations

#230

GeoGaussian: Geometry-aware Gaussian Splatting for Scene Rendering

Yanyan Li, Chenyu Lyu, Yan Di et al.

ECCV 2024arXiv:2403.11324

citations

#231

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Kangle Deng, Timothy Omernick, Alexander B Weiss et al.

ECCV 2024arXiv:2402.13251

citations

#232

See and Think: Embodied Agent in Virtual Environment

Zhonghan Zhao, Xuan Wang, Wenhao Chai et al.

ECCV 2024arXiv:2311.15209

citations

#233

Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering

Zeyu Liu, Weicong Liang, Zhanhao Liang et al.

ECCV 2024arXiv:2403.09622

citations

#234

MapTracker: Tracking with Strided Memory Fusion for Consistent Vector HD Mapping

Jiacheng Chen, Yuefan Wu, Tan Jiaqi et al.

ECCV 2024arXiv:2403.15951

citations

#235

VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models

Junlin Han, Filippos Kokkinos, Philip Torr

ECCV 2024arXiv:2403.12034

citations

#236

VideoStudio: Generating Consistent-Content and Multi-Scene Videos

Fuchen Long, Zhaofan Qiu, Ting Yao et al.

ECCV 2024arXiv:2401.01256

citations

#237

Embodied Understanding of Driving Scenarios

Yunsong Zhou, Linyan Huang, Qingwen Bu et al.

ECCV 2024arXiv:2403.04593

citations

#238

Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector

Yuqian Fu, Yu Wang, Yixuan Pan et al.

ECCV 2024arXiv:2402.03094

citations

#239

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.

ECCV 2024arXiv:2305.16037

citations

#240

EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion

Guangyao Zhai, Evin Pınar Örnek, Dave Zhenyu Chen et al.

ECCV 2024arXiv:2405.00915

citations

#241

A Comprehensive Study of Multimodal Large Language Models for Image Quality Assessment

Tianhe Wu, Kede Ma, Jie Liang et al.

ECCV 2024arXiv:2403.10854

citations

#242

A Comparative Study of Image Restoration Networks for General Backbone Network Design

Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.

ECCV 2024arXiv:2310.11881

citations

#243

Contrastive Region Guidance: Improving Grounding in Vision-Language Models without Training

David Wan, Jaemin Cho, Elias Stengel-Eskin et al.

ECCV 2024arXiv:2403.02325

citations

#244

ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities

Chenming Zhu, Tai Wang, Wenwei Zhang et al.

ECCV 2024arXiv:2407.01525

citations

#245

Unveiling and Mitigating Memorization in Text-to-image Diffusion Models through Cross Attention

Jie Ren, Yaxin Li, Shenglai Zeng et al.

ECCV 2024arXiv:2403.11052

citations

#246

GraphBEV: Towards Robust BEV Feature Alignment for Multi-Modal 3D Object Detection

Ziying Song, Lei Yang, Shaoqing Xu et al.

ECCV 2024arXiv:2403.11848

citations

#247

LCM-Lookahead for Encoder-based Text-to-Image Personalization

Rinon Gal, Or Lichter, Elad Richardson et al.

ECCV 2024arXiv:2404.03620

citations

#248

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation

Lanqing Guo, Yingqing He, Haoxin Chen et al.

ECCV 2024arXiv:2402.10491

citations

#249

Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning

Chongyu Fan, Jiancheng Liu, Alfred Hero et al.

ECCV 2024arXiv:2403.07362

citations

#250

Deep Patch Visual SLAM

Lahav Lipson, Zachary Teed, Jia Deng

ECCV 2024arXiv:2408.01654

citations

#251

ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions

Anindita Ghosh, Rishabh Dabral, Vladislav Golyanik et al.

ECCV 2024arXiv:2311.17057

citations

#252

UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation

Zexiang Liu, Yangguang Li, Youtian Lin et al.

ECCV 2024arXiv:2312.08754

citations

#253

GVGEN: Text-to-3D Generation with Volumetric Representation

Xianglong He, Junyi Chen, Sida Peng et al.

ECCV 2024arXiv:2403.12957

citations

#254

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024arXiv:2403.12963

citations

#255

CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios

Qilang Ye, Zitong Yu, Rui Shao et al.

ECCV 2024arXiv:2403.04640

citations

#256

Leveraging Enhanced Queries of Point Sets for Vectorized Map Construction

Zihao Liu, Xiaoyu Zhang, Guangwei Liu et al.

ECCV 2024arXiv:2402.17430

citations

#257

Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection

Christos Koutlis, Symeon Papadopoulos

ECCV 2024arXiv:2402.19091

citations

#258

CoMo: Controllable Motion Generation through Language Guided Pose Code Editing

Yiming Huang, Weilin Wan, Yue Yang et al.

ECCV 2024arXiv:2403.13900

citations

#259

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer

Zijie Wu, Chaohui Yu, Yanqin Jiang et al.

ECCV 2024arXiv:2404.03736

citations

#260

ReMamber: Referring Image Segmentation with Mamba Twister

Yuhuan Yang, Chaofan Ma, Jiangchao Yao et al.

ECCV 2024arXiv:2403.17839

citations

#261

LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis

Kevin Xie, Tianshi Cao, Jonathan P Lorraine et al.

ECCV 2024arXiv:2403.15385

citations

#262

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

Shicheng Li, Lei Li, Yi Liu et al.

ECCV 2024arXiv:2311.17404

citations

#263

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

Yuru Jia, Lukas Hoyer, Shengyu Huang et al.

ECCV 2024arXiv:2312.03048

citations

#264

Fully Sparse 3D Occupancy Prediction

Haisong Liu, Yang Chen, Haiguang Wang et al.

ECCV 2024arXiv:2312.17118

citations

#265

TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models

Aditya Aravind Chinchure, Pushkar Shukla, Gaurav Bhatt et al.

ECCV 2024arXiv:2312.01261

citations

#266

Watch Your Steps: Local Image and Scene Editing by Text Instructions

Ashkan Mirzaei, Tristan T Aumentado-Armstrong, Marcus A Brubaker et al.

ECCV 2024arXiv:2308.08947

citations

#267

Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer

Yu Deng, Duomin Wang, Baoyuan Wang

ECCV 2024arXiv:2403.13570

citations

#268

DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM

Yixuan Wu, Yizhou Wang, Shixiang Tang et al.

ECCV 2024arXiv:2403.12488

citations

#269

Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models

Yixuan Ren, Yang Zhou, Jimei Yang et al.

ECCV 2024arXiv:2402.14780

citations

#270

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Qiuhong Shen, Xingyi Yang, Xinchao Wang

ECCV 2024arXiv:2409.08270

citations

#271

UniTraj: A Unified Framework for Scalable Vehicle Trajectory Prediction

Lan Feng, Mohammadhossein Bahari, Kaouther Messaoud et al.

ECCV 2024arXiv:2403.15098

citations

#272

When Fast Fourier Transform Meets Transformer for Image Restoration

Xingyu Jiang, Xiuhui Zhang, Ning Gao et al.

ECCV 2024

citations

#273

UDiffText: A Unified Framework for High-quality Text Synthesis in Arbitrary Images via Character-aware Diffusion Models

Yiming Zhao, Zhouhui Lian

ECCV 2024arXiv:2312.04884

citations

#274

FreeZe: Training-free zero-shot 6D pose estimation with geometric and vision foundation models

Andrea Caraffa, Davide Boscaini, Amir Hamza et al.

ECCV 2024arXiv:2312.00947

citations

#275

Online Vectorized HD Map Construction using Geometry

Zhixin Zhang, Yiyuan Zhang, Xiaohan Ding et al.

ECCV 2024arXiv:2312.03341

citations

#276

Explore the Potential of CLIP for Training-Free Open Vocabulary Semantic Segmentation

Tong Shao, Zhuotao Tian, Hang Zhao et al.

ECCV 2024arXiv:2407.08268

citations

#277

RealGen: Retrieval Augmented Generation for Controllable Traffic Scenarios

Wenhao Ding, Yulong Cao, Ding Zhao et al.

ECCV 2024arXiv:2312.13303

citations

#278

GLARE: Low Light Image Enhancement via Generative Latent Feature based Codebook Retrieval

Han Zhou, Wei Dong, Xiaohong Liu et al.

ECCV 2024arXiv:2407.12431

citations

#279

CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs

Yassine Ouali, Adrian Bulat, Brais Martinez et al.

ECCV 2024arXiv:2408.10433

citations

#280

RangeLDM: Fast Realistic LiDAR Point Cloud Generation

Qianjiang Hu, Zhimin Zhang, Wei Hu

ECCV 2024arXiv:2403.10094

citations

#281

R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model

Changhoon Kim, Kyle Min, Yezhou Yang

ECCV 2024arXiv:2405.16341

citations

#282

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

Xiaoshi Wu, Yiming Hao, Manyuan Zhang et al.

ECCV 2024arXiv:2405.00760

citations

#283

Click-Gaussian: Interactive Segmentation to Any 3D Gaussians

Seokhun Choi, Hyeonseop Song, Jaechul Kim et al.

ECCV 2024arXiv:2407.11793

citations

#284

MyVLM: Personalizing VLMs for User-Specific Queries

Yuval Alaluf, Elad Richardson, Sergey Tulyakov et al.

ECCV 2024arXiv:2403.14599

citations

#285

LivePhoto: Real Image Animation with Text-guided Motion Control

Xi Chen, Zhiheng Liu, Mengting Chen et al.

ECCV 2024arXiv:2312.02928

citations

#286

MagMax: Leveraging Model Merging for Seamless Continual Learning

Daniel Marczak, Bartlomiej Twardowski, Tomasz Trzcinski et al.

ECCV 2024arXiv:2407.06322

citations

#287

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

Yang Zheng, Qingqing Zhao, Guandao Yang et al.

ECCV 2024arXiv:2404.04421

citations

#288

BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

Rizhao Cai, Zirui Song, Dayan Guan et al.

ECCV 2024arXiv:2312.02896

citations

#289

BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling

Cheng Peng, Yutao Tang, Yifan Zhou et al.

ECCV 2024arXiv:2403.04926

citations

#290

DrivingDiffusion: Layout-Guided Multi-View Driving Scenarios Video Generation with Latent Diffusion Model

Li Xiaofan, Zhang Yifu, Xiaoqing Ye

ECCV 2024

citations

#291

NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving

William Ljungbergh, Adam Tonderski, Joakim Johnander et al.

ECCV 2024arXiv:2404.07762

citations

#292

On the Error Analysis of 3D Gaussian Splatting and an Optimal Projection Strategy

Letian Huang, Jiayang Bai, Jie Guo et al.

ECCV 2024arXiv:2402.00752

citations

#293

ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance

Yongwei Chen, Tengfei Wang, Tong Wu et al.

ECCV 2024arXiv:2403.12409

citations

#294

LLM as Dataset Analyst: Subpopulation Structure Discovery with Large Language Model

Yulin Luo, Ruichuan An, Bocheng Zou et al.

ECCV 2024arXiv:2405.02363

citations

#295

BAMM: Bidirectional Autoregressive Motion Model

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Pu Wang et al.

ECCV 2024arXiv:2403.19435

citations

#296

DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs

Donghyun Kim, Byeongho Heo, Dongyoon Han

ECCV 2024arXiv:2403.19588

citations

#297

Diffusion Reward: Learning Rewards via Conditional Video Diffusion

Tao Huang, Guangqi Jiang, Yanjie Ze et al.

ECCV 2024arXiv:2312.14134

citations

#298

EgoLifter: Open-world 3D Segmentation for Egocentric Perception

Qiao Gu, Zhaoyang Lv, Duncan Frost et al.

ECCV 2024arXiv:2403.18118

citations

#299

ParCo: Part-Coordinating Text-to-Motion Synthesis

Qiran Zou, Shangyuan Yuan, Shian Du et al.

ECCV 2024arXiv:2403.18512

citations

#300

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Hallee E. Wong, Marianne Rakic, John Guttag et al.

ECCV 2024arXiv:2312.07381

citations

#301

Zero-Shot Detection of AI-Generated Images

Davide Cozzolino, Giovanni Poggi, Matthias Niessner et al.

ECCV 2024arXiv:2409.15875

citations

#302

Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)

Qifeng Li, Xiaosong Jia, Shaobo Wang et al.

ECCV 2024

citations

#303

Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen et al.

ECCV 2024arXiv:2401.06825

citations

#304

TAPTR: Tracking Any Point with Transformers as Detection

Hongyang Li, Hao Zhang, Shilong Liu et al.

ECCV 2024arXiv:2403.13042

citations

#305

Stream Query Denoising for Vectorized HD-Map Construction

Shuo Wang, Fan Jia, Weixin Mao et al.

ECCV 2024arXiv:2401.09112

citations

#306

A Watermark-Conditioned Diffusion Model for IP Protection

Rui Min, Sen Li, Hongyang Chen et al.

ECCV 2024arXiv:2403.10893

citations

#307

A Compact Dynamic 3D Gaussian Representation for Real-Time Dynamic View Synthesis

Kai Katsumata, Duc Minh Vo, Hideki Nakayama

ECCV 2024arXiv:2311.12897

citations

#308

Class-Incremental Learning with CLIP: Adaptive Representation Adjustment and Parameter Fusion

Linlan Huang, Xusheng Cao, Haori Lu et al.

ECCV 2024arXiv:2407.14143

citations

#309

Motion-Guided Latent Diffusion for Temporally Consistent Real-world Video Super-resolution

Xi Yang, Chenhang He, Jianqi Ma et al.

ECCV 2024arXiv:2312.00853

citations

#310

MesonGS: Post-training Compression of 3D Gaussians via Efficient Attribute Transformation

Shuzhao Xie, Weixiang Zhang, Chen Tang et al.

ECCV 2024arXiv:2409.09756

citations

#311

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

Kunpeng Song, Yizhe Zhu, Bingchen Liu et al.

ECCV 2024arXiv:2404.05674

citations

#312

MegaScenes: Scene-Level View Synthesis at Scale

Joseph Tung, Gene Chou, Ruojin Cai et al.

ECCV 2024arXiv:2406.11819

citations

#313

Making Large Language Models Better Planners with Reasoning-Decision Alignment

Zhijian Huang, Tao Tang, Shaoxiang Chen et al.

ECCV 2024arXiv:2408.13890

citations

#314

Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

Longxiang Tang, Zhuotao Tian, Kai Li et al.

ECCV 2024arXiv:2407.05342

citations

#315

Texture-GS: Disentangle the Geometry and Texture for 3D Gaussian Splatting Editing

Tian-Xing Xu, Wenbo Hu, Yu-Kun Lai et al.

ECCV 2024arXiv:2403.10050

citations

#316

TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ECCV 2024arXiv:2311.09999

citations

#317

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljoša Ošep, Tim Meinhardt, Francesco Ferroni et al.

ECCV 2024arXiv:2403.13129

citations

#318

TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes

Bu Jin, Yupeng Zheng, Pengfei Li et al.

ECCV 2024arXiv:2403.19589

citations

#319

Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks

Jing Wu, Mehrtash Harandi

ECCV 2024arXiv:2401.06187

citations

#320

Spherical Linear Interpolation and Text-Anchoring for Zero-shot Composed Image Retrieval

Young Kyun Jang, Dat B Huynh, Ashish Shah et al.

ECCV 2024arXiv:2405.00571

citations

#321

SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance

Lukas Hoyer, David Tan, Muhammad Ferjad Naeem et al.

ECCV 2024arXiv:2311.16241

citations

#322

AutoEval-Video: An Automatic Benchmark for Assessing Large Vision Language Models in Open-Ended Video Question Answering

Xiuyuan Chen, Yuan Lin, Yuchen Zhang et al.

ECCV 2024arXiv:2311.14906

citations

#323

GalLop: Learning global and local prompts for vision-language models

Marc Lafon, Elias Ramzi, Clément Rambour et al.

ECCV 2024arXiv:2407.01400

citations

#324

6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model

Matteo Bortolon, Theodoros Tsesmelis, Stuart James et al.

ECCV 2024arXiv:2407.15484

citations

#325

Decoupling Common and Unique Representations for Multimodal Self-supervised Learning

Yi Wang, Conrad M Albrecht, Nassim Ait Ali Braham et al.

ECCV 2024arXiv:2309.05300

citations

#326

SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views

Chao Xu, Ang Li, Linghao Chen et al.

ECCV 2024arXiv:2408.10195

citations

#327

SMooDi: Stylized Motion Diffusion Model

Lei Zhong, Yiming Xie, Varun Jampani et al.

ECCV 2024arXiv:2407.12783

citations

#328

DAMSDet: Dynamic Adaptive Multispectral Detection Transformer with Competitive Query Selection and Adaptive Feature Fusion

Junjie Guo, Chenqiang Gao, Fangcen Liu et al.

ECCV 2024arXiv:2403.00326

citations

#329

R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection

Zheyuan Zhou, Wang Le, Naiyu Fang et al.

ECCV 2024arXiv:2407.10862

citations

#330

Diagnosing and Re-learning for Balanced Multimodal Learning

Yake Wei, Siwei Li, Ruoxuan Feng et al.

ECCV 2024arXiv:2407.09705

citations

#331

Weighted Ensemble Models Are Strong Continual Learners

Imad Eddine Marouf, Subhankar Roy, Enzo Tartaglione et al.

ECCV 2024arXiv:2312.08977

citations

#332

Wear-Any-Way: Manipulable Virtual Try-on via Sparse Correspondence Alignment

Mengting Chen, Xi Chen, Zhonghua Zhai et al.

ECCV 2024arXiv:2403.12965

citations

#333

Solving Motion Planning Tasks with a Scalable Generative Model

Yihan Hu, Siqi Chai, Zhening Yang et al.

ECCV 2024arXiv:2407.02797

citations

#334

FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis

Ke Fan, Junshu Tang, Weijian Cao et al.

ECCV 2024arXiv:2405.15763

citations

#335

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting

Zhenglin Zhou, Fan Ma, Hehe Fan et al.

ECCV 2024arXiv:2402.06149

citations

#336

MEVG: Multi-event Video Generation with Text-to-Video Models

Gyeongrok Oh, Jaehwan Jeong, Sieun Kim et al.

ECCV 2024arXiv:2312.04086

citations

#337

MixDQ: Memory-Efficient Few-Step Text-to-Image Diffusion Models with Metric-Decoupled Mixed Precision Quantization

Zhao Tianchen, Xuefei Ning, Tongcheng Fang et al.

ECCV 2024arXiv:2405.17873

citations

#338

Omni6DPose: A Benchmark and Model for Universal 6D Object Pose Estimation and Tracking

Jiyao Zhang, Weiyao Huang, Bo Peng et al.

ECCV 2024arXiv:2406.04316

citations

#339

Alternate Diverse Teaching for Semi-supervised Medical Image Segmentation

Zhen Zhao, Zicheng Wang, Dian Yu et al.

ECCV 2024arXiv:2311.17325

citations

#340

SegPoint: Segment Any Point Cloud via Large Language Model

Shuting He, Henghui Ding, Xudong Jiang et al.

ECCV 2024arXiv:2407.13761

citations

#341

Video Question Answering with Procedural Programs

Rohan Choudhury, Koichiro Niinuma, Kris Kitani et al.

ECCV 2024arXiv:2312.00937

citations

#342

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos

Kirolos Ataallah, Xiaoqian Shen, Eslam Mohamed Abdelrahman et al.

ECCV 2024arXiv:2407.12679

citations

#343

MobileDiffusion: Instant Text-to-Image Generation on Mobile Devices

Yang Zhao, Zhisheng Xiao, Yanwu Xu et al.

ECCV 2024arXiv:2311.16567

citations

#344

DragVideo: Interactive Drag-style Video Editing

Yufan Deng, Ruida Wang, Yuhao Zhang et al.

ECCV 2024arXiv:2312.02216

citations

#345

Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation

Siyu Jiao, Hongguang Zhu, Yunchao Wei et al.

ECCV 2024arXiv:2408.00744

citations

#346

Vamos: Versatile Action Models for Video Understanding

Shijie Wang, Qi Zhao, Minh Quan et al.

ECCV 2024arXiv:2311.13627

citations

#347

Tokenize Anything via Prompting

Ting Pan, Lulu Tang, Xinlong Wang et al.

ECCV 2024arXiv:2312.09128

citations

#348

Fake It till You Make It: Curricular Dynamic Forgery Augmentations towards General Deepfake Detection

Yuzhen Lin, Wentang Song, Bin Li et al.

ECCV 2024arXiv:2409.14444

citations

#349

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

Yuxuan Sun, Hao Wu, Chenglu Zhu et al.

ECCV 2024arXiv:2401.16355

citations

#350

Pyramid Diffusion for Fine 3D Large Scene Generation

Yuheng Liu, Xinke Li, Xueting Li et al.

ECCV 2024arXiv:2311.12085

citations

#351

Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation

Seung Hyun Lee, Yinxiao Li, Junjie Ke et al.

ECCV 2024arXiv:2401.05675

citations

#352

SplatFields: Neural Gaussian Splats for Sparse 3D and 4D Reconstruction

Marko Mihajlovic, Sergey Prokudin, Siyu Tang et al.

ECCV 2024arXiv:2409.11211

citations

#353

m&m’s: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks

Zixian Ma, Weikai Huang, Jieyu Zhang et al.

ECCV 2024arXiv:2403.11085

citations

#354

V-IRL: Grounding Virtual Intelligence in Real Life

Jihan Yang, Runyu Ding, Ellis L Brown et al.

ECCV 2024arXiv:2402.03310

citations

#355

DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment

Jiuming Liu, Dong Zhuo, Zhiheng Feng et al.

ECCV 2024arXiv:2403.18274

citations

#356

FunQA: Towards Surprising Video Comprehension

Binzhu Xie, Sicheng Zhang, Zitang Zhou et al.

ECCV 2024arXiv:2306.14899

citations

#357

GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding

Changshuo Wang, Meiqing Wu, Siew-Kei Lam et al.

ECCV 2024arXiv:2407.13519

citations

#358

SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving

Qingwen Zhang, Yi Yang, Peizheng Li et al.

ECCV 2024arXiv:2407.01702

citations

#359

Prompting Language-Informed Distribution for Compositional Zero-Shot Learning

Wentao Bao, Lichang Chen, Heng Huang et al.

ECCV 2024arXiv:2305.14428

citations

#360

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024arXiv:2310.06744

citations

#361

LaWa: Using Latent Space for In-Generation Image Watermarking

Ahmad Rezaei, Mohammad Akbari, Saeed Ranjbar Alvar et al.

ECCV 2024arXiv:2408.05868

citations

#362

Towards Multimodal Sentiment Analysis Debiasing via Bias Purification

Dingkang Yang, Mingcheng Li, Dongling Xiao et al.

ECCV 2024arXiv:2403.05023

citations

#363

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

Trung Dao, Thuan Nguyen, Thanh Van Le et al.

ECCV 2024arXiv:2408.14176

citations

#364

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

ECCV 2024arXiv:2311.16917

citations

#365

Asynchronous Large Language Model Enhanced Planner for Autonomous Driving

Yuan Chen, Zi-han Ding, Ziqin Wang et al.

ECCV 2024arXiv:2406.14556

citations

#366

SemGrasp: Semantic Grasp Generation via Language Aligned Discretization

Kailin Li, Jingbo Wang, Lixin Yang et al.

ECCV 2024arXiv:2404.03590

citations

#367

Boosting Transferability in Vision-Language Attacks via Diversification along the Intersection Region of Adversarial Trajectory

Sensen Gao, Xiaojun Jia, Xuhong Ren et al.

ECCV 2024arXiv:2403.12445

citations

#368

Generalizable Human Gaussians for Sparse View Synthesis

Youngjoong Kwon, Baole Fang, Yixing Lu et al.

ECCV 2024arXiv:2407.12777

citations

#369

Adversarial Prompt Tuning for Vision-Language Models

Jiaming Zhang, Xingjun Ma, Xin Wang et al.

ECCV 2024arXiv:2311.11261

citations

#370

Ray Denoising: Depth-aware Hard Negative Sampling for Multi-view 3D Object Detection

Feng Liu, Tengteng Huang, Qianjing Zhang et al.

ECCV 2024arXiv:2402.03634

citations

#371

NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation

Jingyang Huo, Yikai Wang, Yanwei Fu et al.

ECCV 2024arXiv:2403.18211

citations

#372

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Byung-Kwan Lee, Beomchan Park, Chae Won Kim et al.

ECCV 2024arXiv:2403.07508

citations

#373

LingoQA: Video Question Answering for Autonomous Driving

Ana-Maria Marcu, Long Chen, Jan Hünermann et al.

ECCV 2024

citations

#374

Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm

Yi Wu, Ziqiang Li, Heliang Zheng et al.

ECCV 2024arXiv:2403.11781

citations

#375

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

Kaishen Yuan, Zitong Yu, Xin Liu et al.

ECCV 2024arXiv:2403.04697

citations

#376

Disentangled Clothed Avatar Generation from Text Descriptions

Jionghao Wang, Yuan Liu, Zhiyang Dou et al.

ECCV 2024arXiv:2312.05295

citations

#377

Audio-Synchronized Visual Animation

Lin Zhang, Shentong Mo, Yijing Zhang et al.

ECCV 2024arXiv:2403.05659

citations

#378

XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution

Yunpeng Qu, Kun Yuan, Kai Zhao et al.

ECCV 2024arXiv:2403.05049

citations

#379

Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

ECCV 2024arXiv:2405.10589

citations

#380

Detecting As Labeling: Rethinking LiDAR-camera Fusion in 3D Object Detection

Junjie Huang, Yun Ye, Zhujin Liang et al.

ECCV 2024arXiv:2311.07152

citations

#381

WeConvene: Learned Image Compression with Wavelet-Domain Convolution and Entropy Model

Haisheng Fu, Jie Liang, Zhenman Fang et al.

ECCV 2024arXiv:2407.09983

citations

#382

Hierarchical Temporal Context Learning for Camera-based Semantic Scene Completion

Bohan Li, Jiajun Deng, Wenyao Zhang et al.

ECCV 2024arXiv:2407.02077

citations

#383

GTP-4o: Modality-prompted Heterogeneous Graph Learning for Omni-modal Biomedical Representation

Chenxin Li, Xinyu Liu, Cheng Wang et al.

ECCV 2024arXiv:2407.05540

citations

#384

Label-anticipated Event Disentanglement for Audio-Visual Video Parsing

Jinxing Zhou, Dan Guo, Yuxin Mao et al.

ECCV 2024arXiv:2407.08126

citations

#385

Skeleton Recall Loss for Connectivity Conserving and Resource Efficient Segmentation of Thin Tubular Structures

Yannick Kirchhoff, Maximilian Rokuss, Saikat Roy et al.

ECCV 2024arXiv:2404.03010

citations

#386

Benchmarks and Challenges in Pose Estimation for Egocentric Hand Interactions with Objects

Zicong Fan, Takehiko Ohkawa, Linlin Yang et al.

ECCV 2024arXiv:2403.16428

citations

#387

AnimatableDreamer: Text-Guided Non-rigid 3D Model Generation and Reconstruction with Canonical Score Distillation

Xinzhou Wang, Yikai Wang, Junliang Ye et al.

ECCV 2024arXiv:2312.03795

citations

#388

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal

Yuxin Wang, Qianyi Wu, Guofeng Zhang et al.

ECCV 2024arXiv:2404.13679

citations

#389

HowToCaption: Prompting LLMs to Transform Video Annotations at Scale

Nina Shvetsova, Anna Kukleva, Xudong Hong et al.

ECCV 2024arXiv:2310.04900

citations

#390

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

Bowen Zhang, Yiji Cheng, Chunyu Wang et al.

ECCV 2024arXiv:2407.06938

citations

#391

R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Ye Liu, Jixuan He, Wanhua Li et al.

ECCV 2024arXiv:2404.00801

citations

#392

Exact Diffusion Inversion via Bidirectional Integration Approximation

Guoqiang Zhang, J.P. Lewis, W. Bastiaan Kleijn

ECCV 2024

citations

#393

Lossy Image Compression with Foundation Diffusion Models

Lucas Relic, Roberto Azevedo, Markus Gross et al.

ECCV 2024arXiv:2404.08580

citations

#394

RegionDrag: Fast Region-Based Image Editing with Diffusion Models

Jingyi Lu, Xinghui Li, Kai Han

ECCV 2024arXiv:2407.18247

citations

#395

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024arXiv:2312.08372

citations

#396

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia et al.

ECCV 2024arXiv:2407.20337

citations

#397

ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling

Siming Yan, Min Bai, Weifeng Chen et al.

ECCV 2024arXiv:2402.06118

citations

#398

Beyond Prompt Learning: Continual Adapter for Efficient Rehearsal-Free Continual Learning

Xinyuan Gao, Songlin Dong, Yuhang He et al.

ECCV 2024arXiv:2407.10281

citations

#399

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

Ye Junyan, Zhutao Lv, Li Weijia et al.

ECCV 2024arXiv:2408.05475

citations

#400

Expanding Scene Graph Boundaries: Fully Open-vocabulary Scene Graph Generation via Visual-Concept Alignment and Retention

Zuyao Chen, Jinlin Wu, Zhen Lei et al.

ECCV 2024arXiv:2311.10988

citations
