ICCV Highlight Papers
263 papers found
DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing
Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.
Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography
Jianing Zhang, Jiayi Zhu, Feiyu Ji et al.
Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology
Siyuan Yan, Ming Hu, Yiwen Jiang et al.
DexVLG: Dexterous Vision-Language-Grasp Model at Scale
Jiawei He, Danshi Li, Xinqiang Yu et al.
DiffPS: Leveraging Prior Knowledge of Diffusion Model for Person Search
Giyeol Kim, Sooyoung Yang, Jihyong Oh et al.
DiffRefine: Diffusion-based Proposal Specific Point Cloud Densification for Cross-Domain Object Detection
Sangyun Shin, Yuhang He, Xinyu Hou et al.
DIMO: Diverse 3D Motion Generation for Arbitrary Objects
Linzhan Mou, Jiahui Lei, Chen Wang et al.
Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling
Qirui Wu, Denys Iliash, Daniel Ritchie et al.
Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration
Baoyou Chen, Ce Liu, Weihao Yuan et al.
Discontinuity-aware Normal Integration for Generic Central Camera Models
Francesco Milano, Manuel Lopez-Antequera, Naina Dhingra et al.
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding
Jungbin Cho, Junwan Kim, Jisoo Kim et al.
DisenQ: Disentangling Q-Former for Activity-Biometrics
Shehreen Azad, Yogesh Rawat
Disentangled Clothed Avatar Generation with Layered Representation
Weitian Zhang, Yichao Yan, Sijing Wu et al.
Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction
Luyao Tang, Kunze Huang, Yuxuan Yuan et al.
DLF: Extreme Image Compression with Dual-generative Latent Fusion
Naifu Xue, Zhaoyang Jia, Jiahao Li et al.
DreamLayer: Simultaneous Multi-Layer Generation via Diffusion Model
Junjia Huang, Pengxiang Yan, Jinhang Cai et al.
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang, Guoguang Du, Xiaochuan Li et al.
DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness
Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.
Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction
Edgar Sucar, Zihang Lai, Eldar Insafutdinov et al.
DynFaceRestore: Balancing Fidelity and Quality in Diffusion-Guided Blind Face Restoration with Dynamic Blur-Level Mapping and Guidance
Huu Phu Do, Yu-Wei Chen, Yi-Cheng Liao et al.
Edit360: 2D Image Edits to 3D Assets from Any Angle
Junchao Huang, Xinting Hu, Shaoshuai Shi et al.
EDM: Efficient Deep Feature Matching
Xi Li, Tong Rao, Cihui Pan
Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation
Shengfang Zhai, Jiajun Li, Yue Liu et al.
Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions
Tommaso Galliena, Tommaso Apicella, Stefano Rosa et al.
Embodied VideoAgent: Persistent Memory from Egocentric Videos and Embodied Sensors Enables Dynamic Scene Understanding
Yue Fan, Xiaojian Ma, Rongpeng Su et al.
Emulating Self-attention with Convolution for Efficient Image Super-Resolution
Dongheon Lee, Seokju Yun, Youngmin Ro
Enrich and Detect: Video Temporal Grounding with Multimodal LLMs
Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.
Ensemble Foreground Management for Unsupervised Object Discovery
Ziling Wu, Armaghan Moemeni, Praminda Caleb-Solly
ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning
Jongseo Lee, Kyungho Bae, Kyle Min et al.
ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness
Boqian Li, Zeyu Cai, Michael Black et al.
Evading Data Provenance in Deep Neural Networks
Hongyu Zhu, Sichu Liang, Wenwen Wang et al.
EventUPS: Uncalibrated Photometric Stereo Using an Event Camera
Jinxiu Liang, Bohan Yu, Siqi Yang et al.
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models
Haiwen Diao, Xiaotong Li, Yufeng Cui et al.
Explaining Human Preferences via Metrics for Structured 3D Reconstruction
Jack Langerman, Denis Rozumny, Yuzhong Huang et al.
Exploring View Consistency for Scene-Adaptive Low-Light Light Field Image Enhancement
Shuo Zhang, Chen Gao, Youfang Lin
Fast Globally Optimal and Geometrically Consistent 3D Shape Matching
Paul Roetzer, Florian Bernard
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Lu Liu, Huiyu Duan, Qiang Hu et al.
Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation
Shuo Jin, Siyue Yu, Bingfeng Zhang et al.
Few-Shot Pattern Detection via Template Matching and Regression
Eunchan Jo, Dahyun Kang, Sanghyun Kim et al.
Find Any Part in 3D
Ziqi Ma, Yisong Yue, Georgia Gkioxari
Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training
Qiaosi Yi, Shuai Li, Rongyuan Wu et al.
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution
Gene Chou, Wenqi Xian, Guandao Yang et al.
FlowR: Flowing from Sparse to Dense 3D Reconstructions
Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.
FPEM: Face Prior Enhanced Facial Attractiveness Prediction for Live Videos with Face Retouching
Hui Li, Xiaoyu Ren, Hongjiu Yu et al.
From Image to Video: An Empirical Study of Diffusion Representations
Pedro Vélez, Luisa Polania Cabrera, Yi Yang et al.
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling
Qiusheng Huang, Xiaohui Zhong, Xu Fan et al.
GameFactory: Creating New Games with Generative Interactive Videos
Jiwen Yu, Yiran Qin, Xintao Wang et al.
GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting
Xiaobao Wei, Peng Chen, Guangyu Li et al.
GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology
Saarthak Kapse, Pushpak Pati, Srikar Yellapragada et al.
GENMO: A GENeralist Model for Human MOtion
Jiefeng Li, Jinkun Cao, Haotian Zhang et al.