Most Cited ICCV "domain-specific datasets" Papers

2,701 papers found • Page 11 of 14

#2001

Straighten Viscous Rectified Flow via Noise Optimization

Jimin Dai, Jiexi Yan, Jian Yang et al.

ICCV 2025 • highlight • arXiv:2507.10218
#2002

CRAM: Large Scale Video Continual Learning with Bootstrapped Compression

Shivani Mall, Joao F. Henriques

ICCV 2025 • poster • arXiv:2508.05001
#2003

VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation

Shoubin Yu, Difan Liu, Ziqiao Ma et al.

ICCV 2025 • poster • arXiv:2503.14350
#2004

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

Yi Liu, Shengqian Li, Zuzeng Lin et al.

ICCV 2025 • poster • arXiv:2506.23347
#2005

Edicho: Consistent Image Editing in the Wild

Qingyan Bai, Hao Ouyang, Yinghao Xu et al.

ICCV 2025 • poster • arXiv:2412.21079
#2006

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025 • poster • arXiv:2508.10407
#2007

IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Khaled Abud, Sergey Lavrushkin, Alexey Kirillov et al.

ICCV 2025 • highlight • arXiv:2412.01794
#2008

Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion

Jiwon Kim, Pureum Kim, SeonHwa Kim et al.

ICCV 2025 • poster • arXiv:2508.09575
#2009

Anti-Tamper Protection for Unauthorized Individual Image Generation

Zelin Li, Ruohan Zong, Yifan Liu et al.

ICCV 2025 • poster • arXiv:2508.06325
#2010

Continual Personalization for Diffusion Models

Yu-Chien Liao, Jr-Jen Chen, Chi-Pin Huang et al.

ICCV 2025 • poster • arXiv:2510.02296
#2011

Global and Local Entailment Learning for Natural World Imagery

Srikumar Sastry, Aayush Dhakal, Eric Xing et al.

ICCV 2025 • poster • arXiv:2506.21476
#2012

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Zhu Xu, Ting Lei, Zhimin Li et al.

ICCV 2025 • poster • arXiv:2508.04943
#2013

Magic Insert: Style-Aware Drag-and-Drop

Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa et al.

ICCV 2025 • highlight • arXiv:2407.02489
#2014

CharaConsist: Fine-Grained Consistent Character Generation

Mengyu Wang, Henghui Ding, Jianing Peng et al.

ICCV 2025 • poster • arXiv:2507.11533
#2015

Beyond Perspective: Neural 360-Degree Video Compression

Andy Regensky, Marc Windsheimer, Fabian Brand et al.

ICCV 2025 • poster
#2016

One-Step Specular Highlight Removal with Adapted Diffusion Models

Mahir Atmis, LEVENT KARACAN, Mehmet SARIGÜL

ICCV 2025 • poster
#2017

DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting

Jingyi Pan, Dan Xu, Qiong Luo

ICCV 2025 • poster • arXiv:2507.00429
#2018

From Linearity to Non-Linearity: How Masked Autoencoders Capture Spatial Correlations

Anthony Bisulco, Rahul Ramesh, Randall Balestriero et al.

ICCV 2025 • poster • arXiv:2508.15404
#2019

FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Minghan LI, Chenxi Xie, Yichen Wu et al.

ICCV 2025 • poster
#2020

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

Bowen Fu, Wei Wei, Jiaqi Tang et al.

ICCV 2025 • poster
#2021

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Haonan Qiu, Shiwei Zhang, Yujie Wei et al.

ICCV 2025 • poster • arXiv:2412.09626
#2022

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

Runtao Liu, I Chen, Jindong Gu et al.

ICCV 2025 • poster
#2023

FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models

Yuxuan Wang, Tianwei Cao, Huayu Zhang et al.

ICCV 2025 • poster • arXiv:2507.02714
#2024

Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles

Eric Slyman, Mehrab Tanjim, Kushal Kafle et al.

ICCV 2025 • poster • arXiv:2509.08777
#2025

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025 • poster • arXiv:2510.14230
#2026

Streamlining Image Editing with Layered Diffusion Brushes

Peyman Gholami, Robert Xiao

ICCV 2025 • poster • arXiv:2405.00313
#2027

ArtEditor: Learning Customized Instructional Image Editor from Few-Shot Examples

Shijie Huang, Yiren Song, Yuxuan Zhang et al.

ICCV 2025 • poster
#2028

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

JUNHAO WEI, YU ZHE, Jun Sakuma

ICCV 2025 • poster • arXiv:2503.07661
#2029

A3GS: Arbitrary Artistic Style into Arbitrary 3D Gaussian Splatting

Zhiyuan Fang, Rengan Xie, Xuancheng Jin et al.

ICCV 2025 • poster
#2030

HouseTour: A Virtual Real Estate A(I)gent

Ata Çelen, Iro Armeni, Daniel Barath et al.

ICCV 2025 • poster • arXiv:2510.18054
#2031

Free2Guide: Training-Free Text-to-Video Alignment using Image LVLM

Jaemin Kim, Bryan Sangwoo Kim, Jong Ye

ICCV 2025 • poster
#2032

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE

Yazhou Xing, Yang Fei, Yingqing He et al.

ICCV 2025 • poster
#2033

DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models

SeungHoo Hong, GeonHo Son, Juhun Lee et al.

ICCV 2025 • poster • arXiv:2510.00778
#2034

GFPack++: Attention-Driven Gradient Fields for Optimizing 2D Irregular Packing

Tianyang Xue, Lin Lu, Yang Liu et al.

ICCV 2025 • highlight
#2035

Preserve Anything: Controllable Image Synthesis with Object Preservation

Prasen Kumar Sharma, Neeraj Matiyali, Siddharth Srivastava et al.

ICCV 2025 • poster • arXiv:2506.22531
#2036

Anchor Token Matching: Implicit Structure Locking for Training-free AR Image Editing

Taihang Hu, Linxuan Li, Kai Wang et al.

ICCV 2025 • poster • arXiv:2504.10434
#2037

EEGMirror: Leveraging EEG data in the wild via Montage-Agnostic Self-Supervision for EEG to Video Decoding

Xuan-Hao Liu, Bao-liang Lu, Wei-Long Zheng

ICCV 2025 • poster
#2038

Accelerating Diffusion Sampling via Exploiting Local Transition Coherence

shangwen zhu, Han Zhang, Zhantao Yang et al.

ICCV 2025 • poster • arXiv:2503.09675
#2039

SA-LUT: Spatial Adaptive 4D Look-Up Table for Photorealistic Style Transfer

Zerui Gong, Zhonghua Wu, Qingyi Tao et al.

ICCV 2025 • poster • arXiv:2506.13465
#2040

UniGlyph: Unified Segmentation-Conditioned Diffusion for Precise Visual Text Synthesis

Yuanrui Wang, Cong Han, Yafei Li et al.

ICCV 2025 • poster • arXiv:2507.00992
#2041

Semantic Discrepancy-aware Detector for Image Forgery Identification

Wang Ziye, Minghang Yu, Chunyan Xu et al.

ICCV 2025 • poster • arXiv:2508.12341
#2042

REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder

Yitian Zhang, Long Mai, Aniruddha Mahapatra et al.

ICCV 2025 • poster • arXiv:2503.08665
#2043

Gain-MLP: Improving HDR Gain Map Encoding via a Lightweight MLP

Trevor Canham, SaiKiran Tedla, Michael Murdoch et al.

ICCV 2025 • poster • arXiv:2503.11883
#2044

Less-to-More Generalization: Unlocking More Controllability by In-Context Generation

shaojin wu, Mengqi Huang, wenxu wu et al.

ICCV 2025 • poster • arXiv:2504.02160
#2045

FlexGen: Flexible Multi-View Generation from Text and Image Inputs

Xinli Xu, Wenhang Ge, Jiantao Lin et al.

ICCV 2025 • poster • arXiv:2410.10745
#2046

Teleportraits: Training-Free People Insertion into Any Scene

Jialu Gao, Joseph K J, Fernando De la Torre

ICCV 2025 • poster • arXiv:2510.05660
#2047

DiT4SR: Taming Diffusion Transformer for Real-World Image Super-Resolution

Zheng-Peng Duan, jiawei zhang, Xin Jin et al.

ICCV 2025 • poster • arXiv:2503.23580
#2048

QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing

Tiancheng SHEN, Jun Hao Liew, Zilong Huang et al.

ICCV 2025 • poster
#2049

Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Rongkun Xue, Jinouwen Zhang, Yazhe Niu et al.

ICCV 2025 • poster • arXiv:2412.01787
#2050

Beyond Brain Decoding: Visual-Semantic Reconstructions to Mental Creation Extension Based on fMRI

Haodong Jing, Dongyao Jiang, Yongqiang Ma et al.

ICCV 2025 • poster
#2051

ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement

KA WONG, Jicheng Zhou, Haiwei Wu et al.

ICCV 2025 • poster • arXiv:2507.16397
#2052

Generative Video Bi-flow

Chen Liu, Tobias Ritschel

ICCV 2025 • poster • arXiv:2503.06364
#2053

AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation

Moayed Haji-Ali, Willi Menapace, Aliaksandr Siarohin et al.

ICCV 2025 • poster • arXiv:2412.15191
#2054

JPEG Processing Neural Operator for Backward-Compatible Coding

Woo Kyoung Han, Yongjun Lee, Byeonghun Lee et al.

ICCV 2025 • poster • arXiv:2507.23521
#2055

All Parts Matter: A Unified Mask-Free Virtual Try-On Framework

Chenghu Du, Shengwu Xiong, Yi Rong

ICCV 2025 • poster
#2056

DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space

Junyu Chen, Dongyun Zou, Wenkun He et al.

ICCV 2025 • poster • arXiv:2508.00413
#2057

MH-LVC: Multi-Hypothesis Temporal Prediction for Learned Conditional Residual Video Coding

Gao Zong lin, Huu-Tai Phung, Yi-Chen Yao et al.

ICCV 2025 • poster • arXiv:2510.12479
#2058

An Efficient Hybrid Vision Transformer for TinyML Applications

Fanhong Zeng, Huanan LI, Juntao Guan et al.

ICCV 2025 • poster
#2059

Graph Domain Adaptation with Dual-branch Encoder and Two-level Alignment for Whole Slide Image-based Survival Prediction

Yuntao Shou, Xiangyong Cao, Peiqiang Yan et al.

ICCV 2025 • poster • arXiv:2411.14001
#2060

Multi-Schema Proximity Network for Composed Image Retrieval

Jiangming Shi, Xiangbo Yin, yeyunchen et al.

ICCV 2025 • poster
#2061

Moment Quantization for Video Temporal Grounding

Xiaolong Sun, Le Wang, Sanping Zhou et al.

ICCV 2025 • poster • arXiv:2504.02286
#2062

SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition

Yongkun Du, Zhineng Chen, Hongtao Xie et al.

ICCV 2025 • poster • arXiv:2411.15858
#2063

ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation

Cihang Peng, Qiming HOU, Zhong Ren et al.

ICCV 2025 • poster • arXiv:2508.01008
#2064

Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation

Shuo Jin, Siyue Yu, Bingfeng Zhang et al.

ICCV 2025 • highlight
#2065

DiffPS: Leveraging Prior Knowledge of Diffusion Model for Person Search

Giyeol Kim, Sooyoung Yang, Jihyong Oh et al.

ICCV 2025 • highlight
#2066

LaCoOT: Layer Collapse through Optimal Transport

Victor Quétu, Zhu LIAO, Nour Hezbri et al.

ICCV 2025 • poster • arXiv:2406.08933
#2067

Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang et al.

ICCV 2025 • poster • arXiv:2507.21703
#2068

Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding

Yuanhan Zhang, Yunice Chew, Yuhao Dong et al.

ICCV 2025 • poster • arXiv:2507.15028
#2069

Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement

Xin Shen, Xinyu Wang, Lei Shen et al.

ICCV 2025 • poster
#2070

Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction

Zeren Jiang, Chuanxia Zheng, Iro Laina et al.

ICCV 2025 • highlight • arXiv:2504.07961
#2071

On the Recovery of Cameras from Fundamental Matrices

Rakshith Madhavan, Federica Arrigoni

ICCV 2025 • highlight
#2072

The Devil is in the Spurious Correlations: Boosting Moment Retrieval with Dynamic Learning

Xinyang Zhou, Fanyue Wei, Lixin Duan et al.

ICCV 2025 • poster • arXiv:2501.07305
#2073

SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images

Yichi Zhang, Le Xue, Wenbo zhang et al.

ICCV 2025 • poster • arXiv:2502.14351
#2074

Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing

Jeongmin Yu, Susang Kim, Kisu Lee et al.

ICCV 2025 • poster • arXiv:2509.06336
#2075

4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding

Wenxuan Zhu, Bing Li, Cheng Zheng et al.

ICCV 2025 • poster • arXiv:2503.17827
#2076

SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images

Shuhang Chen, Hangjie Yuan, Pengwei Liu et al.

ICCV 2025 • poster • arXiv:2511.08626
#2077

Text-guided Visual Prompt DINO for Generic Segmentation

Yuchen Guan, Chong Sun, Canmiao Fu et al.

ICCV 2025 • poster • arXiv:2508.06146
#2078

STDDNet: Harnessing Mamba for Video Polyp Segmentation via Spatial-aligned Temporal Modeling and Discriminative Dynamic Representation Learning

Guilian Chen, Huisi Wu, Jing Qin

ICCV 2025 • poster
#2079

MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs

Jiawei Mao, Yuhan Wang, Yucheng Tang et al.

ICCV 2025 • poster • arXiv:2504.06897
#2080

Few-Shot Pattern Detection via Template Matching and Regression

Eunchan Jo, Dahyun Kang, Sanghyun Kim et al.

ICCV 2025 • highlight • arXiv:2508.17636
#2081

Exploring Probabilistic Modeling Beyond Domain Generalization for Semantic Segmentation

I-Hsiang Chen, Hua-En Chang, Wei-Ting Chen et al.

ICCV 2025 • poster • arXiv:2507.21367
#2082

Bridging the Gap between Brain and Machine in Interpreting Visual Semantics: Towards Self-adaptive Brain-to-Text Decoding

Jiaxuan Chen, Yu Qi, Yueming Wang et al.

ICCV 2025 • poster
#2083

DisTime: Distribution-based Time Representation for Video Large Language Models

yingsen zeng, Zepeng Huang, Yujie Zhong et al.

ICCV 2025 • poster • arXiv:2505.24329
#2084

WeaveSeg: Iterative Contrast-weaving and Spectral Feature-refining for Nuclei Instance Segmentation

Jiajia Li, Huisi Wu, Jing Qin

ICCV 2025 • highlight
#2085

CARIM: Caption-Based Autonomous Driving Scene Retrieval via Inclusive Text Matching

Minjoo Ki, Dae Jung Kim, Kisung Kim et al.

ICCV 2025 • poster
#2086

Modeling Saliency Dataset Bias

Matthias Kümmerer, Harneet Singh Khanuja, Matthias Bethge

ICCV 2025 • highlight • arXiv:2505.10169
#2087

Advancing Visual Large Language Model for Multi-granular Versatile Perception

Wentao Xiang, Haoxian Tan, Cong Wei et al.

ICCV 2025 • poster • arXiv:2507.16213
#2088

Prompt-driven Transferable Adversarial Attack on Person Re-Identification with Attribute-aware Textual Inversion

Yuan Bian, Min Liu, Yunqi Yi et al.

ICCV 2025 • poster • arXiv:2502.19697
#2089

DIH-CLIP: Unleashing the Diversity of Multi-Head Self-Attention for Training-Free Open-Vocabulary Semantic Segmentation

Songsong Duan, Xi Yang, Nannan Wang

ICCV 2025 • poster
#2090

Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration

Mark Endo, Xiaohan Wang, Serena Yeung-Levy

ICCV 2025 • poster • arXiv:2412.13180
#2091

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Yuzhang Shang, Mu Cai, Bingxin Xu et al.

ICCV 2025 • poster • arXiv:2403.15388
#2092

HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding?

Yusen Zhang, Wenliang Zheng, Aashrith Madasu et al.

ICCV 2025 • poster • arXiv:2504.18406
#2093

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Yufei Zhan, Shurong Zheng, Yousong Zhu et al.

ICCV 2025 • poster • arXiv:2403.09333
#2094

ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang et al.

ICCV 2025 • poster • arXiv:2410.23287
#2095

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

Yucheng Suo, Fan Ma, Linchao Zhu et al.

ICCV 2025 • poster • arXiv:2503.20472
#2096

VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification

Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.

ICCV 2025 • poster
#2097

HarmonySeg: Tubular Structure Segmentation with Deep-Shallow Feature Fusion and Growth-Suppression Balanced Loss

Ke Zhang, Yi Huang, Wei Liu et al.

ICCV 2025 • poster • arXiv:2504.07827
#2098

SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting

Shuaiting Li, Juncan Deng, Chengxuan Wang et al.

ICCV 2025 • poster • arXiv:2503.08668
#2099

VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization

Xinye Cao, Hongcan Guo, Jiawen Qian et al.

ICCV 2025 • poster • arXiv:2510.06040
#2100

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

Qi Chen, Xinze Zhou, Chen Liu et al.

ICCV 2025 • poster • arXiv:2510.14831
#2101

Region-aware Anchoring Mechanism for Efficient Referring Visual Grounding

Shuyi Ouyang, Ziwei Niu, Hongyi Wang et al.

ICCV 2025 • poster
#2102

MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation

Bin Xie, Hao Tang, Bin Duan et al.

ICCV 2025 • poster
#2103

MEH: A Multi-Style Dataset and Toolkit for Advancing Egyptian Hieroglyph Recognition

Maksim Golyadkin, Rubanova Alexandrovna, Aleksandr Utkov et al.

ICCV 2025 • poster
#2104

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

Zhuqiang Lu, Zhenfei Yin, Mengwei He et al.

ICCV 2025 • poster • arXiv:2412.09919
#2105

DiffTell: A High-Quality Dataset for Describing Image Manipulation Changes

Zonglin Di, Jing Shi, Yifei Fan et al.

ICCV 2025 • poster
#2106

HyperGCT: A Dynamic Hyper-GNN-Learned Geometric Constraint for 3D Registration

Xiyu Zhang, Jiayi Ma, Jianwei Guo et al.

ICCV 2025 • poster • arXiv:2503.02195
#2107

AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving

Jiawei Xu, Kai Deng, Zexin Fan et al.

ICCV 2025 • poster • arXiv:2507.12137
#2108

PossLoss: A Reliable and Sensitive Facial Landmark Detection Loss Function

Qikui Zhu

ICCV 2025 • poster
#2109

RESCUE: Crowd Evacuation Simulation via Controlling SDM-United Characters

Xiaolin Liu, Tianyi zhou, Hongbo Kang et al.

ICCV 2025 • highlight • arXiv:2507.20117
#2110

SG-LDM: Semantic-Guided LiDAR Generation via Latent-Aligned Diffusion

Zhengkang Xiang, Zizhao Li, Amir Khodabandeh et al.

ICCV 2025 • poster • arXiv:2506.23606
#2111

PointGAC: Geometric-Aware Codebook for Masked Point Modeling

Abiao Li, Chenlei Lv, Guofeng Mei et al.

ICCV 2025 • poster
#2112

PRM: Photometric Stereo based Large Reconstruction Model

Wenhang Ge, Jiantao Lin, Guibao SHEN et al.

ICCV 2025 • highlight • arXiv:2412.07371
#2113

RobuSTereo: Robust Zero-Shot Stereo Matching under Adverse Weather

Yuran Wang, Yingping Liang, Yutao Hu et al.

ICCV 2025 • poster • arXiv:2507.01653
#2114

Gaussian-based World Model: Gaussian Priors for Voxel-Based Occupancy Prediction and Future Motion Prediction

Tuo Feng, Wenguan Wang, Yi Yang

ICCV 2025 • poster
#2115

Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction

JIXUAN FAN, Wanhua Li, Yifei Han et al.

ICCV 2025 • poster • arXiv:2412.04887
#2116

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

Haonan Han, Rui Yang, Huan Liao et al.

ICCV 2025 • poster • arXiv:2405.18525
#2117

Towards Safer and Understandable Driver Intention Prediction

Mukilan Karuppasamy, Shankar Gangisetty, Shyam Nandan Rai et al.

ICCV 2025 • poster • arXiv:2510.09200
#2118

High-Precision 3D Measurement of Complex Textured Surfaces Using Multiple Filtering Approach

Yuchong Chen, Jian Yu, Shaoyan Gai et al.

ICCV 2025 • poster
#2119

Resonance: Learning to Predict Social-Aware Pedestrian Trajectories as Co-Vibrations

Conghao Wong, Ziqian Zou, Beihao Xia

ICCV 2025 • poster • arXiv:2412.02447
#2120

InsideOut: Integrated RGB-Radiative Gaussian Splatting for Comprehensive 3D Object Representation

Jungmin Lee, Seonghyuk Hong, Juyong Lee et al.

ICCV 2025 • poster • arXiv:2510.17864
#2121

CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving

Changxing Liu, Genjia Liu, Zijun Wang et al.

ICCV 2025 • poster • arXiv:2503.08683
#2122

Mitigating Geometric Degradation in Fast DownSampling via FastAdapter for Point Cloud Segmentation

Shuofeng Sun, Haibin Yan

ICCV 2025 • poster
#2123

DoppDrive: Doppler-Driven Temporal Aggregation for Improved Radar Object Detection

Yuval Haitman, Oded Bialer

ICCV 2025 • poster • arXiv:2508.12330
#2124

MDP-Omni: Parameter-free Multimodal Depth Prior-based Sampling for Omnidirectional Stereo Matching

Eunjin Son, HyungGi Jo, Wookyong Kwon et al.

ICCV 2025 • poster
#2125

EDM: Efficient Deep Feature Matching

Xi Li, Tong Rao, Cihui Pan

ICCV 2025 • highlight • arXiv:2503.05122
#2126

Occupancy Learning with Spatiotemporal Memory

Ziyang Leng, Jiawei Yang, Wenlong Yi et al.

ICCV 2025 • poster • arXiv:2508.04705
#2127

ACE-G: Improving Generalization of Scene Coordinate Regression Through Query Pre-Training

Leonard Bruns, Axel Barroso-Laguna, Tommaso Cavallari et al.

ICCV 2025 • poster • arXiv:2510.11605
#2128

Towards Visual Localization Interoperability: Cross-Feature for Collaborative Visual Localization and Mapping

Alberto Jaenal, Paula Carbó Cubero, Jose Araujo et al.

ICCV 2025 • poster
#2129

Explaining Human Preferences via Metrics for Structured 3D Reconstruction

Jack Langerman, Denis Rozumny, Yuzhong Huang et al.

ICCV 2025 • highlight • arXiv:2503.08208
#2130

Inverse 3D Microscopy Rendering for Cell Shape Inference with Active Mesh

Sacha Ichbiah, Anshuman Sinha, Fabrice Delbary et al.

ICCV 2025 • highlight • arXiv:2303.10440
#2131

Bridging 3D Anomaly Localization and Repair via High-Quality Continuous Geometric Representation

Bozhong Zheng, Jinye Gan, Xiaohao Xu et al.

ICCV 2025 • poster • arXiv:2505.24431
#2132

SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching

Xiangzeng Liu, CHI WANG, Guanglu Shi et al.

ICCV 2025 • highlight • arXiv:2508.02278
#2133

Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors

Katja Schwarz, Norman Müller, Peter Kontschieder

ICCV 2025 • poster • arXiv:2503.13272
#2134

Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction

Zhirui Gao, Renjiao Yi, YaQiao Dai et al.

ICCV 2025 • poster • arXiv:2506.21401
#2135

Tree Skeletonization from 3D Point Clouds by Denoising Diffusion

Elias Marks, Lucas Nunes, Federico Magistri et al.

ICCV 2025 • poster
#2136

Neural Inverse Rendering for High-Accuracy 3D Measurement of Moving Objects with Fewer Phase-Shifting Patterns

Yuki Urakawa, Yoshihiro Watanabe

ICCV 2025 • poster
#2137

When Anchors Meet Cold Diffusion: A Multi-Stage Approach to Lane Detection

Bo-Lun Huang, Tzu-Hsiang Ni, Feng-Kai Huang et al.

ICCV 2025 • poster
#2138

Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Tongyan Hua, Lutao Jiang, Ying-Cong Chen et al.

ICCV 2025 • poster • arXiv:2507.04403
#2139

NeuFrameQ: Neural Frame Fields for Scalable and Generalizable Anisotropic Quadrangulation

Ying-Tian Liu, Jiajun Li, Yu-Tao Liu et al.

ICCV 2025 • highlight
#2140

Controllable 3D Outdoor Scene Generation via Scene Graphs

Yuheng Liu, Xinke Li, Yuning Zhang et al.

ICCV 2025 • poster • arXiv:2503.07152
#2141

PolGS: Polarimetric Gaussian Splatting for Fast Reflective Surface Reconstruction

Yufei Han, Bowen Tie, Heng Guo et al.

ICCV 2025 • poster • arXiv:2509.19726
#2142

Driving View Synthesis on Free-form Trajectories with Generative Prior

Zeyu Yang, Zijie Pan, Yuankun Yang et al.

ICCV 2025 • poster • arXiv:2412.01717
#2143

CVFusion: Cross-View Fusion of 4D Radar and Camera for 3D Object Detection

Hanzhi Zhong, Zhiyu Xiang, Ruoyu Xu et al.

ICCV 2025 • poster • arXiv:2507.04587
#2144

MAESTRO: Task-Relevant Optimization via Adaptive Feature Enhancement and Suppression for Multi-task 3D Perception

ChangWon Kang, Jisong Kim, Hongjae Shin et al.

ICCV 2025 • poster • arXiv:2509.17462
#2145

Joint Semantic and Rendering Enhancements in 3D Gaussian Modeling with Anisotropic Local Encoding

Jingming He, Chongyi Li, Shiqi Wang et al.

ICCV 2025 • poster • arXiv:2601.02339
#2146

DCHM: Depth-Consistent Human Modeling for Multiview Detection

Jiahao Ma, Tianyu Wang, Miaomiao Liu et al.

ICCV 2025 • poster • arXiv:2507.14505
#2147

V2XScenes: A Multiple Challenging Traffic Conditions Dataset for Large-Range Vehicle-Infrastructure Collaborative Perception

Bowen Wang, Yafei Wang, Wei Gong et al.

ICCV 2025 • poster
#2148

Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis

Junyan Ye, Jun He, Weijia Li et al.

ICCV 2025 • poster • arXiv:2408.01812
#2149

EMD: Explicit Motion Modeling for High-Quality Street Gaussian Splatting

Xiaobao Wei, Qingpo Wuwu, Zhongyu Zhao et al.

ICCV 2025 • poster • arXiv:2411.15582
#2150

Interaction-Merged Motion Planning: Effectively Leveraging Diverse Motion Datasets for Robust Planning

Giwon Lee, Wooseong Jeong, Daehee Park et al.

ICCV 2025 • highlight • arXiv:2507.04790
#2151

Communication-Efficient Multi-Vehicle Collaborative Semantic Segmentation via Sparse 3D Gaussian Sharing

Tianyu Hong, Xiaobo Zhou, Wenkai Hu et al.

ICCV 2025 • poster
#2152

DATA: Domain-And-Time Alignment for High-Quality Feature Fusion in Collaborative Perception

Chengchang Tian, Jianwei Ma, Yan Huang et al.

ICCV 2025 • poster • arXiv:2507.18237
#2153

Heatmap Regression without Soft-Argmax for Facial Landmark Detection

Chiao-An Yang, Raymond A. Yeh

ICCV 2025 • poster • arXiv:2508.14929
#2154

Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration

Katie Luo, Minh-Quan Dao, Zhenzhen Liu et al.

ICCV 2025 • poster • arXiv:2502.14156
#2155

Puzzle Similarity: A Perceptually-guided Cross-Reference Metric for Artifact Detection in 3D Scene Reconstructions

Nicolai Hermann, Jorge Condor, Piotr Didyk

ICCV 2025 • poster • arXiv:2411.17489
#2156

Authentic 4D Driving Simulation with a Video Generation Model

Lening Wang, Wenzhao Zheng, Dalong Du et al.

ICCV 2025 • poster
#2157

Spherical Epipolar Rectification for Deep Two-View Absolute Depth Estimation

Pierre-André Brousseau, Sébastien Roy

ICCV 2025 • poster
#2158

Leveraging 2D Priors and SDF Guidance for Urban Scene Rendering

Siddharth Tourani, Jayaram Reddy, Akash Kumbar et al.

ICCV 2025 • poster
#2159

Super Resolved Imaging with Adaptive Optics

Robin Swanson, Esther Y. H. Lin, Masen Lamb et al.

ICCV 2025 • highlight • arXiv:2508.04648
#2160

Knowledge Distillation for Learned Image Compression

Yunuo Chen, Zezheng Lyu, Bing He et al.

ICCV 2025 • poster
#2161

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Jianhong Bai, Menghan Xia, Xiao Fu et al.

ICCV 2025 • poster • arXiv:2503.11647
#2162

Diving into the Fusion of Monocular Priors for Generalized Stereo Matching

Chengtang Yao, Lidong Yu, Zhidan Liu et al.

ICCV 2025 • poster • arXiv:2505.14414
#2163

ROAR: Reducing Inversion Error in Generative Image Watermarking

Hanyi Wang, Han Fang, Shi-Lin Wang et al.

ICCV 2025 • poster
#2164

Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability

Seungju Yoo, Hyuk Kwon, Joong-Won Hwang et al.

ICCV 2025 • poster • arXiv:2508.12082
#2165

LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing

Federico Girella, Davide Talon, Ziyue Liu et al.

ICCV 2025 • poster • arXiv:2507.22627
#2166

FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas et al.

ICCV 2025 • poster • arXiv:2412.08629
#2167

Event-based Visual Vibrometry

Xinyu Zhou, Peiqi Duan, Yeliduosi Xiaokaiti et al.

ICCV 2025 • poster
#2168

ObjectRelator: Enabling Cross-View Object Relation Understanding Across Ego-Centric and Exo-Centric Perspectives

Yuqian Fu, Runze Wang, Bin Ren et al.

ICCV 2025 • highlight • arXiv:2411.19083
#2169

Scaling Transformer-Based Novel View Synthesis Models with Token Disentanglement and Synthetic Data

Nithin Gopalakrishnan Nair, Srinivas Kaza, Xuan Luo et al.

ICCV 2025 • poster
#2170

DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving

Chen Shi, Shaoshuai Shi, Kehua Sheng et al.

ICCV 2025 • poster • arXiv:2505.19239
#2171

MamV2XCalib: V2X-based Target-less Infrastructure Camera Calibration with State Space Model

Yaoye Zhu, Zhe Wang, Yan Wang

ICCV 2025 • poster • arXiv:2507.23595
#2172

PARTE: Part-Guided Texturing for 3D Human Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Gyeongsik Moon et al.

ICCV 2025 • poster • arXiv:2507.17332
#2173

Boosting MLLM Reasoning with Text-Debiased Hint-GRPO

Qihan Huang, Weilong Dai, Jinlong Liu et al.

ICCV 2025 • poster • arXiv:2503.23905
#2174

AirCache: Activating Inter-modal Relevancy KV Cache Compression for Efficient Large Vision-Language Model Inference

Kai Huang, hao zou, Bochen Wang et al.

ICCV 2025 • poster • arXiv:2503.23956
#2175

FlowStyler: Artistic Video Stylization via Transformation Fields Transports

YuNing Gong, Jiaming Chen, Xiaohua Ren et al.

ICCV 2025 • poster
#2176

ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer

Jin Hu, Mingjia Li, Xiaojie Guo

ICCV 2025 • poster • arXiv:2412.02545
#2177

Toward Fair and Accurate Cross-Domain Medical Image Segmentation: A VLM-Driven Active Domain Adaptation Paradigm

Hongqiu Wang, Wu Chen, Xiangde Luo et al.

ICCV 2025 • poster
#2178

Decouple to Reconstruct: High Quality UHD Restoration via Active Feature Disentanglement and Reversible Fusion

Yidi Liu, Dong Li, Yuxin Ma et al.

ICCV 2025 • poster • arXiv:2503.12764
#2179

BlueNeg: A 35mm Negative Film Dataset for Restoring Channel-Heterogeneous Deterioration

Hanyuan Liu, Chengze Li, Minshan Xie et al.

ICCV 2025 • poster
#2180

Rethinking Key-frame-based Micro-expression Recognition: A Robust and Accurate Framework Against Key-frame Errors

Zheyuan Zhang, Weihao Tang, Hong Chen

ICCV 2025 • highlight • arXiv:2508.06640
#2181

What we need is explicit controllability: Training 3D gaze estimator using only facial images

Tingwei Li, Jun Bao, Zhenzhong Kuang et al.

ICCV 2025 • poster
#2182

SemiVisBooster: Boosting Semi-Supervised Learning for Fine-Grained Classification through Pseudo-Label Semantic Guidance

Wenjin Zhang, Xinyu Li, Chenyang Gao et al.

ICCV 2025 • poster
#2183

Enhancing Prompt Generation with Adaptive Refinement for Camouflaged Object Detection

Xuehan Chen, Guangyu Ren, Tianhong Dai et al.

ICCV 2025 • poster
#2184

DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.

ICCV 2025 • highlight • arXiv:2503.22677
#2185

EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds

Lu Chen, Yizhou Wang, SHIXIANG TANG et al.

ICCV 2025 • poster • arXiv:2502.05857
#2186

SIC: Similarity-Based Interpretable Image Classification with Neural Networks

Tom Nuno Wolf, Emre Kavak, Fabian Bongratz et al.

ICCV 2025 • poster • arXiv:2501.17328
#2187

MambaML: Exploring State Space Models for Multi-Label Image Classification

Xuelin Zhu, Jian liu, Jiuxin Cao et al.

ICCV 2025 • poster
#2188

SEAL: Semantic Aware Image Watermarking

Kasra Arabi, R. Teal Witter, Chinmay Hegde et al.

ICCV 2025 • poster • arXiv:2503.12172
#2189

Unsupervised Identification of Protein Compositions and Conformations via Implicit Content-Transformation Disentanglement

Mostofa Rafid Uddin, Jana Armouti, Min Xu

ICCV 2025 • poster
#2190

AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction

Xuying Zhang, Yupeng Zhou, Kai Wang et al.

ICCV 2025 • poster
#2191

Memory-Efficient Generative Models via Product Quantization

Jie Shao, Hanxiao Zhang, Hao Yu et al.

ICCV 2025 • poster
#2192

Rethinking Discrete Tokens: Treating Them as Conditions for Continuous Autoregressive Image Synthesis

Peng Zheng, Junke Wang, Yi Chang et al.

ICCV 2025 • poster • arXiv:2507.01756
#2193

CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Feixiang Wang, Shuang Yang, Shiguang Shan et al.

ICCV 2025 • poster
#2194

EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

Haokai Zhu, Bo Qu, Si-Yuan Cao et al.

ICCV 2025 • poster • arXiv:2509.07662
#2195

Leveraging Debiased Cross-modal Attention Maps and Code-based Reasoning for Zero-shot Referring Expression Comprehension

Juntao Chen, Wen Shen, Zhihua Wei et al.

ICCV 2025 • poster
#2196

UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling

Peiming Li, Ziyi Wang, Yulin Yuan et al.

ICCV 2025 • poster • arXiv:2508.14604
#2197

Automated Red Teaming for Text-to-Image Models through Feedback-Guided Prompt Iteration with Vision-Language Models

Wei Xu, Kangjie Chen, Jiawei Qiu et al.

ICCV 2025 • poster
#2198

Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation

Zhenhua Ning, Zhuotao Tian, Shaoshuai Shi et al.

ICCV 2025 • poster • arXiv:2506.23120
#2199

BézierGS: Dynamic Urban Scene Reconstruction with Bézier Curve Gaussian Splatting

Zipei Ma, Junzhe Jiang, Yurui Chen et al.

ICCV 2025 • poster • arXiv:2506.22099
#2200

CLIPSym: Delving into Symmetry Detection with CLIP

Tinghan Yang, Md Ashiqur Rahman, Raymond A. Yeh

ICCV 2025 • poster • arXiv:2508.14197