Most Cited 2025 &quot;uncertainty-aware exploration&quot; Papers

#15802

Vertical Federated Feature Screening

Huajun Yin, Liyuan Wang, Yingqiu Zhu et al.

NEURIPS 2025arXiv:2508.19352

#15803

GenColor: Generative and Expressive Color Enhancement with Pixel-Perfect Texture Preservation

Yi Dong, Yuxi Wang, Xianhui Lin et al.

NEURIPS 2025spotlight

#15804

Low-degree evidence for computational transition of recovery rate in stochastic block model

Jingqiu Ding, Yiding Hua, Lucas Slot et al.

NEURIPS 2025spotlight

#15805

Memorization in Graph Neural Networks

Adarsh Jamadandi, Jing Xu, Adam Dziedzic et al.

#15806

ThermalGen: Style-Disentangled Flow-Based Generative Models for RGB-to-Thermal Image Translation

Jiuhong Xiao, Roshan Nayak, Ning Zhang et al.

NEURIPS 2025arXiv:2509.24878

#15807

Continuous Q-Score Matching: Diffusion Guided Reinforcement Learning for Continuous-Time Control

Chengxiu HUA, Jiawen Gu, Yushun Tang

NEURIPS 2025arXiv:2510.17122

#15808

From Kolmogorov to Cauchy: Shallow XNet Surpasses KANs

Xin Li, Xiaotao Zheng, Zhihong Xia

NEURIPS 2025oralarXiv:2510.25138

#15809

Learning Spatial-Aware Manipulation Ordering

Yuxiang Yan, Zhiyuan Zhou, Xin Gao et al.

#15810

MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks

WANTONG XIE, Yi-Xiang Hu, Jieyang Xu et al.

NEURIPS 2025arXiv:2510.05527

#15811

Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model

Yuyao Wang, Yu-Hung Cheng, Debarghya Mukherjee et al.

#15812

Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding

Pedro Hermosilla, Christian Stippel, Leon Sick

CVPR 2025arXiv:2504.06719

#15813

Streaming Federated Learning with Markovian Data

Khiem HUYNH, Malcolm Egan, Giovanni Neglia et al.

NEURIPS 2025arXiv:2503.18807

#15814

Efficient Verified Unlearning For Distillation

Yijun Quan, Zushu Li, Giovanni Montana

ICCV 2025arXiv:2502.06476

#15815

Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution

Vlad Hosu, Lorenzo Agnolucci, Daisuke Iso et al.

#15816

Neuromanifold-Regularized KANs for Shape-fair Feature Representations

Mazlum Arslan, Weihong Guo, Shuo Li

#15817

Less Static, More Private: Towards Transferable Privacy-Preserving Action Recognition by Generative Decoupled Learning

Zhi-Wei Xia, Kun-Yu Lin, Yuan-Ming Li et al.

#15818

AdaDCP: Learning an Adapter with Discrete Cosine Prior for Clear-to-Adverse Domain Generalization

Qi Bi, Yixian Shen, Jingjun Yi et al.

#15819

MorphoGen: Efficient Unconditional Generation of Long-Range Projection Neuronal Morphology via a Global-to-Local Framework

Tianfang Zhu, Hongyang Zhou, Anan LI

#15820

GaussianSpeech: Audio-Driven Personalized 3D Gaussian Avatars

Shivangi Aneja, Artem Sevastopolsky, Tobias Kirschstein et al.

ICCV 2025arXiv:2510.17181

#15821

Capturing head avatar with hand contacts from a monocular video

Haonan He, Yufeng Zheng, Jie Song

#15822

MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation

Sungwoo Cho, Jeongsoo Choi, Sungnyun Kim et al.

ICCV 2025arXiv:2503.11026

#15823

Privacy-centric Deep Motion Retargeting for Anonymization of Skeleton-Based Motion Visualization

Thomas Carr, Depeng Xu, Shuhan Yuan et al.

#15824

Highlight What You Want: Weakly-Supervised Instance-Level Controllable Infrared-Visible Image Fusion

Zeyu Wang, Jizheng Zhang, Haiyu Song et al.

ICCV 2025arXiv:2507.22731

#15825

GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation

Quanwei Yang, Luying Huang, Kaisiyuan Wang et al.

#15826

KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference

Rui Peng, Yuchen Lu, Qichen Sun et al.

NEURIPS 2025oralarXiv:2505.09664

#15827

Learning A Unified Template for Gait Recognition

Panjian Huang, Saihui Hou, Junzhou Huang et al.

#15828

ZFusion: Efficient Deep Compositional Zero-shot Learning for Blind Image Super-Resolution with Generative Diffusion Prior

Alireza Esmaeilzehi, Hossein Zaredar, Yapeng Tian et al.

#15829

FlowDPS : Flow-Driven Posterior Sampling for Inverse Problems

Jeongsol Kim, Bryan Sangwoo Kim, Jong Ye

#15830

HADES: Human Avatar with Dynamic Explicit Hair Strands

Zhanfeng Liao, Hanzhang Tu, Cheng Peng et al.

#15831

Unfolding-Associative Encoder-Decoder Network with Progressive Alignment for Pansharpening

Shijie Fang, Hongping Gan

#15832

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration

Tao Wang, Peiwen Xia, Bo Li et al.

ICCV 2025arXiv:2510.14025

#15833

NAPPure: Adversarial Purification for Robust Image Classification under Non-Additive Perturbations

Junjie Nan, Jianing Li, Wei Chen et al.

#15834

LOMM: Latest Object Memory Management for Temporally Consistent Video Instance Segmentation

Seunghun Lee, Jiwan Seo, Minwoo Choi et al.

#15835

EVDM: Event-based Real-world Video Deblurring with Mamba

Zhijing Sun, Senyan Xu, Kean Liu et al.

#15836

Q-Norm: Robust Representation Learning via Quality-Adaptive Normalization

Lanning Zhang, Ying Zhou, Fei Gao et al.

#15837

Proxy-Bridged Game Transformer for Interactive Extreme Motion Prediction

Yanwen Fang, Wenqi Jia, Xu Cao et al.

#15838

π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?

Susan Liang, Chao Huang, Yolo Yunlong Tang et al.

#15839

RobAVA: A Large-scale Dataset and Baseline Towards Video based Robotic Arm Action Understanding

Baoli Sun, Ning Wang, Xinzhu Ma et al.

#15840

VideoSetDiff: Identifying and Reasoning Similarities and Differences in Similar Videos

YUE QIU, Yanjun Sun, Takuma Yagi et al.

ICCV 2025arXiv:2508.12084

#15841

Generic Event Boundary Detection via Denoising Diffusion

Jaejun Hwang, Dayoung Gong, Manjin Kim et al.

#15842

Not All Degradations Are Equal: A Targeted Feature Denoising Framework for Generalizable Image Super-Resolution

hongjun wang, Jiyuan Chen, Zhengwei Yin et al.

ICCV 2025arXiv:2509.14841

#15843

Scaling Action Detection: AdaTAD++ with Transformer-Enhanced Temporal-Spatial Adaptation

Tanay Agrawal, Abid Ali, Antitza Dantcheva et al.

#15844

Fine-Grained 3D Gaussian Head Avatars Modeling from Static Captures via Joint Reconstruction and Registration

Yuan Sun, Xuan Wang, Cong Wang et al.

ICCV 2025arXiv:2503.08145

#15845

Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking

Yunhao Li, Yifan Jiao, Dan Meng et al.

#15846

MistSense: Versatile Online Detection of Procedural and Execution Mistakes

Constantin Patsch, Yuankai Wu, Marsil Zakour et al.

ICCV 2025arXiv:2412.14371

#15847

SEREP: Semantic Facial Expression Representation for Robust In-the-Wild Capture and Retargeting

Arthur Josi, Luiz Gustavo Hafemann, Abdallah Dib et al.

#15848

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation

Syed Talal Wasim, Hamid Suleman, Olga Zatsarynna et al.

ICCV 2025arXiv:2509.11394

#15849

GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule

Rui Wang, Yimu Sun, Jingxing Guo et al.

ICCV 2025arXiv:2512.10252

#15850

DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding

Thomas Kreutz, Max Mühlhäuser, Alejandro Sanchez Guinea

ICCV 2025arXiv:2506.13897

#15851

Efficient Concertormer for Image Deblurring and Beyond

Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.

ICCV 2025arXiv:2404.06135

#15852

TimeBooth: Disentangled Facial Invariant Representation for Diverse and Personalized Face Aging

Zepeng Su, zhulin liu, Zongyan Zhang et al.

ICCV 2025arXiv:2509.16968

#15853

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang et al.

#15854

Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

Zhengyao Lyu, Chenyang Si, Tianlin Pan et al.

ICCV 2025highlightarXiv:2507.10218

#15855

Straighten Viscous Rectified Flow via Noise Optimization

Jimin Dai, Jiexi Yan, Jian Yang et al.

#15856

Scalable Dual Fingerprinting for Hierarchical Attribution of Text-to-Image Models

Jianwei Fei, Yunshu Dai, Peipeng Yu et al.

ICCV 2025arXiv:2508.05001

#15857

CRAM: Large Scale Video Continual Learning with Bootstrapped Compression

Shivani Mall, Joao F. Henriques

#15858

Tree-NeRV: Efficient Non-Uniform Sampling for Neural Video Representation via Tree-Structured Feature Grids

Jiancheng Zhao, Yifan Zhan, Qingtian Zhu et al.

#15859

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Nisha Huang, Henglin Liu, Yizhou Lin et al.

ICCV 2025arXiv:2412.18386

#15860

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

#15861

Robust Adverse Weather Removal via Spectral-based Spatial Grouping

Yuhwan Jeong, Yunseo Yang, Youngho Yoon et al.

ICCV 2025arXiv:2507.22498

#15862

LUSD: Localized Update Score Distillation for Text-Guided Image Editing

Worameth Chinchuthakun, Tossaporn Saengja, Nontawat Tritrong et al.

ICCV 2025arXiv:2503.11054

#15863

FlowChef: Steering of Rectified Flow Models for Controlled Generations

Maitreya Patel, Song Wen, Dimitris Metaxas et al.

ICCV 2025arXiv:2507.18758

#15864

Learning Efficient and Generalizable Human Representation with Human Gaussian Model

Yifan Liu, Shengjun Zhang, Chensheng Dai et al.

#15865

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025arXiv:2508.10407

#15866

SynTag: Enhancing the Geometric Robustness of Inversion-based Generative Image Watermarking

Han Fang, Kejiang Chen, Zehua Ma et al.

#15867

Event-guided HDR Reconstruction with Diffusion Priors

Yixin Yang, jiawei zhang, Yang Zhang et al.

#15868

Fast Image Super-Resolution via Consistency Rectified Flow

Jiaqi Xu, Wenbo Li, Haoze Sun et al.

ICCV 2025highlightarXiv:2412.01794

#15869

IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Khaled Abud, Sergey Lavrushkin, Alexey Kirillov et al.

#15870

Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion

Jiwon Kim, Pureum Kim, SeonHwa Kim et al.

ICCV 2025arXiv:2508.09575

#15871

Anti-Tamper Protection for Unauthorized Individual Image Generation

Zelin Li, Ruohan Zong, Yifan Liu et al.

ICCV 2025arXiv:2508.06325

#15872

Continual Personalization for Diffusion Models

Yu-Chien Liao, Jr-Jen Chen, Chi-Pin Huang et al.

ICCV 2025arXiv:2510.02296

#15873

Split-and-Combine: Enhancing Style Augmentation for Single Domain Generalization

Zhen Zhang, Zhen Zhang, Qianlong Dang et al.

#15874

Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks

Alexander Ratzan, Sidharth Goel, Junhao Wen et al.

#15875

Zero-Shot Depth Aware Image Editing with Diffusion Models

Rishubh Parihar, Sachidanand VS, Venkatesh Babu Radhakrishnan

ICCV 2025arXiv:2506.21476

#15876

Global and Local Entailment Learning for Natural World Imagery

Srikumar Sastry, Aayush Dhakal, Eric Xing et al.

#15877

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Zhu Xu, Ting Lei, Zhimin Li et al.

ICCV 2025arXiv:2508.04943

#15878

Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Yuran Dong, Mang Ye

ICCV 2025arXiv:2507.03402

#15879

Who Controls the Authorization? Invertible Networks for Copyright Protection in Text-to-Image Synthesis

Baoyue Hu, Yang Wei, Junhao Xiao et al.

#15880

MBTI: Masked Blending Transformers with Implicit Positional Encoding for Frame-rate Agnostic Motion Estimation

Jungwoo Huh, Yeseung Park, Seongjean Kim et al.

#15881

FontAnimate: High Quality Few-shot Font Generation via Animating Font Transfer Process

Bin Fu, Zixuan Wang, Kainan Yan et al.

#15882

LDIP: Long Distance Information Propagation for Video Super-Resolution

Michael Bernasconi, Abdelaziz Djelouah, Yang Zhang et al.

#15883

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.

ICCV 2025arXiv:2410.09879

#15884

TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control

Zhenyu Yan, Jian Wang, Aoqiang Wang et al.

#15885

Continuous-Time Human Motion Field from Event Cameras

Ziyun Wang, Ruijun Zhang, Zi-Yan Liu et al.

#15886

Beyond Perspective: Neural 360-Degree Video Compression

Andy Regensky, Marc Windsheimer, Fabian Brand et al.

#15887

MCID: Multi-aspect Copyright Infringement Detection for Generated Images

Chuanwei Huang, Zexi Jia, Hongyan Fei et al.

#15888

Text2Outfit: Controllable Outfit Generation with Multimodal Language Models

Yuanhao Zhai, Yen-Liang Lin, Minxu Peng et al.

#15889

WarpHE4D: Dense 4D Head Map toward Full Head Reconstruction

Jongseob Yun, Yong-Hoon Kwon, Min-Gyu Park et al.

#15890

One-Step Specular Highlight Removal with Adapted Diffusion Models

Mahir Atmis, LEVENT KARACAN, Mehmet SARIGÜL

ICCV 2025arXiv:2507.00429

#15891

DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting

Jingyi Pan, Dan Xu, Qiong Luo

#15892

Laboring on less labors: RPCA Paradigm for Pan-sharpening

honghui xu, Chuangjie Fang, Yibin Wang et al.

ICCV 2025arXiv:2508.00085

#15893

Punching Bag vs. Punching Person: Motion Transferability in Videos

Raiyaan Abdullah, Jared Claypoole, Michael Cogswell et al.

#15894

From Linearity to Non-Linearity: How Masked Autoencoders Capture Spatial Correlations

Anthony Bisulco, Rahul Ramesh, Randall Balestriero et al.

ICCV 2025arXiv:2508.15404

#15895

Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets

Dale Decatur, Thibault Groueix, Wang Yifan et al.

ICCV 2025arXiv:2508.21032

#15896

MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks

Rui Jiao, Hanlin Wu, Wenbing Huang et al.

#15897

Cross-Granularity Online Optimization with Masked Compensated Information for Learned Image Compression

Haowei Kuang, Wenhan Yang, Zongming Guo et al.

ICCV 2025arXiv:2510.16319

#15898

Stroke2Sketch: Harnessing Stroke Attributes for Training-Free Sketch Generation

Rui Yang, Huining Li, Yiyi Long et al.

#15899

Robust Test-Time Adaptation for Single Image Denoising Using Deep Gaussian Prior

Qing Ma, Pengwei Liang, Xiong Zhou et al.

ICCV 2025arXiv:2506.23263

#15900

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

#15901

TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

Minghao Fu, Guo-Hua Wang, Xiaohao Chen et al.

ICCV 2025arXiv:2507.18192

#15902

FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Minghan LI, Chenxi Xie, Yichen Wu et al.

#15903

TrackVerse: A Large-Scale Object-Centric Video Dataset for Image-Level Representation Learning

Yibing Wei, Samuel Church, Victor Suciu et al.

#15904

DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads

Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.

#15905

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

Bowen Fu, Wei Wei, Jiaqi Tang et al.

ICCV 2025arXiv:2503.13859

#15906

Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.

#15907

Drawing Developmental Trajectory from Cortical Surface Reconstruction

WENXUAN WU, ruowen qu, Zhongliang Liu et al.

#15908

Toward Better Out-painting: Improving the Image Composition with Initialization Policy Model

Xuan Han, Yihao Zhao, Yanhao Ge et al.

#15909

Blind Noisy Image Deblurring Using Residual Guidance Strategy

Heyan Liu, Jianing Sun, Jun Liu et al.

#15910

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

Runtao Liu, I Chen, Jindong Gu et al.

#15911

DiffIP: Representation Fingerprints for Robust IP Protection of Diffusion Models

Zhuoling Li, Haoxuan Qu, Jason Kuen et al.

ICCV 2025highlightarXiv:2508.10637

#15912

Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

Ryan Ramos, Vladan Stojnić, Giorgos Kordopatis-Zilos et al.

#15913

AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild

Siyoon Jin, Jisu Nam, Jiyoung Kim et al.

ICCV 2025arXiv:2601.14625

#15914

Diffusion Epistemic Uncertainty with Asymmetric Learning for Diffusion-Generated Image Detection

Yingsong Huang, Hui Guo, Jing Huang et al.

#15915

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval

Luca Collorone, Matteo Gioia, Massimiliano Pappa et al.

ICCV 2025arXiv:2510.03200

#15916

E-NeMF: Event-based Neural Motion Field for Novel Space-time View Synthesis of Dynamic Scenes

Yan Liu, Zehao Chen, Haojie Yan et al.

#15917

Multi-modal Identity Extraction

Ryan Webster, Teddy Furon

ICCV 2025arXiv:2509.08777

#15918

Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles

Eric Slyman, Mehrab Tanjim, Kushal Kafle et al.

#15919

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

Zeyi Sun, Ziyang Chu, Pan Zhang et al.

#15920

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion

Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.

#15921

Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening

Hebaixu Wang, Jiayi Ma

#15922

Open-World Skill Discovery from Unsegmented Demonstration Videos

Jingwen Deng, Zihao Wang, Shaofei Cai et al.

#15923

Instruction-based Image Editing with Planning, Reasoning, and Generation

Liya Ji, Chenyang Qi, Qifeng Chen

#15924

HDR Image Generation via Gain Map Decomposed Diffusion

Yuanshen Guan, Ruikang Xu, Yinuo Liao et al.

ICCV 2025highlightarXiv:2508.10896

#15925

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Jongseo Lee, Kyungho Bae, Kyle Min et al.

#15926

ArtEditor: Learning Customized Instructional Image Editor from Few-Shot Examples

Shijie Huang, Yiren Song, Yuxuan Zhang et al.

ICCV 2025arXiv:2504.07949

#15927

InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians

Kefan Chen, Sergiu Oprea, Justin Theiss et al.

#15928

A3GS: Arbitrary Artistic Style into Arbitrary 3D Gaussian Splatting

Zhiyuan Fang, Rengan Xie, Xuancheng Jin et al.

NEURIPS 2025arXiv:2509.07530

#15929

Universal Few-shot Spatial Control for Diffusion Models

Kiet Nguyen, Chanhyuk Lee, Donggyun Kim et al.

#15930

Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.

ICCV 2025arXiv:2508.14187

#15931

Free2Guide: Training-Free Text-to-Video Alignment using Image LVLM

Jaemin Kim, Bryan Sangwoo Kim, Jong Ye

#15932

VideoVAE+: Large Motion Video Autoencoding with Cross-modal Video VAE

Yazhou Xing, Yang Fei, Yingqing He et al.

ICCV 2025arXiv:2510.00778

#15933

DIA: The Adversarial Exposure of Deterministic Inversion in Diffusion Models

SeungHoo Hong, GeonHo Son, Juhun Lee et al.

#15934

GFPack++: Attention-Driven Gradient Fields for Optimizing 2D Irregular Packing

Tianyang Xue, Lin Lu, Yang Liu et al.

ICCV 2025arXiv:2507.03257

#15935

LACONIC: A 3D Layout Adapter for Controllable Image Creation

Léopold Maillard, Tom Durand, Adrien RAMANANA RAHARY et al.

#15936

Collaborative and Confidential Junction Trees for Hybrid Bayesian Networks

Roberto Gheda, Abele Mălan, Thiago Guzella et al.

ICCV 2025arXiv:2503.21943

#15937

Parametric Shadow Control for Portrait Generation in Text-to-Image Diffusion Models

Haoming Cai, Tsung-Wei Huang, Shiv Gehlot et al.

#15938

Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers

An Lun Liu, Yu-Wei Chao, Yi-Ting Chen

ICCV 2025arXiv:2507.11287

#15939

RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration

Longxin Kou, Fei Ni, Jianye HAO et al.

#15940

EEGMirror: Leveraging EEG data in the wild via Montage-Agnostic Self-Supervision for EEG to Video Decoding

Xuan-Hao Liu, Bao-liang Lu, Wei-Long Zheng

ICCV 2025arXiv:2503.09675

#15941

Accelerating Diffusion Sampling via Exploiting Local Transition Coherence

shangwen zhu, Han Zhang, Zhantao Yang et al.

#15942

UniversalBooth: Model-Agnostic Personalized Text-to-Image Generation

Songhua Liu, Ruonan Yu, Xinchao Wang

ICCV 2025arXiv:2508.12341

#15943

Semantic Discrepancy-aware Detector for Image Forgery Identification

Wang Ziye, Minghang Yu, Chunyan Xu et al.

#15944

What If: Understanding Motion Through Sparse Interactions

Stefan A. Baumann, Nick Stracke, Timy Phan et al.

#15945

Fractional Langevin Dynamics for Combinatorial Optimization via Polynomial-Time Escape

Shiyue Wang, Ziao Guo, Changhong Lu et al.

ICCV 2025arXiv:2503.11883

#15946

Gain-MLP: Improving HDR Gain Map Encoding via a Lightweight MLP

Trevor Canham, SaiKiran Tedla, Michael Murdoch et al.

#15947

Error Recognition in Procedural Videos using Generalized Task Graph

Shih-Po Lee, Ehsan Elhamifar

#15948

WalkVLM: Aid Visually Impaired People Walking by Vision Language Model

Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.

ICCV 2025arXiv:2510.05660

#15949

Teleportraits: Training-Free People Insertion into Any Scene

Jialu Gao, Joseph K J, Fernando De la Torre

#15950

Unified Category-Level Object Detection and Pose Estimation from RGB Images using 3D Prototypes

Tom Fischer, Xiaojie Zhang, Eddy Ilg

ICCV 2025arXiv:2508.02157

#15951

Proactive Scene Decomposition and Reconstruction

Baicheng Li, Zike Yan, Dong Wu et al.

ICCV 2025arXiv:2510.16272

#15952

PASD: A Pixel-Adaptive Swarm Dynamics Approach for Unsupervised Low-Light Image Enhancement

Shuai Jin, Yuhua Qian, Feijiang Li et al.

#15953

QK-Edit: Revisiting Attention-based Injection in MM-DiT for Image and Video Editing

Tiancheng SHEN, Jun Hao Liew, Zilong Huang et al.

#15954

Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification

Wajahat Khalid, Bin Liu, Xulin Li et al.

#15955

Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection

Yichen Lu, Siwei Nie, Minlong Lu et al.

#15956

Beyond Brain Decoding: Visual-Semantic Reconstructions to Mental Creation Extension Based on fMRI

Haodong Jing, Dongyao Jiang, Yongqiang Ma et al.

#15957

PixTalk: Controlling Photorealistic Image Processing and Editing with Language

Marcos Conde, Zihao Lu, Radu Timofte

ICCV 2025arXiv:2507.16397

#15958

ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement

KA WONG, Jicheng Zhou, Haiwei Wu et al.

#15959

Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification

Wenkui Yang, Jie Cao, Junxian Duan et al.

ICCV 2025highlightarXiv:2509.13922

#15960

A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness

Xiaoyi Feng, Tao Huang, Peng Wang et al.

ICCV 2025arXiv:2503.06364

#15961

Generative Video Bi-flow

Chen Liu, Tobias Ritschel

#15962

Combinative Matching for Geometric Shape Assembly

Nahyuk Lee, Juhong Min, Junhong Lee et al.

ICCV 2025highlightarXiv:2508.09780

#15963

LayerLock: Non-collapsing Representation Learning with Progressive Freezing

Goker Erdogan, Nikhil Parthasarathy, Catalin Ionescu et al.

ICCV 2025arXiv:2509.10156

#15964

Dream-to-Recon: Monocular 3D Reconstruction with Diffusion-Depth Distillation from Single Images

Philipp Wulff, Felix Wimbauer, Dominik Muhle et al.

ICCV 2025arXiv:2508.02323

#15965

JPEG Processing Neural Operator for Backward-Compatible Coding

Woo Kyoung Han, Yongjun Lee, Byeonghun Lee et al.

ICCV 2025arXiv:2507.23521

#15966

All Parts Matter: A Unified Mask-Free Virtual Try-On Framework

Chenghu Du, Shengwu Xiong, Yi Rong

#15967

Function-centric Bayesian Network for Zero-Shot Object Goal Navigation

Sixian Zhang, Xinyao Yu, Xinhang Song et al.

ICCV 2025arXiv:2508.00746

#15968

GECO: Geometrically Consistent Embedding with Lightspeed Inference

Regine Hartwig, Dominik Muhle, Riccardo Marin et al.

#15969

Hate in Plain Sight: On the Risks of Moderating AI-Generated Hateful Illusions

Yiting Qu, Ziqing Yang, Yihan Ma et al.

ICCV 2025arXiv:2507.22617

#15970

MH-LVC: Multi-Hypothesis Temporal Prediction for Learned Conditional Residual Video Coding

Gao Zong lin, Huu-Tai Phung, Yi-Chen Yao et al.

ICCV 2025arXiv:2510.12479

#15971

Fast Non-Log-Concave Sampling under Nonconvex Equality and Inequality Constraints with Landing

Kijung Jeon, Michael Muehlebach, Molei Tao

NEURIPS 2025arXiv:2510.22044

#15972

On the Provable Importance of Gradients for Autonomous Language-Assisted Image Clustering

Bo Peng, Jie Lu, Guangquan Zhang et al.

#15973

Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation

You Huang, Lichao Chen, Jiayi Ji et al.

#15974

GenHaze: Pioneering Controllable One-Step Realistic Haze Generation for Real-World Dehazing

Sixiang Chen, Tian Ye, Yunlong Lin et al.

ICCV 2025highlightarXiv:2507.01409

#15975

CaptionSmiths: Flexibly Controlling Language Pattern in Image Captioning

Kuniaki Saito, Donghyun Kim, Kwanyong Park et al.

#15976

An Efficient Hybrid Vision Transformer for TinyML Applications

Fanhong Zeng, Huanan LI, Juntao Guan et al.

#15977

ReCoT: Reflective Self-Correction Training for Mitigating Confirmation Bias in Large Vision-Language Models

Mengxue Qu, Yibo Hu, Kunyang Han et al.

ICCV 2025arXiv:2507.17651

#15978

CNS-Bench: Benchmarking Image Classifier Robustness Under Continuous Nuisance Shifts

Olaf Dünkel, Artur Jesslen, Jiahao Xie et al.

#15979

Multi-Schema Proximity Network for Composed Image Retrieval

Jiangming Shi, Xiangbo Yin, yeyunchen yeyunchen et al.

ICCV 2025arXiv:2601.06883

#15980

MixRI: Mixing Features of Reference Images for Novel Object Pose Estimation

Xinhang Liu, Jiawei Shi, Zheng Dang et al.

#15981

Mixed-Sample SGD: an End-to-end Analysis of Supervised Transfer Learning

Yuyang Deng, Samory Kpotufe

NEURIPS 2025arXiv:2507.04194

#15982

ESCNet:Edge-Semantic Collaborative Network for Camouflaged Object Detection

Sheng Ye, Xin Chen, Yan Zhang et al.

#15983

Quasi-Self-Concordant Optimization with $\ell_{\infty}$ Lewis Weights

Alina Ene, Ta Duy Nguyen, Adrian Vladu

ICCV 2025arXiv:2508.01008

#15984

ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation

Cihang Peng, Qiming HOU, Zhong Ren et al.

#15985

Too Late to Recall: Explaining the Two-Hop Problem in Multimodal Knowledge Retrieval

Constantin Venhoff, Ashkan Khakzar, Sonia Joseph et al.

NEURIPS 2025arXiv:2512.03276

#15986

Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation

Shuo Jin, Siyue Yu, Bingfeng Zhang et al.

#15987

DiffPS: Leveraging Prior Knowledge of Diffusion Model for Person Search

Giyeol Kim, Sooyoung Yang, Jihyong Oh et al.

ICCV 2025arXiv:2507.10318

#15988

Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching

Yuhan Liu, Jingwen Fu, Yang Wu et al.

#15989

Representation Shift: Unifying Token Compression with FlashAttention

Joonmyung Choi, Sanghyeok Lee, Byungoh Ko et al.

ICCV 2025arXiv:2508.00367

#15990

ZipVL: Accelerating Vision-Language Models through Dynamic Token Sparsity

Yefei He, Feng Chen, Jing Liu et al.

ICCV 2025arXiv:2506.21835

#15991

ProSAM: Enhancing the Robustness of SAM-based Visual Reference Segmentation with Probabilistic Prompts

Xiaoqi Wang, Clint Sebastian, Wenbin He et al.

#15992

LaCoOT: Layer Collapse through Optimal Transport

Victor Quétu, Zhu LIAO, Nour Hezbri et al.

ICCV 2025arXiv:2406.08933

#15993

Zero-Shot Compositional Video Learning with Coding Rate Reduction

Heeseok Jung, Jun-Hyeon Bak, Yujin Jeong et al.

#15994

Fuzzy Contrastive Decoding to Alleviate Object Hallucination in Large Vision-Language Models

Jieun Kim, Jinmyeong Kim, Yoonji Kim et al.

#15995

Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement

Xin Shen, Xinyu Wang, Lei Shen et al.

ICCV 2025arXiv:2503.17071

#15996

Superpowering Open-Vocabulary Object Detectors for X-ray Vision

Pablo Garcia-Fernandez, Lorenzo Vaquero, Mingxuan Liu et al.

#15997

RhythmGuassian: Repurposing Generalizable Gaussian Model For Remote Physiological Measurement

Hao LU, Yuting Zhang, Jiaqi Tang et al.

#15998

On the Recovery of Cameras from Fundamental Matrices

Rakshith Madhavan, Federica Arrigoni