Most Cited ICCV "microtransactions" Papers

2,701 papers found • Page 13 of 14

#2401

Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation

Shengfang Zhai, Jiajun Li, Yue Liu et al.

ICCV 2025 • highlight • arXiv:2503.06453
#2402

Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning

Liwei Luo, Shuaitengyuan Li, Dongwei Ren et al.

ICCV 2025 • poster • arXiv:2511.03245
#2403

ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation

Qizhen Lan, Qing Tian

ICCV 2025 • poster • arXiv:2503.06307
#2404

GReg: Geometry-Aware Region Refinement for Sign Language Video Generation

Tongkai Shi, Lianyu Hu, Fanhua Shang et al.

ICCV 2025 • poster
#2405

Unsupervised Part Discovery via Descriptor-Based Masked Image Restoration with Optimized Constraints

Jiahao Xia, Yike Wu, Wenjian Huang et al.

ICCV 2025 • poster • arXiv:2507.11985
#2406

NETracer: A Topology-Aware Iterative Tracing Approach for Tubular Structure Extraction

Chao Liu, Yangbo Jiang, Nenggan Zheng

ICCV 2025 • poster
#2407

MotionCtrl: A Real-time Controllable Vision-Language-Motion Model

Bin Cao, Sipeng Zheng, Ye Wang et al.

ICCV 2025 • poster
#2408

UIPro: Unleashing Superior Interaction Capability For GUI Agents

Hongxin Li, Jingran Su, Jingfan Chen et al.

ICCV 2025 • poster • arXiv:2509.17328
#2409

SALAD -- Semantics-Aware Logical Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ICCV 2025 • poster • arXiv:2509.02101
#2410

FineMotion: A Dataset and Benchmark with both Spatial and Temporal Annotation for Fine-grained Motion Generation and Editing

Bizhu Wu, Jinheng Xie, Meidan Ding et al.

ICCV 2025 • poster • arXiv:2507.19850
#2411

VLR-Driver: Large Vision-Language-Reasoning Models for Embodied Autonomous Driving

Fanjie Kong, Yitong Li, Weihuang Chen et al.

ICCV 2025 • poster
#2412

Vid-Group: Temporal Video Grounding Pretraining from Unlabeled Videos in the Wild

Peijun Bao, Chenqi Kong, Siyuan Yang et al.

ICCV 2025 • poster
#2413

Knowledge Transfer from Interaction Learning

Yilin Gao, Kangyi Chen, Zhongxing Peng et al.

ICCV 2025 • poster • arXiv:2509.18733
#2414

WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction

Richard Liu, Daniel Fu, Noah Tan et al.

ICCV 2025 • poster • arXiv:2505.04813
#2415

Temperature in Cosine-based Softmax Loss

Takumi Kobayashi

ICCV 2025 • poster
#2416

Multi-modal Segment Anything Model for Camouflaged Scene Segmentation

Guangyu Ren, Hengyan Liu, Michalis Lazarou et al.

ICCV 2025 • poster
#2417

Synthesizing Near-Boundary OOD Samples for Out-of-Distribution Detection

Jinglun Li, Kaixun Jiang, Zhaoyu Chen et al.

ICCV 2025 • highlight • arXiv:2507.10225
#2418

Cassic: Towards Content-Adaptive State-Space Models for Learned Image Compression

Shiyu Qin, Jinpeng Wang, Yimin Zhou et al.

ICCV 2025 • poster
#2419

SpectralAR: Spectral Autoregressive Visual Generation

Yuanhui Huang, Weiliang Chen, Wenzhao Zheng et al.

ICCV 2025 • poster • arXiv:2506.10962
#2420

Boosting Adversarial Transferability via Negative Hessian Trace Regularization

Yunfei Long, Zilin Tian, Liguo Zhang et al.

ICCV 2025 • poster
#2421

AcZeroTS: Active Learning for Zero-shot Tissue Segmentation in Pathology Images

Jiao Tang, Junjie Zhou, Bo Qian et al.

ICCV 2025 • poster
#2422

OneGT: One-Shot Geometry-Texture Neural Rendering for Head Avatars

Jinshu Chen, Bingchuan Li, Fan Zhang et al.

ICCV 2025 • poster
#2423

Unsupervised Visible-Infrared Person Re-identification under Unpaired Settings

Haoyu Yao, Bin Yang, Wenke Huang et al.

ICCV 2025 • poster
#2424

Adaptive Prompt Learning via Gaussian Outlier Synthesis for Out-of-distribution Detection

Yongkang Zhang, Dongyu She, Zhong Zhou

ICCV 2025 • poster
#2425

Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions

ZiYi Dong, Chengxing Zhou, Weijian Deng et al.

ICCV 2025 • poster • arXiv:2504.21292
#2426

Ultra-Precision 6DoF Pose Estimation Using 2-D Interpolated Discrete Fourier Transform

Guowei Shi, Zian Mao, Peisen Huang

ICCV 2025 • poster
#2427

A Differentiable Wave Optics Model for End-to-End Computational Imaging System Optimization

Chi-Jui Ho, Yash Belhe, Steve Rotenberg et al.

ICCV 2025 • poster • arXiv:2412.09774
#2428

AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation

Haifeng Zhong, Fan Tang, Zhuo Chen et al.

ICCV 2025 • poster
#2429

OCK: Unsupervised Dynamic Video Prediction with Object-Centric Kinematics

YeonJi Song, Jaein Kim, Suhyung Choi et al.

ICCV 2025 • poster • arXiv:2404.18423
#2430

Prompt Guidance and Human Proximal Perception for HOT Prediction with Regional Joint Loss

Yuxiao Wang, Yu Lei, Zhenao Wei et al.

ICCV 2025 • poster • arXiv:2507.01630
#2431

Coupling the Generator with Teacher for Effective Data-Free Knowledge Distillation

Xu Chen, Yang Li, Yahong Han et al.

ICCV 2025 • poster
#2432

Towards a Universal Image Degradation Model via Content-Degradation Disentanglement

Wenbo Yang, Zhongling Wang, Zhou Wang

ICCV 2025 • poster • arXiv:2505.12860
#2433

Intra-view and Inter-view Correlation Guided Multi-view Novel Class Discovery

Xinhang Wan, Jiyuan Liu, Qian Qu et al.

ICCV 2025 • poster • arXiv:2507.12029
#2434

HUST: High-Fidelity Unbiased Skin Tone Estimation via Texture Quantization

Zimin Ran, Xingyu Ren, Xiang An et al.

ICCV 2025 • poster
#2435

Know Your Attention Maps: Class-specific Token Masking for Weakly Supervised Semantic Segmentation

Joëlle Hanna, Damian Borth

ICCV 2025 • poster • arXiv:2507.06848
#2436

Structure-Guided Diffusion Models for High-Fidelity Portrait Shadow Removal

Wanchang Yu, Qing Zhang, Rongjia Zheng et al.

ICCV 2025 • poster • arXiv:2507.04692
#2437

FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment

Hang Xu, Jie Huang, Linjiang Huang et al.

ICCV 2025 • poster • arXiv:2506.22509
#2438

ProbMED: A Probabilistic Framework for Medical Multimodal Binding

Yuan Gao, Sangwook Kim, Jianzhong You et al.

ICCV 2025 • poster • arXiv:2509.25711
#2439

Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective

Yingyu Liang, Zhizhou Sha, Zhenmei Shi et al.

ICCV 2025 • poster • arXiv:2405.16418
#2440

FDPT: Federated Discrete Prompt Tuning for Black-Box Visual-Language Models

Jiaqi Wu, Simin Chen, Jing Tang et al.

ICCV 2025 • poster
#2441

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

Duo Wu, Jinghe Wang, Yuan Meng et al.

ICCV 2025 • poster • arXiv:2411.16313
#2442

Dynamic Group Detection using VLM-augmented Temporal Groupness Graph

Kaname Yokoyama, Chihiro Nakatani, Norimichi Ukita

ICCV 2025 • poster • arXiv:2509.04758
#2443

A Tiny Change, A Giant Leap: Long-Tailed Class-Incremental Learning via Geometric Prototype Alignment

Xinyi Lai, Luojun Lin, Weijie Chen et al.

ICCV 2025 • poster
#2444

CountSE: Soft Exemplar Open-set Object Counting

Shuai Liu, Peng Zhang, Shiwei Zhang et al.

ICCV 2025 • highlight
#2445

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025 • highlight • arXiv:2505.02178
#2446

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices

Xudong Lu, Yinghao Chen, Renshou Wu et al.

ICCV 2025 • poster • arXiv:2503.06019
#2447

MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation

Xinyu Liu, Guolei Sun, Cheng Wang et al.

ICCV 2025 • poster • arXiv:2509.21265
#2448

Top2Pano: Learning to Generate Indoor Panoramas from Top-Down View

Zitong Zhang, Suranjan Gautam, Rui Yu

ICCV 2025 • poster • arXiv:2507.21371
#2449

MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction

Yaopeng Lou, Liao Shen, Tianqi Liu et al.

ICCV 2025 • poster • arXiv:2508.04297
#2450

Region-Level Data Attribution for Text-to-Image Generative Models

Trong Bang Nguyen, Phi Le Nguyen, Simon Lucey et al.

ICCV 2025 • poster
#2451

Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting

Yuekun Dai, Haitian Li, Shangchen Zhou et al.

ICCV 2025 • poster • arXiv:2508.01098
#2452

Generalization-Preserved Learning: Closing the Backdoor to Catastrophic Forgetting in Continual Deepfake Detection

Xueyi Zhang, Peiyin Zhu, Chengwei Zhang et al.

ICCV 2025 • poster
#2453

LangBridge: Interpreting Image as a Combination of Language Embeddings

Jiaqi Liao, Yuwei Niu, Fanqing Meng et al.

ICCV 2025 • poster • arXiv:2503.19404
#2454

IGD: Instructional Graphic Design with Multimodal Layer Generation

Yadong Qu, Shancheng Fang, Yuxin Wang et al.

ICCV 2025 • poster • arXiv:2507.09910
#2455

Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection

Romain Thoreau, Valerio Marsocci, Dawa Derksen

ICCV 2025 • poster • arXiv:2503.09493
#2456

CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction

Yuanyuan Gao, Hao Li, Jiaqi Chen et al.

ICCV 2025 • poster • arXiv:2503.23044
#2457

AIRA: Activation-Informed Low-Rank Adaptation for Large Models

Lujun Li, Dezhi Li, Cheng Lin et al.

ICCV 2025 • poster
#2458

Embodied Navigation with Auxiliary Task of Action Description Prediction

Haru Kondoh, Asako Kanezaki

ICCV 2025 • poster • arXiv:2510.21809
#2459

Face Retouching with Diffusion Data Generation and Spectral Restorement

Zhidan Xu, Xiaoqin Zhang, Shijian Lu

ICCV 2025 • poster
#2460

Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder

Wonwoong Cho, Yan-Ying Chen, Matthew Klenk et al.

ICCV 2025 • highlight • arXiv:2503.11937
#2461

Neural Solver of Dichromatic Reflection Model for Specular Highlight Removal

Gang Fu

ICCV 2025 • poster
#2462

Wavelet Policy: Lifting Scheme for Policy Learning in Long-Horizon Tasks

Hao Huang, Shuaihang Yuan, Geeta Chandra Raju Bethala et al.

ICCV 2025 • poster • arXiv:2507.04331
#2463

Contrastive Flow Matching

George Stoica, Vivek Ramanujan, Xiang Fan et al.

ICCV 2025 • poster • arXiv:2506.05350
#2464

Class Token as Proxy: Optimal Transport-assisted Proxy Learning for Weakly Supervised Semantic Segmentation

Jian Wang, Tianhong Dai, Bingfeng Zhang et al.

ICCV 2025 • poster
#2465

HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation

Qinqian Lei, Bo Wang, Robby Tan

ICCV 2025 • poster • arXiv:2507.15542
#2466

AllGCD: Leveraging All Unlabeled Data for Generalized Category Discovery

Xinzi Cao, Ke Chen, Feidiao Yang et al.

ICCV 2025 • poster
#2467

Towards Long-Horizon Vision-Language-Action System: Reasoning, Acting and Memory

Daixun Li, Yusi Zhang, Mingxiang Cao et al.

ICCV 2025 • poster
#2468

UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Dayong Su, Yafei Zhang, Huafeng Li et al.

ICCV 2025 • poster • arXiv:2506.22736
#2469

3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Lukas Höllein, Aljaz Bozic, Michael Zollhöfer et al.

ICCV 2025 • poster • arXiv:2409.12892
#2470

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene

Xiao Chen, Tai Wang, Quanyi Li et al.

ICCV 2025 • poster • arXiv:2505.20294
#2471

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

Zhixiang Guo, Siyuan Liang, Aishan Liu et al.

ICCV 2025 • poster • arXiv:2412.01528
#2472

CA2C: A Prior-Knowledge-Free Approach for Robust Label Noise Learning via Asymmetric Co-learning and Co-training

Mengmeng Sheng, Zeren Sun, Tianfei Zhou et al.

ICCV 2025 • poster
#2473

Learnable Logit Adjustment for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch

Lee Hyuck, Taemin Park, Heeyoung Kim

ICCV 2025 • poster
#2474

CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor

Han Ji, Yuqi Feng, Jiahao Fan et al.

ICCV 2025 • poster • arXiv:2506.04001
#2475

TCFG: Truncated Classifier-Free Guidance for Efficient and Scalable Text-to-Image Acceleration

Xiaomeng Fu, Jia Li

ICCV 2025 • poster
#2476

Point Cloud Self-supervised Learning via 3D to Multi-view Masked Learner

Zhimin Chen, Xuewei Chen, Xiao Guo et al.

ICCV 2025 • poster • arXiv:2311.10887
#2477

MSA2: Multi-task Framework with Structure-aware and Style-adaptive Character Representation for Open-set Chinese Text Recognition

Yangfu Li, Hongjian Zhan, Qi Liu et al.

ICCV 2025 • poster
#2478

DiffPCI: Large Motion Point Cloud frame Interpolation with Diffusion Model

Tianyu Zhang, Haobo Jiang, Jian Yang et al.

ICCV 2025 • poster
#2479

MultiModal Action Conditioned Video Simulation

Yichen Li, Antonio Torralba

ICCV 2025 • poster
#2480

Local Dense Logit Relations for Enhanced Knowledge Distillation

Liuchi Xu, Kang Liu, Jinshuai Liu et al.

ICCV 2025 • poster • arXiv:2507.15911
#2481

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Hao Chen, Shell Xu Hu, Wayne Luk et al.

ICCV 2025 • poster • arXiv:2503.12649
#2482

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

Jiahe Zhao, RuiBing Hou, Zejie Tian et al.

ICCV 2025 • poster • arXiv:2503.12955
#2483

Soft Local Completeness: Rethinking Completeness in XAI

Ziv Weiss Haddad, Oren Barkan, Yehonatan Elisha et al.

ICCV 2025 • poster
#2484

ClearSight: Human Vision-Inspired Solutions for Event-Based Motion Deblurring

Xiaopeng Lin, Yulong Huang, Hongwei Ren et al.

ICCV 2025 • poster • arXiv:2501.15808
#2485

PBFG: A New Physically-Based Dataset and Removal of Lens Flares and Glares

Jie Zhu, Sungkil Lee

ICCV 2025 • poster
#2486

Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild

Haoran Wang, Zekun Li, Jian Zhang et al.

ICCV 2025 • poster • arXiv:2508.07759
#2487

An Information-Theoretic Regularizer for Lossy Neural Image Compression

Zhang Yingwen, Meng Wang, Xihua Sheng et al.

ICCV 2025 • poster • arXiv:2411.16727
#2488

Knowledge-Guided Part Segmentation

Xuejian Gou, Fang Liu, Licheng Jiao et al.

ICCV 2025 • poster
#2489

Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation

Yooshin Cho, Hanbyel Cho, Janghyeon Lee et al.

ICCV 2025 • poster • arXiv:2507.20284
#2490

KV-Edit: Training-Free Image Editing for Precise Background Preservation

Tianrui Zhu, Shiyi Zhang, Jiawei Shao et al.

ICCV 2025 • poster • arXiv:2502.17363
#2491

FusionPhys: A Flexible Framework for Fusing Complementary Sensing Modalities in Remote Physiological Measurement

Chenhang Ying, Huiyu Yang, Jieyi Ge et al.

ICCV 2025 • poster
#2492

DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations

Xiaohui Li, Yihao Liu, Shuo Cao et al.

ICCV 2025 • poster • arXiv:2501.10110
#2493

Power of Cooperative Supervision: Multiple Teachers Framework for Advanced 3D Semi-Supervised Object Detection

Jin-Hee Lee, Jae-keun Lee, Jeseok Kim et al.

ICCV 2025 • poster
#2494

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025 • poster • arXiv:2504.21414
#2495

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching

Yuxuan Yuan, Luyao Tang, Chaoqi Chen et al.

ICCV 2025 • poster
#2496

DADet: Safeguarding Image Conditional Diffusion Models against Adversarial and Backdoor Attacks via Diffusion Anomaly Detection

Hongwei Yu, Xinlong Ding, Jiawei Li et al.

ICCV 2025 • highlight
#2497

LEGO-Maker: A Semantic-Driven Algorithm for Text-to-3D Generation

Yifei Zhang, Lei Chen

ICCV 2025 • poster
#2498

COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion

Zekun Qian, Ruize Han, Zhixiang Wang et al.

ICCV 2025 • poster
#2499

Dense Policy: Bidirectional Autoregressive Learning of Actions

Yue Su, Xinyu Zhan, Hongjie Fang et al.

ICCV 2025 • poster • arXiv:2503.13217
#2500

monoVLN: Bridging the Observation Gap between Monocular and Panoramic Vision and Language Navigation

Ren-Jie Lu, Yu Zhou, Hao Cheng et al.

ICCV 2025 • poster
#2501

DOGR: Towards Versatile Visual Document Grounding and Referring

Yinan Zhou, Yuxin Chen, Haokun Lin et al.

ICCV 2025 • poster • arXiv:2411.17125
#2502

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Xiwei Xuan, Ziquan Deng, Kwan-Liu Ma

ICCV 2025 • highlight • arXiv:2506.21233
#2503

MonoMobility: Zero-Shot 3D Mobility Analysis from Monocular Videos

Hongyi Zhou, Xiaogang Wang, Yulan Guo et al.

ICCV 2025 • poster • arXiv:2505.11868
#2504

Performing Defocus Deblurring by Modeling its Formation Process

Zhengbo Zhang, Lin Geng Foo, Hossein Rahmani et al.

ICCV 2025 • poster
#2505

CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance

Peiqi Chen, Lei Yu, Yi Wan et al.

ICCV 2025 • highlight • arXiv:2507.17312
#2506

Supervised Exploratory Learning for Long-Tailed Visual Recognition

Zhongquan Jian, Yanhao Chen, Wangyancheng Wangyancheng et al.

ICCV 2025 • poster
#2507

MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance

Zihan Cao, Yu Zhong, Ziqi Wang et al.

ICCV 2025 • poster • arXiv:2503.14944
#2508

Blind Video Super-Resolution based on Implicit Kernels

Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.

ICCV 2025 • poster • arXiv:2503.07856
#2509

OmniDiff: A Comprehensive Benchmark for Fine-grained Image Difference Captioning

Yuan Liu, Saihui Hou, Saijie Hou et al.

ICCV 2025 • poster • arXiv:2503.11093
#2510

Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts

Chiao-An Yang, Kuan-Chuan Peng, Raymond A. Yeh

ICCV 2025 • poster • arXiv:2507.16946
#2511

More Reliable Pseudo-labels, Better Performance: A Generalized Approach to Single Positive Multi-label Learning

Luong Tran, Thieu Vo, Anh Nguyen et al.

ICCV 2025 • poster • arXiv:2508.20381
#2512

TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding

Zuhao Yang, Yingchen Yu, Yunqing Zhao et al.

ICCV 2025 • poster • arXiv:2508.01699
#2513

DCHM: Depth-Consistent Human Modeling for Multiview Detection

Jiahao Ma, Tianyu Wang, Miaomiao Liu et al.

ICCV 2025 • poster • arXiv:2507.14505
#2514

Adversarial Robustness of Discriminative Self-Supervised Learning in Vision

Ömer Veysel Çağatan, Ömer Tal, M. Emre Gursoy

ICCV 2025 • poster • arXiv:2503.06361
#2515

HPSv3: Towards Wide-Spectrum Human Preference Score

Yuhang Ma, Keqiang Sun, Xiaoshi Wu et al.

ICCV 2025 • poster • arXiv:2508.03789
#2516

Active Perception Meets Rule-Guided RL: A Two-Phase Approach for Precise Object Navigation in Complex Environments

Liang Qin, Min Wang, Peiwei Li et al.

ICCV 2025 • poster
#2517

UNIS: A Unified Framework for Achieving Unbiased Neural Implicit Surfaces in Volume Rendering

Junkai Deng, Hanting Niu, Jiaze Li et al.

ICCV 2025 • poster
#2518

IntrinsicControlNet: Cross-distribution Image Generation with Real and Unreal

Jiayuan Lu, Rengan Xie, Zixuan Xie et al.

ICCV 2025 • poster
#2519

Loss Functions for Predictor-based Neural Architecture Search

Han Ji, Yuqi Feng, Jiahao Fan et al.

ICCV 2025 • poster • arXiv:2506.05869
#2520

Advancing Text-to-3D Generation with Linearized Lookahead Variational Score Distillation

Yu Lei, Bingde Liu, Qingsong Xie et al.

ICCV 2025 • poster • arXiv:2507.09748
#2521

Steering Guidance for Personalized Text-to-Image Diffusion Models

Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.

ICCV 2025 • poster • arXiv:2508.00319
#2522

ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models

Zifu Wan, Ce Zhang, Silong Yong et al.

ICCV 2025 • poster • arXiv:2507.00898
#2523

Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds

Pei He, Lingling Li, Licheng Jiao et al.

ICCV 2025 • poster • arXiv:2508.11265
#2524

GaussianReg: Rapid 2D/3D Registration for Emergency Surgery via Explicit 3D Modeling with Gaussian Primitives

Weihao Yu, Xiaoqing Guo, Xinyu Liu et al.

ICCV 2025 • poster
#2525

ArgoTweak: Towards Self-Updating HD Maps through Structured Priors

Lena Wild, Rafael Valencia, Patric Jensfelt

ICCV 2025 • poster • arXiv:2509.08764
#2526

Event-aided Dense and Continuous Point Tracking: Everywhere and Anytime

Zhexiong Wan, Jianqin Luo, Yuchao Dai et al.

ICCV 2025 • poster
#2527

Context-Aware Academic Emotion Dataset and Benchmark

Luming Zhao, Jingwen Xuan, Jiamin Lou et al.

ICCV 2025 • poster • arXiv:2507.00586
#2528

FlowSeek: Optical Flow Made Easier with Depth Foundation Models and Motion Bases

Matteo Poggi, Fabio Tosi

ICCV 2025 • poster • arXiv:2509.05297
#2529

TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging

QingleiCao QingleiCao, Ziyao Tang, Xiaoqin Tang

ICCV 2025 • highlight
#2530

SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations

Songchun Zhang, Huiyao Xu, Sitong Guo et al.

ICCV 2025 • poster • arXiv:2505.11992
#2531

Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration

Sitao Zhang, Hongda Mao, Qingshuang Chen et al.

ICCV 2025 • poster
#2532

COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets

Lingyu Chen, Yawen Zeng, Yue Wang et al.

ICCV 2025 • poster • arXiv:2508.09886
#2533

NATRA: Noise-Agnostic Framework for Trajectory Prediction with Noisy Observations

Rongqing Li, Changsheng Li, Ruilin Lv et al.

ICCV 2025 • poster
#2534

MS3D: High-Quality 3D Generation via Multi-Scale Representation Modeling

Guan Luo, Jianfeng Zhang

ICCV 2025 • poster
#2535

UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation

Zhengyin Liang, Hui Yin, Min Liang et al.

ICCV 2025 • highlight
#2536

Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics

Keming Wu, Junwen Chen, Zhanhao Liang et al.

ICCV 2025 • poster
#2537

PLAN: Proactive Low-Rank Allocation for Continual Learning

Xiequn Wang, Zhan Zhuang, Yu Zhang

ICCV 2025 • poster • arXiv:2510.21188
#2538

Leveraging Spatial Invariance to Boost Adversarial Transferability

Zihan Zhou, Li Li, Yanli Ren et al.

ICCV 2025 • poster
#2539

FedXDS: Leveraging Model Attribution Methods to counteract Data Heterogeneity in Federated Learning

Maximilian Hoefler, Karsten Mueller, Wojciech Samek

ICCV 2025 • poster
#2540

Visual Textualization for Image Prompted Object Detection

Yongjian Wu, Yang Zhou, Jiya Saiyin et al.

ICCV 2025 • poster • arXiv:2506.23785
#2541

TerraMind: Large-Scale Generative Multimodality for Earth Observation

Johannes Jakubik, Felix Yang, Benedikt Blumenstiel et al.

ICCV 2025 • poster • arXiv:2504.11171
#2542

LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs

Haoran Lou, Chunxiao Fan, Ziyan Liu et al.

ICCV 2025 • poster • arXiv:2507.00505
#2543

Transformer-based Tooth Alignment Prediction with Occlusion and Collision Constraints

DongZhenXing DongZhenXing, Jiazhou Chen

ICCV 2025 • poster • arXiv:2410.20806
#2544

SD2Actor: Continuous State Decomposition via Diffusion Embeddings for Robotic Manipulation

lijiayi jiayi

ICCV 2025 • poster
#2545

Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis

Xinyu Hou, Zongsheng Yue, Xiaoming Li et al.

ICCV 2025 • poster • arXiv:2411.17769
#2546

Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification

Guibao Shen, Luozhou Wang, Jiantao Lin et al.

ICCV 2025 • poster
#2547

ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection

Hongchi Ma, Guanglei Yang, Debin Zhao et al.

ICCV 2025 • poster
#2548

GMMamba: Group Masking Mamba for Whole Slide Image Classification

Tingting Zheng, Hongxun Yao, Kui Jiang et al.

ICCV 2025 • poster
#2549

TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction

Dadong Jiang, Zhi Hou, Zhihui Ke et al.

ICCV 2025 • poster • arXiv:2411.11941
#2550

RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection

Jianfang He, Min Cao, Silong Peng et al.

ICCV 2025 • poster
#2551

Temporal Rate Reduction Clustering for Human Motion Segmentation

Xianghan Meng, Zhengyu Tong, Zhiyuan Huang et al.

ICCV 2025 • poster • arXiv:2506.21249
#2552

Hierarchy UGP: Hierarchy Unified Gaussian Primitive for Large-Scale Dynamic Scene Reconstruction

Hongyang Sun, Qinglin Yang, Jiawei Wang et al.

ICCV 2025 • poster
#2553

Backdoor Mitigation by Distance-Driven Detoxification

Shaokui Wei, Jiayin Liu, Hongyuan Zha

ICCV 2025 • highlight • arXiv:2411.09585
#2554

Democratizing High-Fidelity Co-Speech Gesture Video Generation

Xu Yang, Shaoli Huang, Shenbo Xie et al.

ICCV 2025 • poster • arXiv:2507.06812
#2555

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI

Fangwei Zhong, Kui Wu, Churan Wang et al.

ICCV 2025 • highlight • arXiv:2412.20977
#2556

HFD-Teacher: High-Frequency Depth Distillation from Depth Foundation Models for Enhanced Depth Completion

Zhiyuan Yang, Anqi Cheng, Haiyue Zhu et al.

ICCV 2025 • poster
#2557

Separation for Better Integration: Disentangling Edge and Motion in Event-based Deblurring

Yufei Zhu, Hao Chen, Yongjian Deng et al.

ICCV 2025 • poster
#2558

Diversity-Enhanced Distribution Alignment for Dataset Distillation

Hongcheng Li, Yucan Zhou, Xiaoyan Gu et al.

ICCV 2025 • poster
#2559

Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection

Hanshi Wang, Jin Gao, Weiming Hu et al.

ICCV 2025 • highlight • arXiv:2507.04369
#2560

SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking

Sixian Chan, Zedong Li, Xiaoqin Zhang et al.

ICCV 2025 • highlight
#2561

Two Losses, One Goal: Balancing Conflict Gradients for Semi-supervised Semantic Segmentation

Rui Sun, Huayu Mai, Wangkai Li et al.

ICCV 2025 • highlight
#2562

Region-based Cluster Discrimination for Visual Representation Learning

Yin Xie, Kaicheng Yang, Xiang An et al.

ICCV 2025 • highlight • arXiv:2507.20025
#2563

CMB-ML: A Cosmic Microwave Background Dataset for the Oldest Possible Computer Vision Task

James Amato, Yunan Xie, Leonel Medina-Varela et al.

ICCV 2025 • poster
#2564

Adapt Foundational Segmentation Models with Heterogeneous Searching Space

Li Yi, Jie Hu, Songan Zhang et al.

ICCV 2025 • poster
#2565

Think Twice: Test-Time Reasoning for Robust CLIP Zero-Shot Classification

Shenyu Lu, Zhaoying Pan, Xiaoqian Wang

ICCV 2025 • poster
#2566

Shape of Motion: 4D Reconstruction from a Single Video

Qianqian Wang, Vickie Ye, Hang Gao et al.

ICCV 2025 • highlight • arXiv:2407.13764
#2567

EditCLIP: Representation Learning for Image Editing

Qian Wang, Aleksandar Cvejic, Abdelrahman Eldesokey et al.

ICCV 2025 • poster • arXiv:2503.20318
#2568

Counting Stacked Objects

Corentin Dumery, Noa Ette, Aoxiang Fan et al.

ICCV 2025 • poster • arXiv:2411.19149
#2569

Allowing Oscillation Quantization: Overcoming Solution Space Limitation in Low Bit-Width Quantization

Weiying Xie, Zihan Meng, Jitao Ma et al.

ICCV 2025 • poster
#2570

MOVE: Motion-Guided Few-Shot Video Object Segmentation

Kaining Ying, Hengrui Hu, Henghui Ding

ICCV 2025 • poster • arXiv:2507.22061
#2571

CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation

Dengke Zhang, Fagui Liu, Quan Tang

ICCV 2025 • poster • arXiv:2411.10086
#2572

mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework

Bingyi Liu, Jian Teng, Hongfei Xue et al.

ICCV 2025 • poster • arXiv:2501.12263
#2573

FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers

Junjie Zhang, Haisheng Su, Feixiang Song et al.

ICCV 2025 • poster • arXiv:2510.15385
#2574

GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling

Pinxin Liu, Luchuan Song, Junhua Huang et al.

ICCV 2025 • poster • arXiv:2501.18898
#2575

SDFormer: Vision-based 3D Semantic Scene Completion via SAM-assisted Dual-channel Voxel Transformer

Yujie Xue, Huilong Pi, Jiapeng Zhang et al.

ICCV 2025 • poster
#2576

TopoTTA: Topology-Enhanced Test-Time Adaptation for Tubular Structure Segmentation

Jiale Zhou, Wenhan Wang, Shikun Li et al.

ICCV 2025 • poster • arXiv:2508.00442
#2577

RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control

Teng Li, Guangcong Zheng, Rui Jiang et al.

ICCV 2025 • poster • arXiv:2502.10059
#2578

MagShield: Towards Better Robustness in Sparse Inertial Motion Capture Under Magnetic Disturbances

Yunzhe Shao, Xinyu Yi, Lu Yin et al.

ICCV 2025 • poster • arXiv:2506.22907
#2579

DeFSS: Image-to-Mask Denoising Learning for Few-shot Segmentation

Zishu Qin, Junhao Xu, Weifeng Ge

ICCV 2025 • poster
#2580

TAD-E2E: A Large-scale End-to-end Autonomous Driving Dataset

Chang Liu, mingxuzhu mingxuzhu, Zheyuan Zhang et al.

ICCV 2025 • poster
#2581

VAGUE: Visual Contexts Clarify Ambiguous Expressions

Heejeong Nam, Jinwoo Ahn, Keummin Ka et al.

ICCV 2025 • poster • arXiv:2411.14137
#2582

Photolithography Overlay Map Generation with Implicit Knowledge Distillation Diffusion Transformer

YuanFu Yang, Hsiu-Hui Hsiao

ICCV 2025 • poster
#2583

What's Making That Sound Right Now? Video-centric Audio-Visual Localization

Hahyeon Choi, Junhoo Lee, Nojun Kwak

ICCV 2025 • poster • arXiv:2507.04667
#2584

VehicleMAE: View-asymmetry Mutual Learning for Vehicle Re-identification Pre-training via Masked AutoEncoders

Qi Wang, Zeyu Zhang, Dong Wang et al.

ICCV 2025 • poster
#2585

MagicCity: Geometry-Aware 3D City Generation from Satellite Imagery with Multi-View Consistency

Xingbo Yao, Xuanmin Wang, Hao Wu et al.

ICCV 2025 • poster
#2586

RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning

Chengyu Zheng, Honghua Chen, Jin Huang et al.

ICCV 2025 • poster • arXiv:2507.19950
#2587

Multi-scenario Overlapping Text Segmentation with Depth Awareness

Yang Liu, Xudong Xie, Yuliang Liu et al.

ICCV 2025 • poster
#2588

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Kaiyu Yue, Vasu Singla, Menglin Jia et al.

ICCV 2025 • poster • arXiv:2505.22664
#2589

OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection

Adrian Chow, Evelien Riddell, Yimu Wang et al.

ICCV 2025 • poster • arXiv:2503.06435
#2590

FullDiT: Video Generative Foundation Models with Multimodal Control via Full Attention

Xuan Ju, Weicai Ye, Quande Liu et al.

ICCV 2025 • poster
#2591

SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection

Chaesong Park, Eunbin Seo, JihyeonHwang JihyeonHwang et al.

ICCV 2025 • poster • arXiv:2508.10411
#2592

Exploring the Visual Feature Space for Multimodal Neural Decoding

Weihao Xia, Cengiz Oztireli

ICCV 2025 • poster • arXiv:2505.15755
#2593

ConceptSplit: Decoupled Multi-Concept Personalization of Diffusion Models via Token-wise Adaptation and Attention Disentanglement

Habin Lim, Youngseob Won, Juwon Seo et al.

ICCV 2025 • poster • arXiv:2510.04668
#2594

Backdoor Defense via Enhanced Splitting and Trap Isolation

Hongrui Yu, Lu Qi, Wanyu Lin et al.

ICCV 2025 • poster
#2595

Learning Hierarchical Line Buffer for Image Processing

Jiacheng Li, Feiran Li, Daisuke Iso

ICCV 2025 • poster
#2596

ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction

Soonwoo Cha, Jiwoo Song, Juan Yeo et al.

ICCV 2025 • poster • arXiv:2506.08678
#2597

Humans as Checkerboards: Calibrating Camera Motion Scale for World-Coordinate Human Mesh Recovery

Fengyuan Yang, Kerui Gu, Ha Linh Nguyen et al.

ICCV 2025 • poster • arXiv:2407.00574
#2598

D3: Training-Free AI-Generated Video Detection Using Second-Order Features

Chende Zheng, Ruiqi Suo, Chenhao Lin et al.

ICCV 2025 • poster • arXiv:2508.00701
#2599

Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering

Feifei Zhang, Zhihao Wang, Xi Zhang et al.

ICCV 2025 • poster
#2600

GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion

Gwanghyun Kim, Xueting Li, Ye Yuan et al.

ICCV 2025 • poster • arXiv:2505.23085