Most Cited ICCV "text-to-image alignment" Papers

2,701 papers found • Page 13 of 14

#2401

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Xiaokun Feng, Shiyu Hu, Xuchen Li et al.

ICCV 2025highlightarXiv:2507.19875
#2402

Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling

Zenghao Niu, Weicheng Xie, Siyang Song et al.

ICCV 2025posterarXiv:2511.00411
#2403

CWNet: Causal Wavelet Network for Low-Light Image Enhancement

Tongshun Zhang, Pingping Liu, Yubing Lu et al.

ICCV 2025posterarXiv:2507.10689
#2404

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Huaiyuan Qin et al.

ICCV 2025highlightarXiv:2507.12857
#2405

Federated Representation Angle Learning

Liping Yi, Han Yu, Gang Wang et al.

ICCV 2025poster
#2406

GeoDistill: Geometry-Guided Self-Distillation for Weakly Supervised Cross-View Localization

Shaowen Tong, Zimin Xia, Alexandre Alahi et al.

ICCV 2025posterarXiv:2507.10935
#2407

Diffusion-based Source-biased Model for Single Domain Generalized Object Detection

Han Jiang, Wenfei Yang, Tianzhu Zhang et al.

ICCV 2025poster
#2408

Measuring the Impact of Rotation Equivariance on Aerial Object Detection

Xiuyu Wu, Xinhao Wang, Xiubin Zhu et al.

ICCV 2025posterarXiv:2507.09896
#2409

Flow Stochastic Segmentation Networks

Fabio De Sousa Ribeiro, Omar Todd, Charles Jones et al.

ICCV 2025posterarXiv:2507.18838
#2410

From Gaze to Movement: Predicting Visual Attention for Autonomous Driving Human-Machine Interaction based on Programmatic Imitation Learning

Yexin Huang, Yongbin Lin, Lishengsa Yue et al.

ICCV 2025poster
#2411

ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Predictions

Dubing Chen, Jin Fang, Wencheng Han et al.

ICCV 2025posterarXiv:2411.07725
#2412

From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning

Sen Wang, Shao Zeng, Tianjun Gu et al.

ICCV 2025posterarXiv:2507.08380
#2413

G2D: Boosting Multimodal Learning with Gradient-Guided Distillation

Mohammed Rakib, Arunkumar Bagavathi

ICCV 2025poster
#2414

Unified Video Generation via Next-Set Prediction in Continuous Domain

Zhanzhou Feng, Qingpei Guo, Xinyu Xiao et al.

ICCV 2025poster
#2415

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Feihong Yan, qingyan wei, Jiayi Tang et al.

ICCV 2025posterarXiv:2503.12450
#2416

Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models

Hongyang Wei, Shuaizheng Liu, Chun Yuan et al.

ICCV 2025posterarXiv:2503.11073
#2417

Feature Decomposition-Recomposition in Large Vision-Language Model for Few-Shot Class-Incremental Learning

Zongyao Xue, Meina Kan, Shiguang Shan et al.

ICCV 2025poster
#2418

Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini et al.

ICCV 2025posterarXiv:2512.04862
#2419

Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning

yan wang, Da-Wei Zhou, Han-Jia Ye

ICCV 2025posterarXiv:2508.08165
#2420

Can Knowledge be Transferred from Unimodal to Multimodal? Investigating the Transitivity of Multimodal Knowledge Editing

Lingyong Fang, Xinzhong Wang, Depeng depeng wang et al.

ICCV 2025poster
#2421

ConsNoTrainLoRA: Data-driven Weight Initialization of Low-rank Adapters using Constraints

Debasmit Das, Hyoungwoo Park, Munawar Hayat et al.

ICCV 2025posterarXiv:2507.08044
#2422

UDC-VIT: A Real-World Video Dataset for Under-Display Cameras

Kyusu Ahn, JiSoo Kim, Sangik Lee et al.

ICCV 2025highlightarXiv:2501.18545
#2423

Is Visual in-Context Learning for Compositional Medical Tasks within Reach?

Simon Reiß, Zdravko Marinov, Alexander Jaus et al.

ICCV 2025posterarXiv:2507.00868
#2424

Optimal Transport for Brain-Image Alignment: Unveiling Redundancy and Synergy in Neural Information Processing

Yang Xiao, Wang Lu, Jie Ji et al.

ICCV 2025posterarXiv:2503.10663
#2425

Chimera: Improving Generalist Model with Domain-Specific Experts

Tianshuo Peng, Mingsheng Li, Jiakang Yuan et al.

ICCV 2025posterarXiv:2412.05983
#2426

Enhanced Event-based Dense Stereo via Cross-Sensor Knowledge Distillation

Haihao Zhang, Yunjian Zhang, Jianing Li et al.

ICCV 2025poster
#2427

Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information

Zhaoxin Yuan, Shuang Yang, Shiguang Shan et al.

ICCV 2025poster
#2428

ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation

Jimyeong Kim, Jungwon Park, Yeji Song et al.

ICCV 2025highlightarXiv:2507.01496
#2429

Imbalance in Balance: Online Concept Balancing in Generation Models

Yukai Shi, Jiarong Ou, Rui Chen et al.

ICCV 2025posterarXiv:2507.13345
#2430

RALoc: Enhancing Outdoor LiDAR Localization via Rotation Awareness

Yuyang Yang, Wen Li, Sheng Ao et al.

ICCV 2025highlight
#2431

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Siyuan Yan, Ming Hu, Yiwen Jiang et al.

ICCV 2025highlightarXiv:2503.14911
#2432

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization

Hengjia Li, Lifan Jiang, Xi Xiao et al.

ICCV 2025posterarXiv:2503.12689
#2433

Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests

Fitim Abdullahu, Helmut Grabner

ICCV 2025posterarXiv:2510.13316
#2434

D-Attn: Decomposed Attention for Large Vision-and-Language Model

Chia-Wen Kuo, Sijie Zhu, Fan Chen et al.

ICCV 2025posterarXiv:2502.01906
#2435

Understanding Personal Concept in Open-Vocabulary Semantic Segmentation

Sunghyun Park, Jungsoo Lee, Shubhankar Borse et al.

ICCV 2025posterarXiv:2507.11030
#2436

CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving

Rui Song, Chenwei Liang, Yan Xia et al.

ICCV 2025posterarXiv:2503.06744
#2437

UnZipLoRA: Separating Content and Style from a Single Image

Chang Liu, Viraj Shah, Aiyu Cui et al.

ICCV 2025highlightarXiv:2412.04465
#2438

SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures

Yi Qin, Rui Wang, Tao Huang et al.

ICCV 2025posterarXiv:2508.06127
#2439

Semi-supervised Concept Bottleneck Models

Lijie Hu, Tianhao Huang, Huanyi Xie et al.

ICCV 2025posterarXiv:2406.18992
#2440

WINS: Winograd Structured Pruning for Fast Winograd Convolution

Cheonjun Park, Hyunjae Oh, Mincheol Park et al.

ICCV 2025highlight
#2441

Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation

Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.

ICCV 2025posterarXiv:2504.12436
#2442

ART: Adaptive Relation Tuning for Generalized Relation Prediction

Gopika Sudhakaran, Hikaru Shindo, Patrick Schramowski et al.

ICCV 2025posterarXiv:2507.23543
#2443

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.

ICCV 2025posterarXiv:2507.06230
#2444

No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

Ranran Huang, Krystian Mikolajczyk

ICCV 2025highlightarXiv:2508.01171
#2445

Cooperative Pseudo Labeling for Unsupervised Federated Classification

Kuangpu Guo, Lijun Sheng, Yongcan Yu et al.

ICCV 2025posterarXiv:2510.10100
#2446

MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection

Donghyeon Kwon, Youngseok Yoon, Hyeongseok Son et al.

ICCV 2025poster
#2447

From Sharp to Blur: Unsupervised Domain Adaptation for 2D Human Pose Estimation Under Extreme Motion Blur Using Event Cameras

Youngho Kim, Hoonhee Cho, Kuk-Jin Yoon

ICCV 2025posterarXiv:2507.22438
#2448

Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning

Zhengxuan Wei, Jiajin Tang, Sibei Yang

ICCV 2025posterarXiv:2510.19622
#2449

PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening

Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk et al.

ICCV 2025posterarXiv:2505.23367
#2450

Differentially Private Fine-Tuning of Diffusion Models

Yu-Lin Tsai, Yizhe Li, Zekai Chen et al.

ICCV 2025posterarXiv:2406.01355
#2451

IRGPT: Understanding Real-world Infrared Image with Bi-cross-modal Curriculum on Large-scale Benchmark

Zhe Cao, Jin Zhang, Ruiheng Zhang

ICCV 2025posterarXiv:2507.14449
#2452

One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models

Jiale Zhao, XINYANG JIANG, Junyao Gao et al.

ICCV 2025posterarXiv:2507.07709
#2453

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization

Xu Zheng, Yuanhuiyi Lyu, Lutao Jiang et al.

ICCV 2025posterarXiv:2505.06635
#2454

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

Bing Fan, Yunhe Feng, Yapeng Tian et al.

ICCV 2025posterarXiv:2502.07707
#2455

Language-Driven Multi-Label Zero-Shot Learning with Semantic Granularity

Shouwen Wang, Qian Wan, Junbin Gao et al.

ICCV 2025poster
#2456

IM360: Large-scale Indoor Mapping with 360 Cameras

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

ICCV 2025posterarXiv:2502.12545
#2457

PersonaCraft: Personalized and Controllable Full-Body Multi-Human Scene Generation Using Occlusion-Aware 3D-Conditioned Diffusion

Gwanghyun Kim, Suh Jeon Jeon, Seunggyu Lee et al.

ICCV 2025posterarXiv:2411.18068
#2458

MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval

Jaeseok Byun, Young Kyun Jang, Seokhyeon Jeong et al.

ICCV 2025poster
#2459

Adaptive Learning of High-Value Regions for Semi-Supervised Medical Image Segmentation

Tao Lei, Ziyao Yang, Xingwu wang et al.

ICCV 2025poster
#2460

Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning

Xinyao Liu, Diping Song

ICCV 2025posterarXiv:2507.17539
#2461

Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines

Jiayuan Chen, Thai-Hoang Pham, Yuanlong Wang et al.

ICCV 2025highlightarXiv:2507.10737
#2462

Spectral Sensitivity Estimation with an Uncalibrated Diffraction Grating

Lilika Makabe, Hiroaki Santo, Fumio Okura et al.

ICCV 2025posterarXiv:2508.00330
#2463

TransiT: Transient Transformer for Non-line-of-sight Videography

Ruiqian Li, Siyuan Shen, Suan Xia et al.

ICCV 2025posterarXiv:2503.11328
#2464

On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations

Amir Mehrpanah, Matteo Gamba, Kevin Smith et al.

ICCV 2025posterarXiv:2508.10490
#2465

FedDifRC: Unlocking the Potential of Text-to-Image Diffusion Models in Heterogeneous Federated Learning

Huan Wang, Haoran Li, Huaming Chen et al.

ICCV 2025posterarXiv:2507.06482
#2466

Category-Specific Selective Feature Enhancement for Long-Tailed Multi-Label Image Classification

Ruiqi Du, Xu Tang, Xiangrong Zhang et al.

ICCV 2025poster
#2467

Registration beyond Points: General Affine Subspace Alignment via Geodesic Distance on Grassmann Manifold

Jaeho Shin, Hyeonjae Gil, Junwoo Jang et al.

ICCV 2025highlightarXiv:2507.17998
#2468

An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval

Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim et al.

ICCV 2025posterarXiv:2406.09188
#2469

Find a Scapegoat: Poisoning Membership Inference Attack and Defense to Federated Learning

Wenjin Mo, Zhiyuan Li, Minghong Fang et al.

ICCV 2025posterarXiv:2507.00423
#2470

To Label or Not to Label: PALM – A Predictive Model for Evaluating Sample Efficiency in Active Learning Models

Julia Machnio, Mads Nielsen, Mostafa Mehdipour Ghazi

ICCV 2025posterarXiv:2507.15381
#2471

Personalized Federated Learning under Local Supervision

Qiqi Liu, Jiaqiang Li, Yuchen Liu et al.

ICCV 2025poster
#2472

Radiant Foam: Real-Time Differentiable Ray Tracing

Shrisudhan Govindarajan, Daniel Rebain, Kwang Moo Yi et al.

ICCV 2025highlightarXiv:2502.01157
#2473

COSTARR: Consolidated Open Set Technique with Attenuation for Robust Recognition

Ryan Rabinowitz, Steve Cruz, Walter Scheirer et al.

ICCV 2025posterarXiv:2508.01087
#2474

Information Density Principle for MLLM Benchmarks

Chunyi Li, Xiaozhe Li, Zicheng Zhang et al.

ICCV 2025posterarXiv:2503.10079
#2475

Perspective-Aware Teaching: Adapting Knowledge for Heterogeneous Distillation

Jhe-Hao Lin, Yi Yao, Chan-Feng Hsu et al.

ICCV 2025posterarXiv:2501.08885
#2476

Is Meta-Learning Out? Rethinking Unsupervised Few-Shot Classification with Limited Entropy

Yunchuan Guan, Yu Liu, Ke Zhou et al.

ICCV 2025posterarXiv:2509.13185
#2477

Long-Tailed Classification with Multi-Granularity Semantics

Yuting Liu, Liu Yang, Yu Wang

ICCV 2025poster
#2478

ToolVQA: A Dataset for Multi-step Reasoning VQA with External Tools

Shaofeng Yin, Ting Lei, Yang Liu

ICCV 2025posterarXiv:2508.03284
#2479

FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection

Brian Isaac-Medina, Mauricio Che, Yona Falinie A. Gaus et al.

ICCV 2025posterarXiv:2412.01596
#2480

Adversarial Purification via Super-Resolution and Diffusion

Mincheol Park, Cheonjun Park, Seungseop Lim et al.

ICCV 2025poster
#2481

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Xianfu Cheng, Wei Zhang, Shiwei Zhang et al.

ICCV 2025posterarXiv:2502.13059
#2482

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Jiaxin Ai, Pengfei Zhou, xu Pan et al.

ICCV 2025posterarXiv:2503.06553
#2483

Failure Cases Are Better Learned But Boundary Says Sorry: Facilitating Smooth Perception Change for Accuracy-Robustness Trade-Off in Adversarial Training

Yanyun Wang, Li Liu

ICCV 2025posterarXiv:2508.02186
#2484

Secure On-Device Video OOD Detection Without Backpropagation

Li Li, Peilin Cai, Yuxiao Zhou et al.

ICCV 2025posterarXiv:2503.06166
#2485

Learning Counterfactually Decoupled Attention for Open-World Model Attribution

Yu Zheng, Boyang Gong, Fanye Kong et al.

ICCV 2025posterarXiv:2506.23074
#2486

Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning

Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.

ICCV 2025posterarXiv:2507.21494
#2487

Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation

Zixin Wang, Dong Gong, Sen Wang et al.

ICCV 2025posterarXiv:2410.14729
#2488

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

Qifan Yu, Zhebei Shen, Zhongqi Yue et al.

ICCV 2025highlightarXiv:2412.06293
#2489

Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations

Chongjie Si, Zhiyi Shi, Xuehui Wang et al.

ICCV 2025posterarXiv:2504.00851
#2490

Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration

Dongyue Wu, Zilin Guo, Jialong Zuo et al.

ICCV 2025posterarXiv:2506.23674
#2491

CIARD: Cyclic Iterative Adversarial Robustness Distillation

Liming Lu, Shuchao Pang, Xu Zheng et al.

ICCV 2025posterarXiv:2509.12633
#2492

InfoBridge: Balanced Multimodal Integration through Conditional Dependency Modeling

Chenxin Li, Yifan Liu, Panwang Pan et al.

ICCV 2025poster
#2493

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning

Zhengzhuo Xu, Sinan Du, Yiyan Qi et al.

ICCV 2025posterarXiv:2512.00305
#2494

DiffRefine: Diffusion-based Proposal Specific Point Cloud Densification for Cross-Domain Object Detection

Sangyun Shin, Yuhang He, Xinyu Hou et al.

ICCV 2025highlight
#2495

Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features

Shangbo Wu, Yu-an Tan, Ruinan Ma et al.

ICCV 2025posterarXiv:2506.21046
#2496

Divide-and-Conquer for Enhancing Unlabeled Learning, Stability, and Plasticity in Semi-supervised Continual Learning

Yue Duan, Taicai Chen, Lei Qi et al.

ICCV 2025posterarXiv:2508.05316
#2497

Confound from All Sides, Distill with Resilience: Multi-Objective Adversarial Paths to Zero-Shot Robustness

Junhao Dong, Jiao Liu, Xinghua Qu et al.

ICCV 2025highlight
#2498

Mitigating Object Hallucinations via Sentence-Level Early Intervention

Shangpin Peng, Senqiao Yang, Li Jiang et al.

ICCV 2025posterarXiv:2507.12455
#2499

Open-Unfairness Adversarial Mitigation for Generalized Deepfake Detection

Zhaoyang Li, Zhu Teng, Baopeng Zhang et al.

ICCV 2025poster
#2500

Spatial Preference Rewarding for MLLMs Spatial Understanding

Han Qiu, Peng Gao, Lewei Lu et al.

ICCV 2025posterarXiv:2510.14374
#2501

Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue

Guohao Sun, Can Qin, Yihao Feng et al.

ICCV 2025poster
#2502

A Framework for Double-Blind Federated Adaptation of Foundation Models

Nurbek Tastan, Karthik Nandakumar

ICCV 2025posterarXiv:2502.01289
#2503

MMOne: Representing Multiple Modalities in One Scene

Zhifeng Gu, Bing WANG

ICCV 2025posterarXiv:2507.11129
#2504

VisionMath: Vision-Form Mathematical Problem-Solving

Zongyang Ma, Yuxin Chen, Ziqi Zhang et al.

ICCV 2025poster
#2505

Quanta Neural Networks: From Photons to Perception

Varun Sundar, Tianyi Zhang, Sacha Jungerman et al.

ICCV 2025poster
#2506

OpenSubstance: A High-quality Measured Dataset of Multi-View and -Lighting Images and Shapes

Fan Pei, jinchen bai, Xiang Feng et al.

ICCV 2025poster
#2507

VGMamba: Attribute-to-Location Clue Reasoning for Quantity-Agnostic 3D Visual Grounding

Zhu Yihang, Jinhao Zhang, Yuxuan Wang et al.

ICCV 2025poster
#2508

RMultiplex200K: Toward Reliable Multimodal Process Supervision for Visual Language Models on Telecommunications

Sijia Chen, Bin Song

ICCV 2025poster
#2509

EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients

meihan wu, Tao Chang, Cui Miao et al.

ICCV 2025posterarXiv:2412.00334
#2510

Target Bias Is All You Need: Zero-Shot Debiasing of Vision-Language Models with Bias Corpus

Taeuk Jang, Hoin Jung, Xiaoqian Wang

ICCV 2025poster
#2511

Multi-Cache Enhanced Prototype Learning for Test-Time Generalization of Vision-Language Models

Xinyu Chen, Haotian Zhai, Can Zhang et al.

ICCV 2025posterarXiv:2508.01225
#2512

Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization

Kesen Zhao, Beier Zhu, Qianru Sun et al.

ICCV 2025posterarXiv:2504.18397
#2513

TRNAS: A Training-Free Robust Neural Architecture Search

Yeming Yang, Qingling Zhu, Jianping Luo et al.

ICCV 2025poster
#2514

The Inter-Intra Modal Measure: A Predictive Lens on Fine-Tuning Outcomes in Vision-Language Models

Laura Niss, Kevin Vogt-Lowell, Theodoros Tsiligkaridis

ICCV 2025posterarXiv:2407.15731
#2515

What to Distill? Fast Knowledge Distillation with Adaptive Sampling

Byungchul Chae, Seonyeong Heo

ICCV 2025highlight
#2516

Generative Modeling of Shape-Dependent Self-Contact Human Poses

Takehiko Ohkawa, Jihyun Lee, Shunsuke Saito et al.

ICCV 2025posterarXiv:2509.23393
#2517

Met2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

Shaohan Li, Hao Yang, Min Chen et al.

ICCV 2025poster
#2518

Beyond RGB: Adaptive Parallel Processing for RAW Object Detection

Shani Gamrian, Hila Barel, Feiran Li et al.

ICCV 2025posterarXiv:2503.13163
#2519

PoseSyn: Synthesizing Diverse 3D Pose Data from In-the-Wild 2D Data

CHANGHEE YANG, Hyeonseop Song, Seokhun Choi et al.

ICCV 2025posterarXiv:2503.13025
#2520

TorchAdapt: Towards Light-Agnostic Real-Time Visual Perception

Khurram Azeem Hashmi, Karthik Suresh, Didier Stricker et al.

ICCV 2025poster
#2521

Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling

Christopher Xie, Armen Avetisyan, Henry Howard-Jenkins et al.

ICCV 2025highlightarXiv:2503.11806
#2522

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.

ICCV 2025posterarXiv:2507.22825
#2523

Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design

Yuhao Sun, Yihua Zhang, Gaowen Liu et al.

ICCV 2025posterarXiv:2508.10065
#2524

Real3D: Towards Scaling Large Reconstruction Models with Real Images

Hanwen Jiang, Qixing Huang, Georgios Pavlakos

ICCV 2025poster
#2525

Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels

Olaf Dünkel, Thomas Wimmer, Christian Theobalt et al.

ICCV 2025posterarXiv:2506.05312
#2526

CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy

Dongyoung Kim, Mahmoud Afifi, Dongyun Kim et al.

ICCV 2025posterarXiv:2504.07959
#2527

Zero-shot Inexact CAD Model Alignment from a Single Image

Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.

ICCV 2025posterarXiv:2507.03292
#2528

Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer

Hai Wu, Hongwei Lin, Xusheng Guo et al.

ICCV 2025poster
#2529

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation

Pingrui Zhang, Xianqiang Gao, Yuhan Wu et al.

ICCV 2025posterarXiv:2503.11081
#2530

OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection

Heng Su, Mengying Xie, Nieqing Cao et al.

ICCV 2025poster
#2531

X-Capture: An Open-Source Portable Device for Multi-Sensory Learning

Samuel Clarke, Suzannah Wistreich, Yanjie Ze et al.

ICCV 2025posterarXiv:2504.02318
#2532

GloPER: Unsupervised Animal Pattern Extraction from Local Reconstruction

Bowen Chen, Yun Sing Koh, Gillian Dobbie

ICCV 2025poster
#2533

Focal Plane Visual Feature Generation and Matching on a Pixel Processor Array

Hongyi Zhang, Laurie Bose, Jianing Chen et al.

ICCV 2025poster
#2534

Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation

Hongyu Wen, Yiming Zuo, Venkat Subramanian et al.

ICCV 2025posterarXiv:2503.11633
#2535

AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning

Dejie Yang, Zijing Zhao, Yang Liu

ICCV 2025posterarXiv:2508.07626
#2536

Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection

Jae Young Kang, Hoonhee Cho, Kuk-Jin Yoon

ICCV 2025posterarXiv:2508.02288
#2537

PlaneRAS: Learning Planar Primitives for 3D Plane Recovery

Fang Zhang, Wenzhao Zheng, Linqing Zhao et al.

ICCV 2025poster
#2538

3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark

Wufei Ma, Haoyu Chen, Guofeng Zhang et al.

ICCV 2025posterarXiv:2412.07825
#2539

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Xuying Zhang, Yutong Liu, Yangguang Li et al.

ICCV 2025posterarXiv:2412.16919
#2540

Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs

Xuange Zhang, Dengjie Li, Bo Liu et al.

ICCV 2025poster
#2541

HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation

Yulin Wang, Mengting Hu, Hongli Li et al.

ICCV 2025highlightarXiv:2510.10177
#2542

MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs

Erik Daxberger, Nina Wenzel, David Griffiths et al.

ICCV 2025posterarXiv:2503.13111
#2543

Understanding Flatness in Generative Models: Its Role and Benefits

Taehwan Lee, Kyeongkook Seo, Jaejun Yoo et al.

ICCV 2025posterarXiv:2503.11078
#2544

Image-Guided Shape-from-Template Using Mesh Inextensibility Constraints

Dinh-Vinh-Thuy Tran, Ruochen Chen, Shaifali Parashar

ICCV 2025posterarXiv:2507.22699
#2545

PHD: Personalized 3D Human Body Fitting with Point Diffusion

Hsuan-I Ho, Chen Guo, Po-Chen Wu et al.

ICCV 2025posterarXiv:2508.21257
#2546

ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion

AO LI, Jinpeng Liu, Yixuan Zhu et al.

ICCV 2025posterarXiv:2509.07920
#2547

MonoSOWA: Scalable monocular 3D Object detector Without human Annotations

Jan Skvrna, Lukas Neumann

ICCV 2025posterarXiv:2501.09481
#2548

Estimating 2D Camera Motion with Hybrid Motion Basis

Haipeng Li, Tianhao Zhou, Zhanglei Yang et al.

ICCV 2025posterarXiv:2507.22480
#2549

TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras

Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski

ICCV 2025posterarXiv:2508.00913
#2550

Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision

Xiao Fang, Minhyek Jeon, Zheyang Qin et al.

ICCV 2025posterarXiv:2507.20976
#2551

Revisiting Image Fusion for Multi-Illuminant White-Balance Correction

David Serrano, Aditya Arora, Luis Herranz et al.

ICCV 2025posterarXiv:2503.14774
#2552

Uncertainty-Aware Gradient Stabilization for Small Object Detection

Huixin Sun, Yanjing Li, Linlin Yang et al.

ICCV 2025posterarXiv:2303.01803
#2553

CryoFastAR: Fast Cryo-EM Ab initio Reconstruction Made Easy

Jiakai Zhang, Shouchen Zhou, Haizhao Dai et al.

ICCV 2025posterarXiv:2506.05864
#2554

Event-guided Unified Framework for Low-light Video Enhancement, Frame Interpolation, and Deblurring

Taewoo Kim, Kuk-Jin Yoon

ICCV 2025poster
#2555

Spatial Alignment and Temporal Matching Adapter for Video-Radar Remote Physiological Measurement

Qian Liang, Ruixu Geng, Jinbo Chen et al.

ICCV 2025poster
#2556

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Yusuke Hirota, Ryo Hachiuma, Boyi Li et al.

ICCV 2025posterarXiv:2509.07596
#2557

SEHDR: Single-Exposure HDR Novel View Synthesis via 3D Gaussian Bracketing

Yiyu Li, Haoyuan Wang, Ke Xu et al.

ICCV 2025posterarXiv:2509.20400
#2558

MaGS: Reconstructing and Simulating Dynamic 3D Objects with Mesh-adsorbed Gaussian Splatting

Shaojie Ma, Yawei Luo, Wei Yang et al.

ICCV 2025highlightarXiv:2406.01593
#2559

CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector

Abhinav Kumar, Yuliang Guo, Zhihao Zhang et al.

ICCV 2025posterarXiv:2508.11185
#2560

Learning on the Go: A Meta-learning Object Navigation Model

Xiaorong Qin, Xinhang Song, Sixian Zhang et al.

ICCV 2025poster
#2561

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

Zizhang Li, Hong-Xing Yu, Wei Liu et al.

ICCV 2025highlightarXiv:2505.18151
#2562

Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

Kaixuan Jiang, Yang Liu, Weixing Chen et al.

ICCV 2025posterarXiv:2503.11117
#2563

Not all Views are Created Equal: Analyzing Viewpoint Instabilities in Vision Foundation Models

Mateusz Michalkiewicz, Xinyue Bai, Mahsa Baktashmotlagh et al.

ICCV 2025posterarXiv:2412.19920
#2564

CHROME: Clothed Human Reconstruction with Occlusion-Resilience and Multiview-Consistency from a Single Image

Arindam Dutta, Meng Zheng, Zhongpai Gao et al.

ICCV 2025highlightarXiv:2503.15671
#2565

ReCoT: Reflective Self-Correction Training for Mitigating Confirmation Bias in Large Vision-Language Models

Mengxue Qu, Yibo Hu, Kunyang Han et al.

ICCV 2025poster
#2566

OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration

Yiming Zuo, Willow Yang, Zeyu Ma et al.

ICCV 2025posterarXiv:2411.19278
#2567

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

Yihan Cao, Jiazhao Zhang, Zhinan Yu et al.

ICCV 2025posterarXiv:2412.10439
#2568

Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification

Wajahat Khalid, Bin Liu, Xulin Li et al.

ICCV 2025poster
#2569

WalkVLM: Aid Visually Impaired People Walking by Vision Language Model

Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.

ICCV 2025poster
#2570

VIGFace: Virtual Identity Generation for Privacy-Free Face Recognition Dataset

Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam et al.

ICCV 2025poster
#2571

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Lixing Xiao, Shunlin Lu, Huaijin Pi et al.

ICCV 2025posterarXiv:2503.15451
#2572

Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection

Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong et al.

ICCV 2025highlightarXiv:2508.06318
#2573

What If: Understanding Motion Through Sparse Interactions

Stefan A. Baumann, Nick Stracke, Timy Phan et al.

ICCV 2025poster
#2574

Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition

Zefeng Qian, Xincheng Yao, Yifei Huang et al.

ICCV 2025posterarXiv:2507.16287
#2575

MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

Liyuan Deng, Yunpeng Bai, Yongkang Dai et al.

ICCV 2025posterarXiv:2511.17647
#2576

Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.

ICCV 2025posterarXiv:2508.14187
#2577

EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models

Yufei Cai, Hu Han, Yuxiang Wei et al.

ICCV 2025posterarXiv:2503.19369
#2578

Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening

Hebaixu Wang, Jiayi Ma

ICCV 2025poster
#2579

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion

Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.

ICCV 2025poster
#2580

Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection

Dat NGUYEN, Marcella Astrid, Anis Kacem et al.

ICCV 2025posterarXiv:2501.01184
#2581

Multi-modal Identity Extraction

Ryan Webster, Teddy Furon

ICCV 2025poster
#2582

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

Peng Chen, Pi Bu, Yingyao Wang et al.

ICCV 2025posterarXiv:2503.09527
#2583

Blind Noisy Image Deblurring Using Residual Guidance Strategy

Heyan Liu, Jianing Sun, Jun Liu et al.

ICCV 2025poster
#2584

Drawing Developmental Trajectory from Cortical Surface Reconstruction

WENXUAN WU, ruowen qu, Zhongliang Liu et al.

ICCV 2025poster
#2585

Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.

ICCV 2025posterarXiv:2503.13859
#2586

DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads

Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.

ICCV 2025poster
#2587

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

ICCV 2025posterarXiv:2506.23263
#2588

Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

ICCV 2025highlightarXiv:2506.22802
#2589

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuanchen Zixuanchen et al.

ICCV 2025posterarXiv:2503.19457
#2590

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.

ICCV 2025highlight
#2591

Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene

Donggeun Lim, Jinseok Bae, Inwoo Hwang et al.

ICCV 2025posterarXiv:2507.19232
#2592

Fast Image Super-Resolution via Consistency Rectified Flow

Jiaqi Xu, Wenbo Li, Haoze Sun et al.

ICCV 2025poster
#2593

Event-guided HDR Reconstruction with Diffusion Priors

Yixin Yang, jiawei zhang, Yang Zhang et al.

ICCV 2025poster
#2594

AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance

Yilin Wei, Mu Lin, Yuhao Lin et al.

ICCV 2025posterarXiv:2503.07360
#2595

Robust Adverse Weather Removal via Spectral-based Spatial Grouping

Yuhwan Jeong, Yunseo Yang, Youngho Yoon et al.

ICCV 2025posterarXiv:2507.22498
#2596

Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image

Shuang Xu, Zixiang Zhao, Haowen Bai et al.

ICCV 2025posterarXiv:2412.04201
#2597

VideoSetDiff: Identifying and Reasoning Similarities and Differences in Similar Videos

YUE QIU, Yanjun Sun, Takuma Yagi et al.

ICCV 2025poster
#2598

HADES: Human Avatar with Dynamic Explicit Hair Strands

Zhanfeng Liao, Hanzhang Tu, Cheng Peng et al.

ICCV 2025poster
#2599

DreamRelation: Relation-Centric Video Customization

Yujie Wei, Shiwei Zhang, Hangjie Yuan et al.

ICCV 2025posterarXiv:2503.07602
#2600

FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Hao Li, Xiang Chen, Jiangxin Dong et al.

ICCV 2025posterarXiv:2412.01427