Most Cited ICCV "zero-shot foundation models" Papers

2,701 papers found • Page 8 of 14

#1401

Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion

shengyuan zhang, An Zhao, Ling Yang et al.

ICCV 2025posterarXiv:2412.03515
#1402

SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM

Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger

ICCV 2025highlightarXiv:2504.00139
#1403

RogSplat: Robust Gaussian Splatting via Generative Priors

Hanyang Kong, Xingyi Yang, Xinchao Wang

ICCV 2025poster
#1404

FedAGC: Federated Continual Learning with Asymmetric Gradient Correction

Chengchao Zhang, Fanhua Shang, Hongying Liu et al.

ICCV 2025poster
#1405

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Xiaokun Feng, Shiyu Hu, Xuchen Li et al.

ICCV 2025highlightarXiv:2507.19875
#1406

Joint Learning of Pose Regression and Denoising Diffusion with Score Scaling Sampling for Category-level 6D Pose Estimation

Seunghyun Lee, Tae-Kyun Kim

ICCV 2025posterarXiv:2510.04125
#1407

Intra-modal and Cross-modal Synchronization for Audio-visual Deepfake Detection and Temporal Localization

Ashutosh Anshul, Shreyas Gopal, Deepu Rajan et al.

ICCV 2025poster
#1408

Enhancing Adversarial Transferability by Balancing Exploration and Exploitation with Gradient-Guided Sampling

Zenghao Niu, Weicheng Xie, Siyang Song et al.

ICCV 2025posterarXiv:2511.00411
#1409

CWNet: Causal Wavelet Network for Low-Light Image Enhancement

Tongshun Zhang, Pingping Liu, Yubing Lu et al.

ICCV 2025posterarXiv:2507.10689
#1410

MinCD-PnP: Learning 2D-3D Correspondences with Approximate Blind PnP

Pei An, Jiaqi Yang, Muyao Peng et al.

ICCV 2025posterarXiv:2507.15257
#1411

SCORE: Scene Context Matters in Open-Vocabulary Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Huaiyuan Qin et al.

ICCV 2025highlightarXiv:2507.12857
#1412

Federated Representation Angle Learning

Liping Yi, Han Yu, Gang Wang et al.

ICCV 2025poster
#1413

GeoDistill: Geometry-Guided Self-Distillation for Weakly Supervised Cross-View Localization

Shaowen Tong, Zimin Xia, Alexandre Alahi et al.

ICCV 2025posterarXiv:2507.10935
#1414

CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval

Zelong Sun, Dong Jing, Zhiwu Lu

ICCV 2025posterarXiv:2502.20826
#1415

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Ho Kei Cheng, Alex Schwing

ICCV 2025posterarXiv:2503.10636
#1416

Diffusion-based Source-biased Model for Single Domain Generalized Object Detection

Han Jiang, Wenfei Yang, Tianzhu Zhang et al.

ICCV 2025poster
#1417

DepthSync: Diffusion Guidance-Based Depth Synchronization for Scale- and Geometry-Consistent Video Depth Estimation

Yue-Jiang Dong, Wang Zhao, Jiale Xu et al.

ICCV 2025posterarXiv:2507.01603
#1418

Measuring the Impact of Rotation Equivariance on Aerial Object Detection

Xiuyu Wu, Xinhao Wang, Xiubin Zhu et al.

ICCV 2025posterarXiv:2507.09896
#1419

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

Wenjie Zhuo, Fan Ma, Hehe Fan

ICCV 2025posterarXiv:2411.18303
#1420

Enhanced Pansharpening via Quaternion Spatial-Spectral Interactions

Dong Li, Chunhui Luo, Yuanfei Bao et al.

ICCV 2025poster
#1421

Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing

Yongxin Guo, Lin Wang, Xiaoying Tang et al.

ICCV 2025posterarXiv:2405.16233
#1422

Instance-Level Video Depth in Groups Beyond Occlusions

Yuan Liang, Yang Zhou, Ziming Sun et al.

ICCV 2025poster
#1423

Flow Stochastic Segmentation Networks

Fabio De Sousa Ribeiro, Omar Todd, Charles Jones et al.

ICCV 2025posterarXiv:2507.18838
#1424

Future-Aware Interaction Network For Motion Forecasting

Shijie Li, Chunyu Liu, Xun Xu et al.

ICCV 2025posterarXiv:2503.06565
#1425

From Gaze to Movement: Predicting Visual Attention for Autonomous Driving Human-Machine Interaction based on Programmatic Imitation Learning

Yexin Huang, Yongbin Lin, Lishengsa Yue et al.

ICCV 2025poster
#1426

ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Predictions

Dubing Chen, Jin Fang, Wencheng Han et al.

ICCV 2025posterarXiv:2411.07725
#1427

DreamCube: RGB-D Panorama Generation via Multi-plane Synchronization

Yukun Huang, Yanning Zhou, Jianan Wang et al.

ICCV 2025poster
#1428

From Enhancement to Understanding: Build a Generalized Bridge for Low-light Vision via Semantically Consistent Unsupervised Fine-tuning

Sen Wang, Shao Zeng, Tianjun Gu et al.

ICCV 2025posterarXiv:2507.08380
#1429

ScanEdit: Hierarchically-Guided Functional 3D Scan Editing

Mohamed El Amine Boudjoghra, Ivan Laptev, Angela Dai

ICCV 2025posterarXiv:2504.15049
#1430

Optical Model-Driven Sharpness Mapping for Autofocus in Small Depth-of-Field and Severe Defocus Scenarios

Chen-Liang Fan, Mingpei Cao, Chih-Chien Hung et al.

ICCV 2025poster
#1431

HyPiDecoder: Hybrid Pixel Decoder for Efficient Segmentation and Detection

Fengzhe Zhou, Humphrey Shi

ICCV 2025poster
#1432

MMAD: Multi-label Micro-Action Detection in Videos

Kun Li, pengyu Liu, Dan Guo et al.

ICCV 2025posterarXiv:2407.05311
#1433

G2D: Boosting Multimodal Learning with Gradient-Guided Distillation

Mohammed Rakib, Arunkumar Bagavathi

ICCV 2025poster
#1434

Unified Video Generation via Next-Set Prediction in Continuous Domain

Zhanzhou Feng, Qingpei Guo, Xinyu Xiao et al.

ICCV 2025poster
#1435

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Feihong Yan, qingyan wei, Jiayi Tang et al.

ICCV 2025posterarXiv:2503.12450
#1436

Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models

Hongyang Wei, Shuaizheng Liu, Chun Yuan et al.

ICCV 2025posterarXiv:2503.11073
#1437

Omni-scene Perception-oriented Point Cloud Geometry Enhancement for Coordinate Quantization

Wang Liu, Wei Gao

ICCV 2025poster
#1438

Auto-Regressive Transformation for Image Alignment

Kanggeon Lee, Soochahn Lee, Kyoung Mu Lee

ICCV 2025posterarXiv:2505.04864
#1439

Training-Free Industrial Defect Generation with Diffusion Models

Ruyi Xu, Yen-Tzu Chiu, Tai-I Chen et al.

ICCV 2025poster
#1440

Feature Decomposition-Recomposition in Large Vision-Language Model for Few-Shot Class-Incremental Learning

Zongyao Xue, Meina Kan, Shiguang Shan et al.

ICCV 2025poster
#1441

TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

Ruidong Chen, honglin guo, Lanjun Wang et al.

ICCV 2025posterarXiv:2503.07389
#1442

Can Knowledge be Transferred from Unimodal to Multimodal? Investigating the Transitivity of Multimodal Knowledge Editing

Lingyong Fang, Xinzhong Wang, Depeng depeng wang et al.

ICCV 2025poster
#1443

Zero-Shot Composed Image Retrieval via Dual-Stream Instruction-Aware Distillation

Wenliang Zhong, Rob Barton, Weizhi An et al.

ICCV 2025poster
#1444

Token Activation Map to Visually Explain Multimodal LLMs

Yi Li, Hualiang Wang, Xinpeng Ding et al.

ICCV 2025posterarXiv:2506.23270
#1445

Learning Neural Scene Representation from iToF Imaging

Wenjie Chang, Hanzhi Chang, Yueyi Zhang et al.

ICCV 2025poster
#1446

Multi-Modal Multi-Task Unified Embedding Model (M3T-UEM): A Task-Adaptive Representation Learning Framework

Rohan Sharma, Changyou Chen, Feng-Ju Chang et al.

ICCV 2025poster
#1447

ConsNoTrainLoRA: Data-driven Weight Initialization of Low-rank Adapters using Constraints

Debasmit Das, Hyoungwoo Park, Munawar Hayat et al.

ICCV 2025posterarXiv:2507.08044
#1448

Task-Aware Prompt Gradient Projection for Parameter-Efficient Tuning Federated Class-Incremental Learning

Hualong Ke, Yachao Zhang, Jiangming Shi et al.

ICCV 2025poster
#1449

UDC-VIT: A Real-World Video Dataset for Under-Display Cameras

Kyusu Ahn, JiSoo Kim, Sangik Lee et al.

ICCV 2025highlightarXiv:2501.18545
#1450

InvRGB+L: Inverse Rendering of Complex Scenes with Unified Color and LiDAR Reflectance Modeling

Xiaoxue Chen, Bhargav Chandaka, Chih-Hao Lin et al.

ICCV 2025posterarXiv:2507.17613
#1451

Is Visual in-Context Learning for Compositional Medical Tasks within Reach?

Simon Reiß, Zdravko Marinov, Alexander Jaus et al.

ICCV 2025posterarXiv:2507.00868
#1452

SpinMeRound: Consistent Multi-View Identity Generation Using Diffusion Models

Stathis Galanakis, Alexandros Lattas, Stylianos Moschoglou et al.

ICCV 2025posterarXiv:2504.10716
#1453

Geminio: Language-Guided Gradient Inversion Attacks in Federated Learning

Junjie Shan, Ziqi Zhao, Jialin Lu et al.

ICCV 2025posterarXiv:2411.14937
#1454

ObjectMate: A Recurrence Prior for Object Insertion and Subject-Driven Generation

Daniel Winter, Asaf Shul, Matan Cohen et al.

ICCV 2025highlightarXiv:2412.08645
#1455

Active Learning Meets Foundation Models: Fast Remote Sensing Data Annotation for Object Detection

Marvin Burges, Philipe Dias, Dalton Lunga et al.

ICCV 2025poster
#1456

Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Jiawei Wang, Yushen Zuo, Yuanjun Chai et al.

ICCV 2025posterarXiv:2504.01308
#1457

Optimal Transport for Brain-Image Alignment: Unveiling Redundancy and Synergy in Neural Information Processing

Yang Xiao, Wang Lu, Jie Ji et al.

ICCV 2025posterarXiv:2503.10663
#1458

MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost

Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.

ICCV 2025posterarXiv:2509.01157
#1459

Intervening in Black Box: Concept Bottleneck Model for Enhancing Human Neural Network Mutual Understanding

Nuoye Xiong, Anqi Dong, Ning Wang et al.

ICCV 2025posterarXiv:2506.22803
#1460

Resolving Token-Space Gradient Conflicts: Token Space Manipulation for Transformer-Based Multi-Task Learning

Wooseong Jeong, Kuk-Jin Yoon

ICCV 2025posterarXiv:2507.07485
#1461

DisCoPatch: Taming Adversarially-driven Batch Statistics for Improved Out-of-Distribution Detection

Francisco Caetano, Christiaan Viviers, Luis Zavala-Mondragón et al.

ICCV 2025posterarXiv:2501.08005
#1462

Scaling and Taming Adversarial Training with Synthetic Data

Juntao Wu, Xianting Huang, Yu Chen et al.

ICCV 2025poster
#1463

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Zihan Ding, Chi Jin, Difan Liu et al.

ICCV 2025posterarXiv:2412.15689
#1464

Music Grounding by Short Video

Zijie Xin, Minquan Wang, Jingyu Liu et al.

ICCV 2025posterarXiv:2408.16990
#1465

Fewer Denoising Steps or Cheaper Per-Step Inference: Towards Compute-Optimal Diffusion Model Deployment

Zhenbang Du, Yonggan Fu, Lifu Wang et al.

ICCV 2025posterarXiv:2508.06160
#1466

Your Text Encoder Can Be An Object-Level Watermarking Controller

Naresh Kumar Devulapally, Mingzhen Huang, Vishal Asnani et al.

ICCV 2025posterarXiv:2503.11945
#1467

Chimera: Improving Generalist Model with Domain-Specific Experts

Tianshuo Peng, Mingsheng Li, Jiakang Yuan et al.

ICCV 2025posterarXiv:2412.05983
#1468

Perspective-Aware Reasoning in Vision-Language Models via Mental Imagery Simulation

Yuseung Lee, Jihyeon Je, Chanho Park et al.

ICCV 2025posterarXiv:2504.17207
#1469

Enhanced Event-based Dense Stereo via Cross-Sensor Knowledge Distillation

Haihao Zhang, Yunjian Zhang, Jianing Li et al.

ICCV 2025poster
#1470

GauUpdate: New Object Insertion in 3D Gaussian Fields with Consistent Global Illumination

Chengwei REN, Fan Zhang, Liangchao Xu et al.

ICCV 2025poster
#1471

Any-SSR: How Recursive Least Squares Works in Continual Learning of Large Language Model

Kai Tong, Kang Pan, Xiao Zhang et al.

ICCV 2025poster
#1472

Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information

Zhaoxin Yuan, Shuang Yang, Shiguang Shan et al.

ICCV 2025poster
#1473

KOEnsAttack: Towards Efficient Data-Free Black-Box Adversarial Attacks via Knowledge-Orthogonalized Substitute Ensembles

Chaoyong Yang, Jia-Li Yin, Bin Chen et al.

ICCV 2025poster
#1474

Erasing More Than Intended? How Concept Erasure Degrades the Generation of Non-Target Concepts

Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic et al.

ICCV 2025posterarXiv:2501.09833
#1475

ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation

Jimyeong Kim, Jungwon Park, Yeji Song et al.

ICCV 2025highlightarXiv:2507.01496
#1476

MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation

Prerit Gupta, Jason Alexander Fotso-Puepi, Zhengyuan Li et al.

ICCV 2025posterarXiv:2508.16911
#1477

Removing Cost Volumes from Optical Flow Estimators

Simon Kiefhaber, Stefan Roth, Simone Schaub-Meyer

ICCV 2025posterarXiv:2510.13317
#1478

PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation

Zhihao ZHU, Yifan Zheng, Siyu Pan et al.

ICCV 2025posterarXiv:2508.05976
#1479

PanSt3R: Multi-view Consistent Panoptic Segmentation

Lojze Zust, Yohann Cabon, Juliette Marrie et al.

ICCV 2025posterarXiv:2506.21348
#1480

GARF: Learning Generalizable 3D Reassembly for Real-World Fractures

Sihang Li, Zeyu Jiang, Grace Chen et al.

ICCV 2025posterarXiv:2504.05400
#1481

Imbalance in Balance: Online Concept Balancing in Generation Models

Yukai Shi, Jiarong Ou, Rui Chen et al.

ICCV 2025posterarXiv:2507.13345
#1482

Progressive Distribution Bridging: Unsupervised Adaptation for Large-scale Pre-trained Models via Adaptive Auxiliary Data

Weinan He, Yixin Zhang, Zilei Wang

ICCV 2025poster
#1483

RALoc: Enhancing Outdoor LiDAR Localization via Rotation Awareness

Yuyang Yang, Wen Li, Sheng Ao et al.

ICCV 2025highlight
#1484

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Siyuan Yan, Ming Hu, Yiwen Jiang et al.

ICCV 2025highlightarXiv:2503.14911
#1485

SummDiff: Generative Modeling of Video Summarization with Diffusion

Kwanseok Kim, Jaehoon Hahm, Sumin Kim et al.

ICCV 2025highlightarXiv:2510.08458
#1486

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization

Hengjia Li, Lifan Jiang, Xi Xiao et al.

ICCV 2025posterarXiv:2503.12689
#1487

Towards Performance Consistency in Multi-Level Model Collaboration

Qi Li, Runpeng Yu, Xinchao Wang

ICCV 2025poster
#1488

Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests

Fitim Abdullahu, Helmut Grabner

ICCV 2025posterarXiv:2510.13316
#1489

FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling

qiusheng huang, Xiaohui Zhong, Xu Fan et al.

ICCV 2025highlightarXiv:2503.19940
#1490

D-Attn: Decomposed Attention for Large Vision-and-Language Model

Chia-Wen Kuo, Sijie Zhu, Fan Chen et al.

ICCV 2025posterarXiv:2502.01906
#1491

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens

Dongwon Kim, Ju He, Qihang Yu et al.

ICCV 2025posterarXiv:2501.07730
#1492

Understanding Personal Concept in Open-Vocabulary Semantic Segmentation

Sunghyun Park, Jungsoo Lee, Shubhankar Borse et al.

ICCV 2025posterarXiv:2507.11030
#1493

Discovering Divergent Representations between Text-to-Image Models

Lisa Dunlap, Trevor Darrell, Joseph Gonzalez et al.

ICCV 2025posterarXiv:2509.08940
#1494

VRM: Knowledge Distillation via Virtual Relation Matching

Weijia Zhang, Fei Xie, Weidong Cai et al.

ICCV 2025highlightarXiv:2502.20760
#1495

CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving

Rui Song, Chenwei Liang, Yan Xia et al.

ICCV 2025posterarXiv:2503.06744
#1496

Clink! Chop! Thud! - Learning Object Sounds from Real-World Interactions

Mengyu Yang, Yiming Chen, Haozheng Pei et al.

ICCV 2025posterarXiv:2510.02313
#1497

UnZipLoRA: Separating Content and Style from a Single Image

Chang Liu, Viraj Shah, Aiyu Cui et al.

ICCV 2025highlightarXiv:2412.04465
#1498

Learning Visual Proxy for Compositional Zero-Shot Learning

Shiyu Zhang, Cheng Yan, Yang Liu et al.

ICCV 2025posterarXiv:2501.13859
#1499

ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization

Yuanhe Guo, Linxi Xie, Zhuoran Chen et al.

ICCV 2025posterarXiv:2510.18433
#1500

One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-object Trajectory

Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi et al.

ICCV 2025highlightarXiv:2505.23617
#1501

LGA-Net: Learning Local and Global Affinities for Sparse Scribble based Image Colorization

Hongjin Lyu, Bo Li, Paul Rosin et al.

ICCV 2025poster
#1502

Gaze-Language Alignment for Zero-Shot Prediction of Visual Search Targets from Human Gaze Scanpaths

Sounak Mondal, Naveen Sendhilnathan, Ting Zhang et al.

ICCV 2025poster
#1503

SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures

Yi Qin, Rui Wang, Tao Huang et al.

ICCV 2025posterarXiv:2508.06127
#1504

Semi-supervised Concept Bottleneck Models

Lijie Hu, Tianhao Huang, Huanyi Xie et al.

ICCV 2025posterarXiv:2406.18992
#1505

DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic

Munish Monga, Vishal Chudasama, Pankaj Wasnik et al.

ICCV 2025posterarXiv:2506.21260
#1506

WINS: Winograd Structured Pruning for Fast Winograd Convolution

Cheonjun Park, Hyunjae Oh, Mincheol Park et al.

ICCV 2025highlight
#1507

Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation

Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.

ICCV 2025posterarXiv:2504.12436
#1508

ART: Adaptive Relation Tuning for Generalized Relation Prediction

Gopika Sudhakaran, Hikaru Shindo, Patrick Schramowski et al.

ICCV 2025posterarXiv:2507.23543
#1509

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.

ICCV 2025posterarXiv:2507.06230
#1510

DISTIL: Data-Free Inversion of Suspicious Trojan Inputs via Latent Diffusion

Hossein Mirzaei, Zeinab Taghavi, Sepehr Rezaee et al.

ICCV 2025posterarXiv:2507.22813
#1511

Factorized Learning for Temporally Grounded Video-Language Models

Wenzheng Zeng, Difei Gao, Mike Zheng Shou et al.

ICCV 2025posterarXiv:2512.24097
#1512

FedPall: Prototype-based Adversarial and Collaborative Learning for Federated Learning with Feature Drift

yong zhang, Feng Liang, Guanghu Yuan et al.

ICCV 2025posterarXiv:2507.04781
#1513

MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models

Vittorio Pipoli, Alessia Saporita, Federico Bolelli et al.

ICCV 2025poster
#1514

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Zhiqi Ge, Juncheng Li, Xinglei Pang et al.

ICCV 2025posterarXiv:2412.10342
#1515

No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

Ranran Huang, Krystian Mikolajczyk

ICCV 2025highlightarXiv:2508.01171
#1516

External Knowledge Injection for CLIP-Based Class-Incremental Learning

Da-Wei Zhou, Kai-Wen Li, Jingyi Ning et al.

ICCV 2025posterarXiv:2503.08510
#1517

Cooperative Pseudo Labeling for Unsupervised Federated Classification

Kuangpu Guo, Lijun Sheng, Yongcan Yu et al.

ICCV 2025posterarXiv:2510.10100
#1518

MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection

Donghyeon Kwon, Youngseok Yoon, Hyeongseok Son et al.

ICCV 2025poster
#1519

From Sharp to Blur: Unsupervised Domain Adaptation for 2D Human Pose Estimation Under Extreme Motion Blur Using Event Cameras

Youngho Kim, Hoonhee Cho, Kuk-Jin Yoon

ICCV 2025posterarXiv:2507.22438
#1520

Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning

Zhengxuan Wei, Jiajin Tang, Sibei Yang

ICCV 2025posterarXiv:2510.19622
#1521

PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening

Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk et al.

ICCV 2025posterarXiv:2505.23367
#1522

Continual Adaptation: Environment-Conditional Parameter Generation for Object Detection in Dynamic Scenarios

Deng Li, Aming WU, Yang Li et al.

ICCV 2025posterarXiv:2506.24063
#1523

Differentially Private Fine-Tuning of Diffusion Models

Yu-Lin Tsai, Yizhe Li, Zekai Chen et al.

ICCV 2025posterarXiv:2406.01355
#1524

IRGPT: Understanding Real-world Infrared Image with Bi-cross-modal Curriculum on Large-scale Benchmark

Zhe Cao, Jin Zhang, Ruiheng Zhang

ICCV 2025posterarXiv:2507.14449
#1525

One Object, Multiple Lies: A Benchmark for Cross-task Adversarial Attack on Unified Vision-Language Models

Jiale Zhao, XINYANG JIANG, Junyao Gao et al.

ICCV 2025posterarXiv:2507.07709
#1526

Unknown Text Learning for CLIP-based Few-Shot Open-set Recognition

Rui Ma, Qilong Wang, Bing Cao et al.

ICCV 2025poster
#1527

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization

Xu Zheng, Yuanhuiyi Lyu, Lutao Jiang et al.

ICCV 2025posterarXiv:2505.06635
#1528

Hyper-Depth: Hypergraph-based Multi-Scale Representation Fusion for Monocular Depth Estimation

Lin Bie, Siqi Li, Yifan Feng et al.

ICCV 2025poster
#1529

Learning Separable Fine-Grained Representation via Dendrogram Construction from Coarse Labels for Fine-grained Visual Recognition

Guanghui Shi, Xuefeng liang, Wenjie Li et al.

ICCV 2025poster
#1530

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

Bing Fan, Yunhe Feng, Yapeng Tian et al.

ICCV 2025posterarXiv:2502.07707
#1531

Instruction-Grounded Visual Projectors for Continual Learning of Generative Vision-Language Models

Hyundong Jin, Hyung Jin Chang, Eunwoo Kim

ICCV 2025posterarXiv:2508.00260
#1532

PRO-VPT: Distribution-Adaptive Visual Prompt Tuning via Prompt Relocation

Chikai Shang, Mengke Li, Yiqun Zhang et al.

ICCV 2025posterarXiv:2503.06901
#1533

Language-Driven Multi-Label Zero-Shot Learning with Semantic Granularity

Shouwen Wang, Qian Wan, Junbin Gao et al.

ICCV 2025poster
#1534

Generalized Deep Multi-view Clustering via Causal Learning with Partially Aligned Cross-view Correspondence

Xihong Yang, Siwei Wang, Jiaqi Jin et al.

ICCV 2025posterarXiv:2509.16022
#1535

Less is More: Empowering GUI Agent with Context-Aware Simplification

Gongwei Chen, Xurui Zhou, Rui Shao et al.

ICCV 2025highlightarXiv:2507.03730
#1536

IM360: Large-scale Indoor Mapping with 360 Cameras

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

ICCV 2025posterarXiv:2502.12545
#1537

EventUPS: Uncalibrated Photometric Stereo Using an Event Camera

Jinxiu Liang, Bohan Yu, Siqi Yang et al.

ICCV 2025highlight
#1538

Rethinking the Upsampling Process in Light Field Super-Resolution with Spatial-Epipolar Implicit Image Function

Ruixuan Cong, Yu Wang, Mingyuan Zhao et al.

ICCV 2025poster
#1539

Harnessing Input-Adaptive Inference for Efficient VLN

Dongwoo Kang, Akhil Perincherry, Zachary Coalson et al.

ICCV 2025posterarXiv:2508.09262
#1540

When Lighting Deceives: Exposing Vision-Language Models' Illumination Vulnerability Through Illumination Transformation Attack

Hanqing Liu, Shouwei Ruan, Yao Huang et al.

ICCV 2025posterarXiv:2503.06903
#1541

PersonaCraft: Personalized and Controllable Full-Body Multi-Human Scene Generation Using Occlusion-Aware 3D-Conditioned Diffusion

Gwanghyun Kim, Suh Jeon Jeon, Seunggyu Lee et al.

ICCV 2025posterarXiv:2411.18068
#1542

SemTalk: Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis

Xiangyue Zhang, Jianfang Li, Jiaxu Zhang et al.

ICCV 2025posterarXiv:2412.16563
#1543

Guiding Diffusion Models with Adaptive Negative Sampling Without External Resources

Alakh Desai, Nuno Vasconcelos

ICCV 2025poster
#1544

DiffDoctor: Diagnosing Image Diffusion Models Before Treating

Yiyang Wang, Xi Chen, Xiaogang Xu et al.

ICCV 2025posterarXiv:2501.12382
#1545

Training-Free Class Purification for Open-Vocabulary Semantic Segmentation

Qi Chen, Lingxiao Yang, Yun Chen et al.

ICCV 2025posterarXiv:2508.00557
#1546

MA-CIR: A Multimodal Arithmetic Benchmark for Composed Image Retrieval

Jaeseok Byun, Young Kyun Jang, Seokhyeon Jeong et al.

ICCV 2025poster
#1547

Adaptive Learning of High-Value Regions for Semi-Supervised Medical Image Segmentation

Tao Lei, Ziyao Yang, Xingwu wang et al.

ICCV 2025poster
#1548

Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning

Xinyao Liu, Diping Song

ICCV 2025posterarXiv:2507.17539
#1549

Keep Your Friends Close, and Your Enemies Farther: Distance-aware Voxel-wise Contrastive Learning for Semi-supervised Multi-organ Segmentation

Haochen Zhao, Jianwei Niu, Xuefeng Liu et al.

ICCV 2025poster
#1550

Integrating Biological Knowledge for Robust Microscopy Image Profiling on De Novo Cell Lines

Jiayuan Chen, Thai-Hoang Pham, Yuanlong Wang et al.

ICCV 2025highlightarXiv:2507.10737
#1551

Spectral Sensitivity Estimation with an Uncalibrated Diffraction Grating

Lilika Makabe, Hiroaki Santo, Fumio Okura et al.

ICCV 2025posterarXiv:2508.00330
#1552

TransiT: Transient Transformer for Non-line-of-sight Videography

Ruiqian Li, Siyuan Shen, Suan Xia et al.

ICCV 2025posterarXiv:2503.11328
#1553

LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association

Peng Wang, Yongcai Wang, Hualong Cao et al.

ICCV 2025poster
#1554

On the Complexity-Faithfulness Trade-off of Gradient-Based Explanations

Amir Mehrpanah, Matteo Gamba, Kevin Smith et al.

ICCV 2025posterarXiv:2508.10490
#1555

Learning to Inference Adaptively for Multimodal Large Language Models

Zhuoyan Xu, Khoi Nguyen, Preeti Mukherjee et al.

ICCV 2025posterarXiv:2503.10905
#1556

Hierarchical Divide-and-Conquer Grouping for Classification Adaptation of Pre-Trained Models

Ziqian Lu, Yunlong Yu, Qinyue Tong et al.

ICCV 2025poster
#1557

Lark: Low-Rank Updates After Knowledge Localization for Few-shot Class-Incremental Learning

Jinxin Shi, Jiabao Zhao, Yifan Yang et al.

ICCV 2025poster
#1558

Large Multi-modal Models Can Interpret Features in Large Multi-modal Models

Kaichen Zhang, Yifei Shen, Bo Li et al.

ICCV 2025posterarXiv:2411.14982
#1559

p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

Jun Zhang, Desen Meng, Zhengming Zhang et al.

ICCV 2025posterarXiv:2412.04449
#1560

FedDifRC: Unlocking the Potential of Text-to-Image Diffusion Models in Heterogeneous Federated Learning

Huan Wang, Haoran Li, Huaming Chen et al.

ICCV 2025posterarXiv:2507.06482
#1561

Category-Specific Selective Feature Enhancement for Long-Tailed Multi-Label Image Classification

Ruiqi Du, Xu Tang, Xiangrong Zhang et al.

ICCV 2025poster
#1562

Registration beyond Points: General Affine Subspace Alignment via Geodesic Distance on Grassmann Manifold

Jaeho Shin, Hyeonjae Gil, Junwoo Jang et al.

ICCV 2025highlightarXiv:2507.17998
#1563

Mind the Gap: Preserving and Compensating for the Modality Gap in CLIP-Based Continual Learning

Linlan Huang, Xusheng Cao, Haori Lu et al.

ICCV 2025highlightarXiv:2507.09118
#1564

An Efficient Post-hoc Framework for Reducing Task Discrepancy of Text Encoders for Composed Image Retrieval

Jaeseok Byun, Seokhyeon Jeong, Wonjae Kim et al.

ICCV 2025posterarXiv:2406.09188
#1565

ZIUM: Zero-Shot Intent-Aware Adversarial Attack on Unlearned Models

Hyun Jun Yook, Ga San Jhun, Cho Hyun et al.

ICCV 2025posterarXiv:2507.21985
#1566

Find a Scapegoat: Poisoning Membership Inference Attack and Defense to Federated Learning

Wenjin Mo, Zhiyuan Li, Minghong Fang et al.

ICCV 2025posterarXiv:2507.00423
#1567

To Label or Not to Label: PALM – A Predictive Model for Evaluating Sample Efficiency in Active Learning Models

Julia Machnio, Mads Nielsen, Mostafa Mehdipour Ghazi

ICCV 2025posterarXiv:2507.15381
#1568

Uncalibrated Structure from Motion on a Sphere

Jonathan Ventura, Viktor Larsson, Fredrik Kahl

ICCV 2025poster
#1569

Personalized Federated Learning under Local Supervision

Qiqi Liu, Jiaqiang Li, Yuchen Liu et al.

ICCV 2025poster
#1570

Noise-Modeled Diffusion Models for Low-Light Spike Image Restoration

Ruonan Liu, Lin Zhu, Xijie Xiang et al.

ICCV 2025highlight
#1571

Prototype-based Contrastive Learning with Stage-wise Progressive Augmentation for Self-Supervised Fine-Grained Learning

BaoFeng Tan, Xiu-Shen Wei, Lin Zhao

ICCV 2025poster
#1572

Radiant Foam: Real-Time Differentiable Ray Tracing

Shrisudhan Govindarajan, Daniel Rebain, Kwang Moo Yi et al.

ICCV 2025highlightarXiv:2502.01157
#1573

COSTARR: Consolidated Open Set Technique with Attenuation for Robust Recognition

Ryan Rabinowitz, Steve Cruz, Walter Scheirer et al.

ICCV 2025posterarXiv:2508.01087
#1574

Information Density Principle for MLLM Benchmarks

Chunyi Li, Xiaozhe Li, Zicheng Zhang et al.

ICCV 2025posterarXiv:2503.10079
#1575

Perspective-Aware Teaching: Adapting Knowledge for Heterogeneous Distillation

Jhe-Hao Lin, Yi Yao, Chan-Feng Hsu et al.

ICCV 2025posterarXiv:2501.08885
#1576

Is Meta-Learning Out? Rethinking Unsupervised Few-Shot Classification with Limited Entropy

Yunchuan Guan, Yu Liu, Ke Zhou et al.

ICCV 2025posterarXiv:2509.13185
#1577

LMM-Det: Make Large Multimodal Models Excel in Object Detection

Jincheng Li, Chunyu Xie, Ji Ao et al.

ICCV 2025posterarXiv:2507.18300
#1578

Attention to the Burtiness in Visual Prompt Tuning!

Yuzhu Wang, Manni Duan, Shu Kong

ICCV 2025poster
#1579

Long-Tailed Classification with Multi-Granularity Semantics

Yuting Liu, Liu Yang, Yu Wang

ICCV 2025poster
#1580

ReTracker: Exploring Image Matching for Robust Online Any Point Tracking

Dongli Tan, Xingyi He, Sida Peng et al.

ICCV 2025highlight
#1581

Multimodal Large Language Model-Guided ISP Hyperparameter Optimization with Dynamic Preference Learning

Xinyu Sun, Zhikun Zhao, congyan lang et al.

ICCV 2025poster
#1582

FEVER-OOD: Free Energy Vulnerability Elimination for Robust Out-of-Distribution Detection

Brian Isaac-Medina, Mauricio Che, Yona Falinie A. Gaus et al.

ICCV 2025posterarXiv:2412.01596
#1583

Adversarial Purification via Super-Resolution and Diffusion

Mincheol Park, Cheonjun Park, Seungseop Lim et al.

ICCV 2025poster
#1584

FedWSQ: Efficient Federated Learning with Weight Standardization and Distribution-Aware Non-Uniform Quantization

Seung-Wook Kim, Seongyeol Kim, Jiah Kim et al.

ICCV 2025posterarXiv:2506.23516
#1585

CMAD: Correlation-Aware and Modalities-Aware Distillation for Multimodal Sentiment Analysis with Missing Modalities

Yan Zhuang, Minhao Liu, Wei Bai et al.

ICCV 2025poster
#1586

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Xianfu Cheng, Wei Zhang, Shiwei Zhang et al.

ICCV 2025posterarXiv:2502.13059
#1587

Revelio: Interpreting and leveraging semantic information in diffusion models

Dahye Kim, Xavier Thomas, Deepti Ghadiyaram

ICCV 2025posterarXiv:2411.16725
#1588

CLIP-GS: Unifying Vision-Language Representation with 3D Gaussian Splatting

Siyu Jiao, Haoye Dong, Yuyang Yin et al.

ICCV 2025posterarXiv:2412.19142
#1589

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Jiaxin Ai, Pengfei Zhou, xu Pan et al.

ICCV 2025posterarXiv:2503.06553
#1590

Failure Cases Are Better Learned But Boundary Says Sorry: Facilitating Smooth Perception Change for Accuracy-Robustness Trade-Off in Adversarial Training

Yanyun Wang, Li Liu

ICCV 2025posterarXiv:2508.02186
#1591

SplatTalk: 3D VQA with Gaussian Splatting

Anh Thai, Kyle Genova, Songyou Peng et al.

ICCV 2025posterarXiv:2503.06271
#1592

Improved Noise Schedule for Diffusion Training

Tiankai Hang, Shuyang Gu, Jianmin Bao et al.

ICCV 2025posterarXiv:2407.03297
#1593

Secure On-Device Video OOD Detection Without Backpropagation

Li Li, Peilin Cai, Yuxiao Zhou et al.

ICCV 2025posterarXiv:2503.06166
#1594

Learning Counterfactually Decoupled Attention for Open-World Model Attribution

Yu Zheng, Boyang Gong, Fanye Kong et al.

ICCV 2025posterarXiv:2506.23074
#1595

Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning

Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.

ICCV 2025posterarXiv:2507.21494
#1596

Is Less More? Exploring Token Condensation as Training-free Test-time Adaptation

Zixin Wang, Dong Gong, Sen Wang et al.

ICCV 2025posterarXiv:2410.14729
#1597

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

Qifan Yu, Zhebei Shen, Zhongqi Yue et al.

ICCV 2025highlightarXiv:2412.06293
#1598

Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations

Chongjie Si, Zhiyi Shi, Xuehui Wang et al.

ICCV 2025posterarXiv:2504.00851
#1599

Test-Time Prompt Tuning for Zero-Shot Depth Completion

Chanhwi Jeong, Inhwan Bae, Jin-Hwi Park et al.

ICCV 2025highlight
#1600

One Encoder to Rule them All: Representation Learning for Model-free Visual Reinforcement Learning using Fourier Neural Operators

Parag Dutta, Mohd Ayyoob, Shalabh Bhatnagar et al.

ICCV 2025poster