Most Cited ICCV "structure-conditioned generation" Papers

2,701 papers found • Page 9 of 14

#1601

Deciphering Cross-Modal Alignment in Large Vision-Language Models via Modality Integration Rate

Qidong Huang, Xiaoyi Dong, Pan Zhang et al.

ICCV 2025 • poster
#1602

TITAN: Query-Token based Domain Adaptive Adversarial Learning

Tajamul Ashraf, Janibul Bashir

ICCV 2025 • poster • arXiv:2506.21484
#1603

StolenLoRA: Exploring LoRA Extraction Attacks via Synthetic Data

Yixu Wang, Yan Teng, Yingchun Wang et al.

ICCV 2025 • highlight • arXiv:2509.23594
#1604

Doodle Your Keypoints: Sketch-Based Few-Shot Keypoint Detection

Subhajit Maity, Ayan Bhunia, Subhadeep Koley et al.

ICCV 2025 • poster • arXiv:2507.07994
#1605

Partial Forward Blocking: A Novel Data Pruning Paradigm for Lossless Training Acceleration

Dongyue Wu, Zilin Guo, Jialong Zuo et al.

ICCV 2025 • poster • arXiv:2506.23674
#1606

LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding

Amirhossein Kazerouni, Soroush Mehraban, Michael Brudno et al.

ICCV 2025 • poster • arXiv:2503.15420
#1607

ShortV: Efficient Multimodal Large Language Models by Freezing Visual Tokens in Ineffective Layers

Qianhao Yuan, Qingyu Zhang, Yanjiang Liu et al.

ICCV 2025 • poster • arXiv:2504.00502
#1608

CIARD: Cyclic Iterative Adversarial Robustness Distillation

Liming Lu, Shuchao Pang, Xu Zheng et al.

ICCV 2025 • poster • arXiv:2509.12633
#1609

MOBIUS: Big-to-Mobile Universal Instance Segmentation via Multi-modal Bottleneck Fusion and Calibrated Decoder Pruning

Mattia Segu, Marta Tintore Gazulla, Yongqin Xian et al.

ICCV 2025 • poster • arXiv:2510.15026
#1610

Moderating the Generalization of Score-based Generative Model

Wan Jiang, He Wang, Xin Zhang et al.

ICCV 2025 • poster • arXiv:2412.07229
#1611

LLM-assisted Entropy-based Adaptive Distillation for Unsupervised Fine-grained Visual Representation Learning

Jianfeng Dong, Danfeng Luo, Daizong Liu et al.

ICCV 2025 • poster
#1612

InfoBridge: Balanced Multimodal Integration through Conditional Dependency Modeling

Chenxin Li, Yifan Liu, Panwang Pan et al.

ICCV 2025 • poster
#1613

ChartPoint: Guiding MLLMs with Grounding Reflection for Chart Reasoning

Zhengzhuo Xu, Sinan Du, Yiyan Qi et al.

ICCV 2025 • poster • arXiv:2512.00305
#1614

DiffRefine: Diffusion-based Proposal Specific Point Cloud Densification for Cross-Domain Object Detection

Sangyun Shin, Yuhang He, Xinyu Hou et al.

ICCV 2025 • highlight
#1615

Gradient Short-Circuit: Efficient Out-of-Distribution Detection via Feature Intervention

Jiawei Gu, Ziyue Qiao, Zechao Li

ICCV 2025 • poster • arXiv:2507.01417
#1616

Boundary Probing for Input Privacy Protection When Using LMM Services

Xiaofei Hui, Haoxuan Qu, Ping Hu et al.

ICCV 2025 • poster
#1617

UPRE: Zero-Shot Domain Adaptation for Object Detection via Unified Prompt and Representation Enhancement

Xiao Zhang, Fei Wei, Yong Wang et al.

ICCV 2025 • poster • arXiv:2507.00721
#1618

Dataset Distillation as Data Compression: A Rate-Utility Perspective

Youneng Bao, Yiping Liu, Zhuo Chen et al.

ICCV 2025 • poster • arXiv:2507.17221
#1619

Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features

Shangbo Wu, Yu-an Tan, Ruinan Ma et al.

ICCV 2025 • poster • arXiv:2506.21046
#1620

Open-set Cross Modal Generalization via Multimodal Unified Representation

Hai Huang, Yan Xia, Shulei Wang et al.

ICCV 2025 • poster • arXiv:2507.14935
#1621

Adversarial Data Augmentation for Single Domain Generalization via Lyapunov Exponent-Guided Optimization

Zuyu Zhang, Ning Chen, Yongshan Liu et al.

ICCV 2025 • poster • arXiv:2507.04302
#1622

NegRefine: Refining Negative Label-Based Zero-Shot OOD Detection

Amirhossein Ansari, Ke Wang, Pulei Xiong

ICCV 2025 • poster • arXiv:2507.09795
#1623

Divide-and-Conquer for Enhancing Unlabeled Learning, Stability, and Plasticity in Semi-supervised Continual Learning

Yue Duan, Taicai Chen, Lei Qi et al.

ICCV 2025 • poster • arXiv:2508.05316
#1624

A Unified Framework to BRIDGE Complete and Incomplete Deep Multi-View Clustering under Non-IID Missing Patterns

Xiaorui Jiang, Buyun He, Peng Yuan Zhou et al.

ICCV 2025 • poster
#1625

GCAV: A Global Concept Activation Vector Framework for Cross-Layer Consistency in Interpretability

Zhenghao He, Sanchit Sinha, Guangzhi Xiong et al.

ICCV 2025 • poster • arXiv:2508.21197
#1626

Confound from All Sides, Distill with Resilience: Multi-Objective Adversarial Paths to Zero-Shot Robustness

Junhao Dong, Jiao Liu, Xinghua Qu et al.

ICCV 2025 • highlight
#1627

Mitigating Object Hallucinations via Sentence-Level Early Intervention

Shangpin Peng, Senqiao Yang, Li Jiang et al.

ICCV 2025 • poster • arXiv:2507.12455
#1628

Active Membership Inference Test (aMINT): Enhancing Model Auditability with Multi-Task Learning

Daniel DeAlcala, Aythami Morales, Julian Fierrez et al.

ICCV 2025 • poster • arXiv:2509.07879
#1629

Open-Unfairness Adversarial Mitigation for Generalized Deepfake Detection

Zhaoyang Li, Zhu Teng, Baopeng Zhang et al.

ICCV 2025 • poster
#1630

Spatial Preference Rewarding for MLLMs Spatial Understanding

Han Qiu, Peng Gao, Lewei Lu et al.

ICCV 2025 • poster • arXiv:2510.14374
#1631

Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue

Guohao Sun, Can Qin, Yihao Feng et al.

ICCV 2025 • poster
#1632

Semi-ViM: Bidirectional State Space Model for Mitigating Label Imbalance in Semi-Supervised Learning

Hongyang He, Hongyang Xie, Haochen You et al.

ICCV 2025 • poster
#1633

Contact-Aware Refinement of Human Pose Pseudo-Ground Truth via Bioimpedance Sensing

Maria-Paola Forte, Nikos Athanasiou, Giulia Ballardini et al.

ICCV 2025 • poster • arXiv:2512.04862
#1634

Beyond the Limits: Overcoming Negative Correlation of Activation-Based Training-Free NAS

Haidong Kang, Lianbo Ma, Pengjun Chen et al.

ICCV 2025 • poster
#1635

Integrating Task-Specific and Universal Adapters for Pre-Trained Model-based Class-Incremental Learning

Yan Wang, Da-Wei Zhou, Han-Jia Ye

ICCV 2025 • poster • arXiv:2508.08165
#1636

Semi-supervised Deep Transfer for Regression without Domain Alignment

Mainak Biswas, Ambedkar Dukkipati, Devarajan Sridharan

ICCV 2025 • poster • arXiv:2509.05092
#1637

From Easy to Hard: The MIR Benchmark for Progressive Interleaved Multi-Image Reasoning

Hang Du, Jiayang Zhang, Guoshun Nan et al.

ICCV 2025 • poster • arXiv:2509.17040
#1638

Diffusion Guided Adaptive Augmentation for Generalization in Visual Reinforcement Learning

Jeong Woon Lee, Hyoseok Hwang

ICCV 2025 • poster
#1639

A Framework for Double-Blind Federated Adaptation of Foundation Models

Nurbek Tastan, Karthik Nandakumar

ICCV 2025 • poster • arXiv:2502.01289
#1640

EA-ViT: Efficient Adaptation for Elastic Vision Transformer

Chen Zhu, Wangbo Zhao, Huiwen Zhang et al.

ICCV 2025 • poster • arXiv:2507.19360
#1641

Feature Coding in the Era of Large Models: Dataset, Test Conditions, and Benchmark

Changsheng Gao, Yifan Ma, Qiaoxi Chen et al.

ICCV 2025 • poster • arXiv:2412.04307
#1642

MMOne: Representing Multiple Modalities in One Scene

Zhifeng Gu, Bing Wang

ICCV 2025 • poster • arXiv:2507.11129
#1643

MM-IFEngine: Towards Multimodal Instruction Following

Shengyuan Ding, Wu Shenxi, Xiangyu Zhao et al.

ICCV 2025 • poster • arXiv:2504.07957
#1644

RainbowPrompt: Diversity-Enhanced Prompt-Evolving for Continual Learning

Kiseong Hong, Gyeong-Hyeon Kim, Eunwoo Kim

ICCV 2025 • poster • arXiv:2507.22553
#1645

VisionMath: Vision-Form Mathematical Problem-Solving

Zongyang Ma, Yuxin Chen, Ziqi Zhang et al.

ICCV 2025 • poster
#1646

Dataset Distillation via the Wasserstein Metric

Haoyang Liu, Peiran Wang, Yijiang Li et al.

ICCV 2025 • poster • arXiv:2311.18531
#1647

A Good Teacher Adapts Their Knowledge for Distillation

Chengyao Qian, Trung Le, Mehrtash Harandi

ICCV 2025 • poster
#1648

Quanta Neural Networks: From Photons to Perception

Varun Sundar, Tianyi Zhang, Sacha Jungerman et al.

ICCV 2025 • poster
#1649

AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving

Ruifei Zhang, Junlin Xie, Wei Zhang et al.

ICCV 2025 • poster • arXiv:2511.06253
#1650

Depth Any Event Stream: Enhancing Event-based Monocular Depth Estimation via Dense-to-Sparse Distillation

Jinjing Zhu, Tianbo Pan, Zidong Cao et al.

ICCV 2025 • poster
#1651

Consistent Time-of-Flight Depth Denoising via Graph-Informed Geometric Attention

Weida Wang, Changyong He, Jin Zeng et al.

ICCV 2025 • poster • arXiv:2506.23542
#1652

MPBR: Multimodal Progressive Bidirectional Reasoning for Open-Set Fine-Grained Recognition

Junfu Tan, Peiguang Jing, Yu Zhu et al.

ICCV 2025 • poster
#1653

MAVias: Mitigate any Visual Bias

Ioannis Sarridis, Christos Koutlis, Symeon Papadopoulos et al.

ICCV 2025 • poster • arXiv:2412.06632
#1654

OpenSubstance: A High-quality Measured Dataset of Multi-View and -Lighting Images and Shapes

Fan Pei, Jinchen Bai, Xiang Feng et al.

ICCV 2025 • poster
#1655

VGMamba: Attribute-to-Location Clue Reasoning for Quantity-Agnostic 3D Visual Grounding

Zhu Yihang, Jinhao Zhang, Yuxuan Wang et al.

ICCV 2025 • poster
#1656

AnnofreeOD: Detecting All Classes at Low Frame Rates Without Human Annotations

Boyi Sun, Yuhang Liu, Houxin He et al.

ICCV 2025 • poster
#1657

TWIST & SCOUT: Grounding Multimodal LLM-Experts by Forget-Free Tuning

Aritra Bhowmik, Mohammad Mahdi Derakhshani, Dennis Koelma et al.

ICCV 2025 • poster • arXiv:2410.10491
#1658

Controlling Multimodal LLMs via Reward-guided Decoding

Oscar Mañas, Pierluca D'Oro, Koustuv Sinha et al.

ICCV 2025 • poster • arXiv:2508.11616
#1659

CE-FAM: Concept-Based Explanation via Fusion of Activation Maps

Michihiro Kuroki, Toshihiko Yamasaki

ICCV 2025 • poster • arXiv:2509.23849
#1660

PEFTDiff: Diffusion-Guided Transferability Estimation for Parameter-Efficient Fine-Tuning

Prafful Khoba, Zijian Wang, Chetan Arora et al.

ICCV 2025 • poster
#1661

RMultiplex200K: Toward Reliable Multimodal Process Supervision for Visual Language Models on Telecommunications

Sijia Chen, Bin Song

ICCV 2025 • poster
#1662

Class-Wise Federated Averaging for Efficient Personalization

Gyuejeong Lee, Daeyoung Choi

ICCV 2025 • poster • arXiv:2406.07800
#1663

Towards Privacy-preserved Pre-training of Remote Sensing Foundation Models with Federated Mutual-guidance Learning

Jieyi Tan, Chengwei Zhang, Bo Dang et al.

ICCV 2025 • poster • arXiv:2503.11051
#1664

Multi-view Gaze Target Estimation

Qiaomu Miao, Vivek Golani, Jingyi Xu et al.

ICCV 2025 • poster • arXiv:2508.05857
#1665

EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients

Meihan Wu, Tao Chang, Cui Miao et al.

ICCV 2025 • poster • arXiv:2412.00334
#1666

Target Bias Is All You Need: Zero-Shot Debiasing of Vision-Language Models with Bias Corpus

Taeuk Jang, Hoin Jung, Xiaoqian Wang

ICCV 2025 • poster
#1667

Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs

Zitian Wang, Yue Liao, Rong Kang et al.

ICCV 2025 • poster • arXiv:2503.20309
#1668

Large Learning Rates Simultaneously Achieve Robustness to Spurious Correlations and Compressibility

Melih Barsbey, Lucas Prieto, Stefanos Zafeiriou et al.

ICCV 2025 • poster • arXiv:2507.17748
#1669

Dynamic Multi-Layer Null Space Projection for Vision-Language Continual Learning

Borui Kang, Lei Wang, Zhiping Wu et al.

ICCV 2025 • poster
#1670

FedMeNF: Privacy-Preserving Federated Meta-Learning for Neural Fields

Junhyeog Yun, Minui Hong, Gunhee Kim

ICCV 2025 • poster • arXiv:2508.06301
#1671

Prototype Guided Backdoor Defense via Activation Space Manipulation

Venkat Adithya Amula, Sunayana Samavedam, Saurabh Saini et al.

ICCV 2025 • poster
#1672

RIPE: Reinforcement Learning on Unlabeled Image Pairs for Robust Keypoint Extraction

Johannes Künzel, Anna Hilsmann, Peter Eisert

ICCV 2025 • poster • arXiv:2507.04839
#1673

Analyzing Finetuning Representation Shift for Multimodal LLMs Steering

Pegah Khayatan, Mustafa Shukor, Jayneel Parekh et al.

ICCV 2025 • poster • arXiv:2501.03012
#1674

Multi-Cache Enhanced Prototype Learning for Test-Time Generalization of Vision-Language Models

Xinyu Chen, Haotian Zhai, Can Zhang et al.

ICCV 2025 • poster • arXiv:2508.01225
#1675

AVAM: a Universal Training-free Adaptive Visual Anchoring Embedded into Multimodal Large Language Model for Multi-image Question Answering

Kang Zeng, Guojin Zhong, Jintao Cheng et al.

ICCV 2025 • poster
#1676

Unsupervised Visual Chain-of-Thought Reasoning via Preference Optimization

Kesen Zhao, Beier Zhu, Qianru Sun et al.

ICCV 2025 • poster • arXiv:2504.18397
#1677

TRNAS: A Training-Free Robust Neural Architecture Search

Yeming Yang, Qingling Zhu, Jianping Luo et al.

ICCV 2025 • poster
#1678

Staining and Locking Computer Vision Models Without Retraining

Oliver Sutton, Qinghua Zhou, George Leete et al.

ICCV 2025 • poster • arXiv:2507.22000
#1679

The Inter-Intra Modal Measure: A Predictive Lens on Fine-Tuning Outcomes in Vision-Language Models

Laura Niss, Kevin Vogt-Lowell, Theodoros Tsiligkaridis

ICCV 2025 • poster • arXiv:2407.15731
#1680

What to Distill? Fast Knowledge Distillation with Adaptive Sampling

Byungchul Chae, Seonyeong Heo

ICCV 2025 • highlight
#1681

Flexi-FSCIL: Adaptive Knowledge Retention for Breaking the Stability-Plasticity Dilemma in Few-Shot Class-Incremental Learning

Wufei Xie, Yalin Wang, Chenliang Liu et al.

ICCV 2025 • poster
#1682

Multispectral Demosaicing via Dual Cameras

SaiKiran Tedla, Junyong Lee, Beixuan Yang et al.

ICCV 2025 • highlight • arXiv:2503.22026
#1683

Generative Modeling of Shape-Dependent Self-Contact Human Poses

Takehiko Ohkawa, Jihyun Lee, Shunsuke Saito et al.

ICCV 2025 • poster • arXiv:2509.23393
#1684

Met2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems

Shaohan Li, Hao Yang, Min Chen et al.

ICCV 2025 • poster
#1685

Beyond RGB: Adaptive Parallel Processing for RAW Object Detection

Shani Gamrian, Hila Barel, Feiran Li et al.

ICCV 2025 • poster • arXiv:2503.13163
#1686

TorchAdapt: Towards Light-Agnostic Real-Time Visual Perception

Khurram Azeem Hashmi, Karthik Suresh, Didier Stricker et al.

ICCV 2025 • poster
#1687

Human-in-the-Loop Local Corrections of 3D Scene Layouts via Infilling

Christopher Xie, Armen Avetisyan, Henry Howard-Jenkins et al.

ICCV 2025 • highlight • arXiv:2503.11806
#1688

POMATO: Marrying Pointmap Matching with Temporal Motions for Dynamic 3D Reconstruction

Songyan Zhang, Yongtao Ge, Jinyuan Tian et al.

ICCV 2025 • poster • arXiv:2504.05692
#1689

Boosting Class Representation via Semantically Related Instances for Robust Long-Tailed Learning with Noisy Labels

Yuhang Li, Zhuying Li, Yuheng Jia

ICCV 2025 • poster
#1690

CAT: A Unified Click-and-Track Framework for Realistic Tracking

Yongsheng Yuan, Jie Zhao, Dong Wang et al.

ICCV 2025 • poster
#1691

Diffusion-Based Extreme High-speed Scenes Reconstruction with the Complementary Vision Sensor

Yapeng Meng, Yihan Lin, Taoyi Wang et al.

ICCV 2025 • poster
#1692

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.

ICCV 2025 • poster • arXiv:2507.22825
#1693

Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design

Yuhao Sun, Yihua Zhang, Gaowen Liu et al.

ICCV 2025 • poster • arXiv:2508.10065
#1694

DiffuMatch: Category-Agnostic Spectral Diffusion Priors for Robust Non-rigid Shape Matching

Emery Pierson, Lei Li, Angela Dai et al.

ICCV 2025 • poster • arXiv:2507.23715
#1695

SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity

Valter Piedade, Chitturi Sidhartha, José Gaspar et al.

ICCV 2025 • highlight
#1696

Real3D: Towards Scaling Large Reconstruction Models with Real Images

Hanwen Jiang, Qixing Huang, Georgios Pavlakos

ICCV 2025 • poster
#1697

Do It Yourself: Learning Semantic Correspondence from Pseudo-Labels

Olaf Dünkel, Thomas Wimmer, Christian Theobalt et al.

ICCV 2025 • poster • arXiv:2506.05312
#1698

Stochastic Interpolants for Revealing Stylistic Flows across the History of Art

Pingchuan Ma, Ming Gui, Johannes Schusterbauer et al.

ICCV 2025 • poster
#1699

Is Tracking really more challenging in First Person Egocentric Vision?

Matteo Dunnhofer, Zaira Manigrasso, Christian Micheloni

ICCV 2025 • highlight • arXiv:2507.16015
#1700

GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration

Li Mi, Manon Béchaz, Zeming Chen et al.

ICCV 2025 • poster • arXiv:2508.00152
#1701

Learning Large Motion Estimation from Intermediate Representations with a High-Resolution Optical Flow Dataset Featuring Long-Range Dynamic Motion

Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon

ICCV 2025 • highlight
#1702

CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy

Dongyoung Kim, Mahmoud Afifi, Dongyun Kim et al.

ICCV 2025 • poster • arXiv:2504.07959
#1703

MGSfM: Multi-Camera Geometry Driven Global Structure-from-Motion

Peilin Tao, Hainan Cui, Diantao Tu et al.

ICCV 2025 • poster • arXiv:2507.03306
#1704

Zero-shot Inexact CAD Model Alignment from a Single Image

Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.

ICCV 2025 • poster • arXiv:2507.03292
#1705

Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer

Hai Wu, Hongwei Lin, Xusheng Guo et al.

ICCV 2025 • poster
#1706

Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension

Xiyao Wang, Zhengyuan Yang, Linjie Li et al.

ICCV 2025 • poster • arXiv:2412.03704
#1707

Manual-PA: Learning 3D Part Assembly from Instruction Diagrams

Jiahao Zhang, Anoop Cherian, Cristian Rodriguez-Opazo et al.

ICCV 2025 • poster • arXiv:2411.18011
#1708

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation

Pingrui Zhang, Xianqiang Gao, Yuhan Wu et al.

ICCV 2025 • poster • arXiv:2503.11081
#1709

NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation

Peiran Xu, Xicheng Gong, Yadong Mu

ICCV 2025 • poster • arXiv:2510.16457
#1710

GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation

Phillip Mueller, Talip Ünlü, Sebastian Schmidt et al.

ICCV 2025 • poster • arXiv:2510.22337
#1711

OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection

Heng Su, Mengying Xie, Nieqing Cao et al.

ICCV 2025 • poster
#1712

Arti-PG: A Toolbox for Procedurally Synthesizing Large-Scale and Diverse Articulated Objects with Rich Annotations

Jianhua Sun, Yuxuan Li, Jiude Wei et al.

ICCV 2025 • poster • arXiv:2412.14974
#1713

Scaling 3D Compositional Models for Robust Classification and Pose Estimation

Xiaoding Yuan, Prakhar Kaushik, Guofeng Zhang et al.

ICCV 2025 • poster
#1714

X-Capture: An Open-Source Portable Device for Multi-Sensory Learning

Samuel Clarke, Suzannah Wistreich, Yanjie Ze et al.

ICCV 2025 • poster • arXiv:2504.02318
#1715

DRaM-LHM: A Quaternion Framework for Iterative Camera Pose Estimation

Chen Lin, Weizhi Du, Zhixiang Min et al.

ICCV 2025 • poster
#1716

Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

Pengfei Ren, Jingyu Wang, Haifeng Sun et al.

ICCV 2025 • poster
#1717

Epipolar Consistent Attention Aggregation Network for Unsupervised Light Field Disparity Estimation

Chen Gao, Shuo Zhang, Youfang Lin

ICCV 2025 • poster
#1718

GloPER: Unsupervised Animal Pattern Extraction from Local Reconstruction

Bowen Chen, Yun Sing Koh, Gillian Dobbie

ICCV 2025 • poster
#1719

Focal Plane Visual Feature Generation and Matching on a Pixel Processor Array

Hongyi Zhang, Laurie Bose, Jianing Chen et al.

ICCV 2025 • poster
#1720

Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation

Hongyu Wen, Yiming Zuo, Venkat Subramanian et al.

ICCV 2025 • poster • arXiv:2503.11633
#1721

SpatialTrackerV2: Advancing 3D Point Tracking with Explicit Camera Motion

Yuxi Xiao, Jianyuan Wang, Nan Xue et al.

ICCV 2025 • poster
#1722

A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks

Qi Bi, Jingjun Yi, Huimin Huang et al.

ICCV 2025 • poster
#1723

IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation

Wenxuan Guo, Xiuwei Xu, Hang Yin et al.

ICCV 2025 • poster • arXiv:2508.00823
#1724

AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning

Dejie Yang, Zijing Zhao, Yang Liu

ICCV 2025 • poster • arXiv:2508.07626
#1725

Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection

Jae Young Kang, Hoonhee Cho, Kuk-Jin Yoon

ICCV 2025 • poster • arXiv:2508.02288
#1726

PlaneRAS: Learning Planar Primitives for 3D Plane Recovery

Fang Zhang, Wenzhao Zheng, Linqing Zhao et al.

ICCV 2025 • poster
#1727

3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark

Wufei Ma, Haoyu Chen, Guofeng Zhang et al.

ICCV 2025 • poster • arXiv:2412.07825
#1728

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Xuying Zhang, Yutong Liu, Yangguang Li et al.

ICCV 2025 • poster • arXiv:2412.16919
#1729

Exploring the Adversarial Vulnerabilities of Vision-Language-Action Models in Robotics

Taowen Wang, Cheng Han, James Liang et al.

ICCV 2025 • poster • arXiv:2411.13587
#1730

Simultaneous Motion And Noise Estimation with Event Cameras

Shintaro Shiba, Yoshimitsu Aoki, Guillermo Gallego

ICCV 2025 • poster • arXiv:2504.04029
#1731

Weakly-Supervised Learning of Dense Functional Correspondences

Stefan Stojanov, Linan Zhao, Yunzhi Zhang et al.

ICCV 2025 • poster • arXiv:2509.03893
#1732

Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs

Xuange Zhang, Dengjie Li, Bo Liu et al.

ICCV 2025 • poster
#1733

StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth

Zheng Zhang, Lihe Yang, Tianyu Yang et al.

ICCV 2025 • highlight
#1734

4DSegStreamer: Streaming 4D Panoptic Segmentation via Dual Threads

Ling Liu, Jun Tian, Li Yi

ICCV 2025 • poster • arXiv:2510.17664
#1735

HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation

Yulin Wang, Mengting Hu, Hongli Li et al.

ICCV 2025 • highlight • arXiv:2510.10177
#1736

GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting

Andrew Bond, Jui-Hsien Wang, Long Mai et al.

ICCV 2025 • poster • arXiv:2501.04782
#1737

CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers

Dimitrios Mallis, Ahmet Karadeniz, Sebastian Cavada et al.

ICCV 2025 • poster • arXiv:2412.13810
#1738

Exploring View Consistency for Scene-Adaptive Low-Light Light Field Image Enhancement

Shuo Zhang, Chen Gao, Youfang Lin

ICCV 2025 • highlight
#1739

VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions

Yash Garg, Saketh Bachu, Arindam Dutta et al.

ICCV 2025 • poster • arXiv:2508.06757
#1740

Tracking Tiny Drones against Clutter: Large-Scale Infrared Benchmark with Motion-Centric Adaptive Algorithm

Jiahao Zhang, Zongli Jiang, Gang Wang et al.

ICCV 2025 • poster
#1741

MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs

Erik Daxberger, Nina Wenzel, David Griffiths et al.

ICCV 2025 • poster • arXiv:2503.13111
#1742

AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs

Yi-Ting Shen, Sungmin Eum, Doheon Lee et al.

ICCV 2025 • poster • arXiv:2503.22884
#1743

Understanding Flatness in Generative Models: Its Role and Benefits

Taehwan Lee, Kyeongkook Seo, Jaejun Yoo et al.

ICCV 2025 • poster • arXiv:2503.11078
#1744

Image-Guided Shape-from-Template Using Mesh Inextensibility Constraints

Dinh-Vinh-Thuy Tran, Ruochen Chen, Shaifali Parashar

ICCV 2025 • poster • arXiv:2507.22699
#1745

PHD: Personalized 3D Human Body Fitting with Point Diffusion

Hsuan-I Ho, Chen Guo, Po-Chen Wu et al.

ICCV 2025 • poster • arXiv:2508.21257
#1746

Frequency Domain-Based Diffusion Model for Unpaired Image Dehazing

Chengxu Liu, Lu Qi, Jinshan Pan et al.

ICCV 2025 • poster • arXiv:2507.01275
#1747

ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion

Ao Li, Jinpeng Liu, Yixuan Zhu et al.

ICCV 2025 • poster • arXiv:2509.07920
#1748

MonoSOWA: Scalable monocular 3D Object detector Without human Annotations

Jan Skvrna, Lukas Neumann

ICCV 2025 • poster • arXiv:2501.09481
#1749

Estimating 2D Camera Motion with Hybrid Motion Basis

Haipeng Li, Tianhao Zhou, Zhanglei Yang et al.

ICCV 2025 • poster • arXiv:2507.22480
#1750

H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction

Heng Jia, Na Zhao, Linchao Zhu

ICCV 2025 • poster • arXiv:2508.03118
#1751

From Abyssal Darkness to Blinding Glare: A Benchmark on Extreme Exposure Correction in Real World

Bo Wang, Huiyuan Fu, Zhiye Huang et al.

ICCV 2025 • poster
#1752

TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras

Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski

ICCV 2025 • poster • arXiv:2508.00913
#1753

Find Any Part in 3D

Ziqi Ma, Yisong Yue, Georgia Gkioxari

ICCV 2025 • highlight • arXiv:2411.13550
#1754

Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion

Junru Lin, Chirag Vashist, Mikaela Uy et al.

ICCV 2025 • poster • arXiv:2508.20136
#1755

SpikeDiff: Zero-shot High-Quality Video Reconstruction from Chromatic Spike Camera and Sub-millisecond Spike Streams

Siqi Yang, Jinxiu Liang, Zhaojun Huang et al.

ICCV 2025 • poster
#1756

AJAHR: Amputated Joint Aware 3D Human Mesh Recovery

Hyunjin Cho, Giyun Choi, Jongwon Choi

ICCV 2025 • poster • arXiv:2509.19939
#1757

EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks

Athinoulla Konstantinou, Georgios Leontidis, Mamatha Thota et al.

ICCV 2025 • poster • arXiv:2506.09895
#1758

Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras

Shuang Guo, Friedhelm Hamann, Guillermo Gallego

ICCV 2025 • highlight • arXiv:2503.17262
#1759

6DOPE-GS: Online 6D Object Pose Estimation using Gaussian Splatting

Yufeng Jin, Vignesh Prasad, Snehal Jauhri et al.

ICCV 2025 • poster • arXiv:2412.01543
#1760

Background Invariance Testing According to Semantic Proximity

Zukang Liao, Min Chen

ICCV 2025 • poster • arXiv:2208.09286
#1761

One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images

Byeongjun Kwon, Munchurl Kim

ICCV 2025 • poster • arXiv:2503.22351
#1762

Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision

Xiao Fang, Minhyek Jeon, Zheyang Qin et al.

ICCV 2025 • poster • arXiv:2507.20976
#1763

RegGS: Unposed Sparse Views Gaussian Splatting with 3DGS Registration

Chong Cheng, Yu Hu, Sicheng Yu et al.

ICCV 2025 • poster • arXiv:2507.08136
#1764

CObL: Toward Zero-Shot Ordinal Layering without User Prompting

Aneel Damaraju, Dean Hazineh, Todd Zickler

ICCV 2025 • highlight • arXiv:2508.08498
#1765

Hierarchical Material Recognition from Local Appearance

Matthew Beveridge, Shree Nayar

ICCV 2025 • highlight • arXiv:2505.22911
#1766

TopicGeo: An Efficient Unified Framework for Geolocation

Xin Wang, Xinlin Wang, Shuiping Gou

ICCV 2025 • poster
#1767

Revisiting Image Fusion for Multi-Illuminant White-Balance Correction

David Serrano, Aditya Arora, Luis Herranz et al.

ICCV 2025 • poster • arXiv:2503.14774
#1768

Partially Matching Submap Helps: Uncertainty Modeling and Propagation for Text to Point Cloud Localization

Mingtao Feng, Longlong Mei, Zijie Wu et al.

ICCV 2025 • poster
#1769

Medical World Model

Yijun Yang, Zhao-Yang Wang, Qiuping Liu et al.

ICCV 2025 • poster
#1770

MaskHand: Generative Masked Modeling for Robust Hand Mesh Reconstruction in the Wild

Muhammad Usama Saleem, Ekkasit Pinyoanuntapong, Mayur Patel et al.

ICCV 2025 • poster • arXiv:2412.13393
#1771

Uncertainty-Aware Gradient Stabilization for Small Object Detection

Huixin Sun, Yanjing Li, Linlin Yang et al.

ICCV 2025 • poster • arXiv:2303.01803
#1772

CryoFastAR: Fast Cryo-EM Ab initio Reconstruction Made Easy

Jiakai Zhang, Shouchen Zhou, Haizhao Dai et al.

ICCV 2025 • poster • arXiv:2506.05864
#1773

Beyond Pixel Uncertainty: Bounding the OoD Objects in Road Scenes

Huachao Zhu, Zelong Liu, Zhichao Sun et al.

ICCV 2025 • poster
#1774

Event-guided Unified Framework for Low-light Video Enhancement, Frame Interpolation, and Deblurring

Taewoo Kim, Kuk-Jin Yoon

ICCV 2025 • poster
#1775

PS-Mamba: Spatial-Temporal Graph Mamba for Pose Sequence Refinement

Haoye Dong, Gim Hee Lee

ICCV 2025 • poster
#1776

Spatial Alignment and Temporal Matching Adapter for Video-Radar Remote Physiological Measurement

Qian Liang, Ruixu Geng, Jinbo Chen et al.

ICCV 2025 • poster
#1777

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Yusuke Hirota, Ryo Hachiuma, Boyi Li et al.

ICCV 2025 • poster • arXiv:2509.07596
#1778

AGO: Adaptive Grounding for Open World 3D Occupancy Prediction

Peizheng Li, Shuxiao Ding, You Zhou et al.

ICCV 2025 • poster • arXiv:2504.10117
#1779

Environment-Agnostic Pose: Generating Environment-independent Object Representations for 6D Pose Estimation

Shaobo Zhang, Yuhang Huang, Wanqing Zhao et al.

ICCV 2025 • poster
#1780

Online Dense Point Tracking with Streaming Memory

Qiaole Dong, Yanwei Fu

ICCV 2025 • poster • arXiv:2503.06471
#1781

MaGS: Reconstructing and Simulating Dynamic 3D Objects with Mesh-adsorbed Gaussian Splatting

Shaojie Ma, Yawei Luo, Wei Yang et al.

ICCV 2025 • highlight • arXiv:2406.01593
#1782

CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector

Abhinav Kumar, Yuliang Guo, Zhihao Zhang et al.

ICCV 2025 • poster • arXiv:2508.11185
#1783

Test-Time Retrieval-Augmented Adaptation for Vision-Language Models

Xinqi Fan, Xueli Chen, Luoxiao Yang et al.

ICCV 2025 • poster
#1784

RnGCam: High-speed video from rolling & global shutter measurements

Kevin Tandi, Xiang Dai, Chinmay Talegaonkar et al.

ICCV 2025 • poster • arXiv:2509.18087
#1785

Self-Supervised Monocular 4D Scene Reconstruction for Egocentric Videos

Chengbo Yuan, Geng Chen, Li Yi et al.

ICCV 2025 • poster • arXiv:2411.09145
#1786

MixRI: Mixing Features of Reference Images for Novel Object Pose Estimation

Xinhang Liu, Jiawei Shi, Zheng Dang et al.

ICCV 2025 • poster • arXiv:2601.06883
#1787

ReassembleNet: Learnable Keypoints and Diffusion for 2D Fresco Reconstruction

Adeela Islam, Stefano Fiorini, Stuart James et al.

ICCV 2025 • poster • arXiv:2505.21117
#1788

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

Zizhang Li, Hong-Xing Yu, Wei Liu et al.

ICCV 2025 • highlight • arXiv:2505.18151
#1789

Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

Kaixuan Jiang, Yang Liu, Weixing Chen et al.

ICCV 2025 • poster • arXiv:2503.11117
#1790

Not all Views are Created Equal: Analyzing Viewpoint Instabilities in Vision Foundation Models

Mateusz Michalkiewicz, Xinyue Bai, Mahsa Baktashmotlagh et al.

ICCV 2025 • poster • arXiv:2412.19920
#1791

CHROME: Clothed Human Reconstruction with Occlusion-Resilience and Multiview-Consistency from a Single Image

Arindam Dutta, Meng Zheng, Zhongpai Gao et al.

ICCV 2025 • highlight • arXiv:2503.15671
#1792

ReCoT: Reflective Self-Correction Training for Mitigating Confirmation Bias in Large Vision-Language Models

Mengxue Qu, Yibo Hu, Kunyang Han et al.

ICCV 2025 • poster
#1793

GenHaze: Pioneering Controllable One-Step Realistic Haze Generation for Real-World Dehazing

Sixiang Chen, Tian Ye, Yunlong Lin et al.

ICCV 2025 • poster
#1794

OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration

Yiming Zuo, Willow Yang, Zeyu Ma et al.

ICCV 2025 • poster • arXiv:2411.19278
#1795

GECO: Geometrically Consistent Embedding with Lightspeed Inference

Regine Hartwig, Dominik Muhle, Riccardo Marin et al.

ICCV 2025 • poster • arXiv:2508.00746
#1796

Dream-to-Recon: Monocular 3D Reconstruction with Diffusion-Depth Distillation from Single Images

Philipp Wulff, Felix Wimbauer, Dominik Muhle et al.

ICCV 2025 • poster • arXiv:2508.02323
#1797

LocalDyGS: Multi-view Global Dynamic Scene Modeling via Adaptive Local Implicit Feature Decoupling

Jiahao Wu, Rui Peng, Jianbo Jiao et al.

ICCV 2025 • poster • arXiv:2507.02363
#1798

Combinative Matching for Geometric Shape Assembly

Nahyuk Lee, Juhong Min, Junhong Lee et al.

ICCV 2025 • highlight • arXiv:2508.09780
#1799

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

Yihan Cao, Jiazhao Zhang, Zhinan Yu et al.

ICCV 2025 • poster • arXiv:2412.10439
#1800

TAPNext: Tracking Any Point (TAP) as Next Token Prediction

Artem Zholus, Carl Doersch, Yi Yang et al.

ICCV 2025 • poster • arXiv:2504.05579