Most Cited ICCV "training trajectory" Papers

2,701 papers found • Page 9 of 14

#1601

UniGS: Modeling Unitary 3D Gaussians for Novel View Synthesis from Sparse-view Images

Jiamin Wu, Kenkun Liu, Xiaoke Jiang et al.

ICCV 2025 • arXiv:2410.13195 • 1 citation
#1602

UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields

Fabian Perez, Sara Rojas Martinez, Carlos Hinojosa et al.

ICCV 2025 • arXiv:2506.21884 • 1 citation
#1603

Efficient Spiking Point Mamba for Point Cloud Analysis

Peixi Wu, Bosong Chai, Menghua Zheng et al.

ICCV 2025 • arXiv:2504.14371 • 1 citation
#1604

Visual Surface Wave Elastography: Revealing Subsurface Physical Properties via Visible Surface Waves

Alexander Ogren, Berthy Feng, Jihoon Ahn et al.

ICCV 2025 • arXiv:2507.09207 • 1 citation
#1605

PolarAnything: Diffusion-based Polarimetric Image Synthesis

Kailong Zhang, Youwei Lyu, Heng Guo et al.

ICCV 2025 • highlight • arXiv:2507.17268 • 1 citation
#1606

MergeOcc: Bridge the Domain Gap between Different LiDARs for Robust Occupancy Prediction

Zikun Xu, Shaobing Xu

ICCV 2025 • arXiv:2403.08512 • 1 citation
#1607

LoD-Loc v2: Aerial Visual Localization over Low Level-of-Detail City Models using Explicit Silhouette Alignment

Juelin Zhu, Shuaibang Peng, Long Wang et al.

ICCV 2025 • arXiv:2507.00659 • 1 citation
#1608

LANGTRAJ: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation

Wei-Jer Chang, Masayoshi Tomizuka, Wei Zhan et al.

ICCV 2025 • arXiv:2504.11521 • 1 citation
#1609

ACE-G: Improving Generalization of Scene Coordinate Regression Through Query Pre-Training

Leonard Bruns, Axel Barroso-Laguna, Tommaso Cavallari et al.

ICCV 2025 • arXiv:2510.11605 • 1 citation
#1610

Adversarial Exploitation of Data Diversity Improves Visual Localization

Sihang Li, Siqi Tan, Bowen Chang et al.

ICCV 2025 • arXiv:2412.00138 • 1 citation
#1611

AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion

Liuyue Xie, Jiancong Guo, Ozan Cakmakci et al.

ICCV 2025 • arXiv:2503.21581 • 1 citation
#1612

SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching

Xiangzeng Liu, Chi Wang, Guanglu Shi et al.

ICCV 2025 • highlight • arXiv:2508.02278 • 1 citation
#1613

Egocentric Action-aware Inertial Localization in Point Clouds with Vision-Language Guidance

Mingfang Zhang, Ryo Yonetani, Yifei Huang et al.

ICCV 2025 • arXiv:2505.14346 • 1 citation
#1614

Coordinate-based Speed of Sound Recovery for Aberration-Corrected Photoacoustic Computed Tomography

Tianao Li, Manxiu Cui, Cheng Ma et al.

ICCV 2025 • arXiv:2409.10876 • 1 citation
#1615

Purge-Gate: Efficient Backpropagation-Free Test-Time Adaptation for Point Clouds via Token purging

Moslem Yazdanpanah, Ali Bahri, Mehrdad Noori et al.

ICCV 2025 • 1 citation
#1616

CF3: Compact and Fast 3D Feature Fields

Hyunjoon Lee, Joonkyu Min, Jaesik Park

ICCV 2025 • arXiv:2508.05254 • 1 citation
#1617

ToF-Splatting: Dense SLAM using Sparse Time-of-Flight Depth and Multi-Frame Integration

Andrea Conti, Matteo Poggi, Valerio Cambareri et al.

ICCV 2025 • arXiv:2504.16545 • 1 citation
#1618

Unsupervised Imaging Inverse Problems with Diffusion Distribution Matching

Giacomo Meanti, Thomas Ryckeboer, Michael Arbel et al.

ICCV 2025 • arXiv:2506.14605 • 1 citation
#1619

Perspective-aware 3D Gaussian Inpainting with Multi-view Consistency

Yuxin Cheng, Binxiao Huang, Taiqiang Wu et al.

ICCV 2025 • arXiv:2510.10993 • 1 citation
#1620

Interaction-Merged Motion Planning: Effectively Leveraging Diverse Motion Datasets for Robust Planning

Giwon Lee, Wooseong Jeong, Daehee Park et al.

ICCV 2025 • highlight • arXiv:2507.04790 • 1 citation
#1621

DONUT: A Decoder-Only Model for Trajectory Prediction

Markus Knoche, Daan de Geus, Bastian Leibe

ICCV 2025 • arXiv:2506.06854 • 1 citation
#1622

PanoSplatt3R: Leveraging Perspective Pretraining for Generalized Unposed Wide-Baseline Panorama Reconstruction

Jiahui Ren, Mochu Xiang, Jiajun Zhu et al.

ICCV 2025 • arXiv:2507.21960 • 1 citation
#1623

Variance-Based Pruning for Accelerating and Compressing Trained Networks

Uranik Berisha, Jens Mehnert, Alexandru Condurache

ICCV 2025 • arXiv:2507.12988 • 1 citation
#1624

Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)

Lennart Bastian, Mohammad Rashed, Nassir Navab et al.

ICCV 2025 • arXiv:2508.07775 • 1 citation
#1625

Certifiably Optimal Anisotropic Rotation Averaging

Carl Olsson, Yaroslava Lochman, Johan Malmport et al.

ICCV 2025 • arXiv:2503.07353 • 1 citation
#1626

MIORe & VAR-MIORe: Benchmarks to Push the Boundaries of Restoration

George Ciubotariu, Zhuyun Zhou, Zongwei Wu et al.

ICCV 2025 • arXiv:2509.06803 • 1 citation
#1627

E-SAM: Training-Free Segment Every Entity Model

Weiming Zhang, Dingwen Xiao, Lei Chen et al.

ICCV 2025 • arXiv:2503.12094 • 1 citation
#1628

Towards Foundational Models for Single-Chip Radar

Tianshu Huang, Akarsh Prabhakara, Chuhan Chen et al.

ICCV 2025 • arXiv:2509.12482 • 1 citation
#1629

Understanding Museum Exhibits using Vision-Language Reasoning

Ada-Astrid Balauca, Sanjana Garai, Stefan Balauca et al.

ICCV 2025 • arXiv:2412.01370 • 1 citation
#1630

Gradient Extrapolation for Debiased Representation Learning

Ihab Asaad, Maha Shadaydeh, Joachim Denzler

ICCV 2025 • arXiv:2503.13236 • 1 citation
#1631

InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow

Yiming Gong, Zhen Zhu, Minjia Zhang

ICCV 2025 • arXiv:2508.06033 • 1 citation
#1632

MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization

Hyung Kyu Kim, Sangmin Lee, Hak Gu Kim

ICCV 2025 • arXiv:2507.20562 • 1 citation
#1633

PARTE: Part-Guided Texturing for 3D Human Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Gyeongsik Moon et al.

ICCV 2025 • arXiv:2507.17332 • 1 citation
#1634

LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching

Meng Tian, Shuo Yang, Xinxiao Wu

ICCV 2025 • arXiv:2506.23502 • 1 citation
#1635

ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer

Jin Hu, Mingjia Li, Xiaojie Guo

ICCV 2025 • arXiv:2412.02545 • 1 citation
#1636

Easy3D: A Simple Yet Effective Method for 3D Interactive Segmentation

Andrea Simonelli, Norman Müller, Peter Kontschieder

ICCV 2025 • arXiv:2504.11024 • 1 citation
#1637

GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields

Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue et al.

ICCV 2025 • arXiv:2506.23352 • 1 citation
#1638

Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation

Luca Bartolomei, Enrico Mannocci, Fabio Tosi et al.

ICCV 2025 • arXiv:2509.15224 • 1 citation
#1639

WIPES: Wavelet-based Visual Primitives

Wenhao Zhang, Hao Zhu, Delong Wu et al.

ICCV 2025 • arXiv:2508.12615 • 1 citation
#1640

Fuse Before Transfer: Knowledge Fusion for Heterogeneous Distillation

Guopeng Li, Qiang Wang, Ke Yan et al.

ICCV 2025 • arXiv:2410.12342 • 1 citation
#1641

Memory-Efficient Generative Models via Product Quantization

Jie Shao, Hanxiao Zhang, Hao Yu et al.

ICCV 2025 • 1 citation
#1642

Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation

Jiahua Dong, Hui Yin, Wenqi Liang et al.

ICCV 2025 • arXiv:2508.08612 • 1 citation
#1643

RAGD: Regional-Aware Diffusion Model for Text-to-Image Generation

Chen Zhennan, Yajie Li, Haofan Wang et al.

ICCV 2025 • 1 citation
#1644

Domain Generalizable Portrait Style Transfer

Xinbo Wang, Wenju Xu, Qing Zhang et al.

ICCV 2025 • arXiv:2507.04243 • 1 citation
#1645

VQ-SGen: A Vector Quantized Stroke Representation for Creative Sketch Generation

Jiawei Wang, Zhiming Cui, Changjian Li

ICCV 2025 • arXiv:2411.16446 • 1 citation
#1646

G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion

Mengdi Liu, Zhangyang Gao, Hong Chang et al.

ICCV 2025 • arXiv:2502.04684 • 1 citation
#1647

Task-Specific Zero-shot Quantization-Aware Training for Object Detection

Changhao Li, Xinrui Chen, Ji Wang et al.

ICCV 2025 • arXiv:2507.16782 • 1 citation
#1648

DexH2R: A Benchmark for Dynamic Dexterous Grasping in Human-to-Robot Handover

Youzhuo Wang, Jiayi Ye, Chuyang Xiao et al.

ICCV 2025 • arXiv:2506.23152 • 1 citation
#1649

Latent Expression Generation for Referring Image Segmentation and Grounding

Seonghoon Yu, Junbeom Hong, Joonseok Lee et al.

ICCV 2025 • arXiv:2508.05123 • 1 citation
#1650

BASIC: Boosting Visual Alignment with Intrinsic Refined Embeddings in Multimodal Large Language Models

Jianting Tang, Yubo Wang, Haoyu Cao et al.

ICCV 2025 • arXiv:2508.06895 • 1 citation
#1651

Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework

Yi-Ting Chen, Ting-Hsuan Liao, Pengsheng Guo et al.

ICCV 2025 • arXiv:2508.04090 • 1 citation
#1652

Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation

Shengfang Zhai, Jiajun Li, Yue Liu et al.

ICCV 2025 • highlight • arXiv:2503.06453 • 1 citation
#1653

MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning

Tianhong Gao, Yannian Fu, Weiqun Wu et al.

ICCV 2025 • arXiv:2507.21924 • 1 citation
#1654

SALAD -- Semantics-Aware Logical Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skočaj

ICCV 2025 • arXiv:2509.02101 • 1 citation
#1655

Visual Relation Diffusion for Human-Object Interaction Detection

Ping Cao, Yepeng Tang, Chunjie Zhang et al.

ICCV 2025 • 1 citation
#1656

WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction

Richard Liu, Daniel Fu, Noah Tan et al.

ICCV 2025 • arXiv:2505.04813 • 1 citation
#1657

GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation

Ye Tao, Jiawei Zhang, Yahao Shi et al.

ICCV 2025 • arXiv:2503.06136 • 1 citation
#1658

METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models

Yuchen Liu, Yaoming Wang, Bowen Shi et al.

ICCV 2025 • arXiv:2507.20842 • 1 citation
#1659

PixelStitch: Structure-Preserving Pixel-Wise Bidirectional Warps for Unsupervised Image Stitching

Hengzhe Jin, Lang Nie, Chunyu Lin et al.

ICCV 2025 • 1 citation
#1660

S³E: Self-Supervised State Estimation for Radar-Inertial System

Shengpeng Wang, Yulong Xie, Qing Liao et al.

ICCV 2025 • arXiv:2509.25984 • 1 citation
#1661

Video Color Grading via Look-Up Table Generation

Seunghyun Shin, Dongmin Shin, Jisu Shin et al.

ICCV 2025 • arXiv:2508.00548 • 1 citation
#1662

Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting

Yuekun Dai, Haitian Li, Shangchen Zhou et al.

ICCV 2025 • arXiv:2508.01098 • 1 citation
#1663

IGD: Instructional Graphic Design with Multimodal Layer Generation

Yadong Qu, Shancheng Fang, Yuxin Wang et al.

ICCV 2025 • arXiv:2507.09910 • 1 citation
#1664

Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

Imad Eddine Marouf, Enzo Tartaglione, Stéphane Lathuilière et al.

ICCV 2025 • arXiv:2502.04469 • 1 citation
#1665

Benefit From Seen: Enhancing Open-Vocabulary Object Detection by Bridging Visual and Textual Co-Occurrence Knowledge

Yanqi Li, Jianwei Niu, Tao Ren

ICCV 2025 • 1 citation
#1666

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Chunxiao Li, Xiaoxiao Wang, Meiling Li et al.

ICCV 2025 • arXiv:2509.09172 • 1 citation
#1667

You Share Beliefs, I Adapt: Progressive Heterogeneous Collaborative Perception

Hao Si, Ehsan Javanmardi, Manabu Tsukada

ICCV 2025 • arXiv:2509.09310 • 1 citation
#1668

Robust Unfolding Network for HDR Imaging with Modulo Cameras

Zhile Chen, Hui Ji

ICCV 2025 • 1 citation
#1669

Embodied Navigation with Auxiliary Task of Action Description Prediction

Haru Kondoh, Asako Kanezaki

ICCV 2025 • arXiv:2510.21809 • 1 citation
#1670

IAP: Invisible Adversarial Patch Attack through Perceptibility-Aware Localization and Perturbation Optimization

Subrat Kishore Dutta, Xiao Zhang

ICCV 2025 • arXiv:2507.06856 • 1 citation
#1671

OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving

Mingqian Ji, Jian Yang, Shanshan Zhang

ICCV 2025 • arXiv:2506.23565 • 1 citation
#1672

Neural Compression for 3D Geometry Sets

Siyu Ren, Junhui Hou, Weiyao Lin et al.

ICCV 2025 • arXiv:2405.15034 • 1 citation
#1673

DAP-MAE: Domain-Adaptive Point Cloud Masked Autoencoder for Effective Cross-Domain Learning

Ziqi Gao, Qiufu Li, Linlin Shen

ICCV 2025 • highlight • arXiv:2510.21635 • 1 citation
#1674

UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Dayong Su, Yafei Zhang, Huafeng Li et al.

ICCV 2025 • arXiv:2506.22736 • 1 citation
#1675

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025 • arXiv:2507.15249 • 1 citation
#1676

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

Zhixiang Guo, Siyuan Liang, Aishan Liu et al.

ICCV 2025 • arXiv:2412.01528 • 1 citation
#1677

Dataset Ownership Verification for Pre-trained Masked Models

Yuechen Xie, Jie Song, Yicheng Shan et al.

ICCV 2025 • arXiv:2507.12022 • 1 citation
#1678

From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning

Pengkun Jiao, Bin Zhu, Jingjing Chen et al.

ICCV 2025 • arXiv:2411.12787 • 1 citation
#1679

ClearSight: Human Vision-Inspired Solutions for Event-Based Motion Deblurring

Xiaopeng Lin, Yulong Huang, Hongwei Ren et al.

ICCV 2025 • arXiv:2501.15808 • 1 citation
#1680

UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions

Siyuan Yao, Rui Zhu, Ziqi Wang et al.

ICCV 2025 • arXiv:2507.00648 • 1 citation
#1681

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Xiwei Xuan, Ziquan Deng, Kwan-Liu Ma

ICCV 2025 • highlight • arXiv:2506.21233 • 1 citation
#1682

PCR-GS: COLMAP-Free 3D Gaussian Splatting via Pose Co-Regularizations

Yu Wei, Jiahui Zhang, Xiaoqin Zhang et al.

ICCV 2025 • arXiv:2507.13891 • 1 citation
#1683

Membership Inference Attacks with False Discovery Rate Control

Chenxu Zhao, Wei Qian, Aobo Chen et al.

ICCV 2025 • arXiv:2508.07066 • 1 citation
#1684

Blind Video Super-Resolution based on Implicit Kernels

Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.

ICCV 2025 • arXiv:2503.07856 • 1 citation
#1685

OmniDiff: A Comprehensive Benchmark for Fine-grained Image Difference Captioning

Yuan Liu, Saihui Hou, Saijie Hou et al.

ICCV 2025 • arXiv:2503.11093 • 1 citation
#1686

PLMP - Point-Line Minimal Problems for Projective SfM

Kim Kiehn, Albin Ahlbäck, Kathlén Kohn

ICCV 2025 • highlight • arXiv:2503.04351 • 1 citation
#1687

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition

Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.

ICCV 2025 • arXiv:2503.15986 • 1 citation
#1688

SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference

Samir Khaki, Junxian Guo, Jiaming Tang et al.

ICCV 2025 • arXiv:2510.17777 • 1 citation
#1689

Decoding Correlation-Induced Misalignment in the Stable Diffusion Workflow for Text-to-Image Generation

Yunze Tong, Fengda Zhang, Didi Zhu et al.

ICCV 2025 • 1 citation
#1690

Steering Guidance for Personalized Text-to-Image Diffusion Models

Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.

ICCV 2025 • arXiv:2508.00319 • 1 citation
#1691

M-SpecGene: Generalized Foundation Model for RGBT Multispectral Vision

Kailai Zhou, Fuqiang Yang, Shixian Wang et al.

ICCV 2025 • arXiv:2507.16318 • 1 citation
#1692

MeshMamba: State Space Models for Articulated 3D Mesh Generation and Reconstruction

Yusuke Yoshiyasu, Leyuan Sun, Ryusuke Sagawa

ICCV 2025 • arXiv:2507.15212 • 1 citation
#1693

Learning Robust Image Watermarking with Lossless Cover Recovery

Jiale Chen, Wei Wang, Chongyang Shi et al.

ICCV 2025 • 1 citation
#1694

Seeing Through Deepfakes: A Human-Inspired Framework for Multi-Face Detection

Juan Hu, Shaojing Fan, Terence Sim

ICCV 2025 • arXiv:2507.14807 • 1 citation
#1695

Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images

Shunya Nagashima, Komei Sugiura

ICCV 2025 • arXiv:2508.07847 • 1 citation
#1696

Inference-Time Diffusion Model Distillation

Geon Yeong Park, Sang Wan Lee, Jong Ye

ICCV 2025 • arXiv:2412.08871 • 1 citation
#1697

Fusion Meets Diverse Conditions: A High-diversity Benchmark and Baseline for UAV-based Multimodal Object Detection with Condition Cues

Chen Chen, Kangcheng Bin, Hu Ting et al.

ICCV 2025 • arXiv:2510.13620 • 1 citation
#1698

Rethinking Detecting Salient and Camouflaged Objects in Unconstrained Scenes

Zhangjun Zhou, Yiping Li, Chunlin Zhong et al.

ICCV 2025 • arXiv:2412.10943 • 1 citation
#1699

Guiding Noisy Label Conditional Diffusion Models with Score-based Discriminator Correction

Dat Cong, Hieu Tran, Hoang Thanh-Tung

ICCV 2025 • arXiv:2508.19581 • 1 citation
#1700

TrackAny3D: Transferring Pretrained 3D Models for Category-unified 3D Point Cloud Tracking

Mengmeng Wang, Haonan Wang, Yulong Li et al.

ICCV 2025 • arXiv:2507.19908 • 1 citation
#1701

FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers

Junjie Zhang, Haisheng Su, Feixiang Song et al.

ICCV 2025 • arXiv:2510.15385 • 1 citation
#1702

Enhancing Numerical Prediction of MLLMs with Soft Labeling

Pei Wang, Zhaowei Cai, Hao Yang et al.

ICCV 2025 • 1 citation
#1703

MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Yikun Ma, Yiqing Li, Jiawei Wu et al.

ICCV 2025 • arXiv:2503.17695 • 1 citation
#1704

FPEM: Face Prior Enhanced Facial Attractiveness Prediction for Live Videos with Face Retouching

Hui Li, Xiaoyu Ren, Hongjiu Yu et al.

ICCV 2025 • highlight • 1 citation
#1705

STD-GS: Exploring Frame-Event Interaction for SpatioTemporal-Disentangled Gaussian Splatting to Reconstruct High-Dynamic Scene

Hanyu Zhou, Haonan Wang, Haoyue Liu et al.

ICCV 2025 • arXiv:2506.23157 • 1 citation
#1706

RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning

Chengyu Zheng, Honghua Chen, Jin Huang et al.

ICCV 2025 • arXiv:2507.19950 • 1 citation
#1707

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Kaiyu Yue, Vasu Singla, Menglin Jia et al.

ICCV 2025 • arXiv:2505.22664 • 1 citation
#1708

OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection

Adrian Chow, Evelien Riddell, Yimu Wang et al.

ICCV 2025 • arXiv:2503.06435 • 1 citation
#1709

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning

Mingqi Yuan, Bo Li, Xin Jin et al.

ICCV 2025 • arXiv:2503.06101 • 1 citation
#1710

ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction

Soonwoo Cha, Jiwoo Song, Juan Yeo et al.

ICCV 2025 • arXiv:2506.08678 • 1 citation
#1711

A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions

Youliang Zhang, Ronghui Li, Yachao Zhang et al.

ICCV 2025 • highlight • arXiv:2412.17377 • 1 citation
#1712

ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads

Yifan Li, Xin Li, Tianqin Li et al.

ICCV 2025 • arXiv:2506.03433 • 1 citation
#1713

When Confidence Fails: Revisiting Pseudo-Label Selection in Semi-supervised Semantic Segmentation

Pan Liu, Jinshi Liu

ICCV 2025 • highlight • arXiv:2509.16704 • 1 citation
#1714

Unlearning the Noisy Correspondence Makes CLIP More Robust

Haochen Han, Alex Jinpeng Wang, Peijun Ye et al.

ICCV 2025 • arXiv:2507.03434 • 1 citation
#1715

Global-Aware Monocular Semantic Scene Completion with State Space Models

Shijie Li, Zhongyao Cheng, Rong Li et al.

ICCV 2025 • arXiv:2503.06569 • 1 citation
#1716

DIMO: Diverse 3D Motion Generation for Arbitrary Objects

Linzhan Mou, Jiahui Lei, Chen Wang et al.

ICCV 2025 • highlight • arXiv:2511.07409 • 1 citation
#1717

Beyond Blur: A Fluid Perspective on Generative Diffusion Models

Grzegorz Gruszczynski, Jakub Meixner, Michał Włodarczyk et al.

ICCV 2025 • arXiv:2506.16827 • 1 citation
#1718

Revisiting Adversarial Patch Defenses on Object Detectors: Unified Evaluation, Large-Scale Dataset, and New Insights

Junhao Zheng, Jiahao Sun, Chenhao Lin et al.

ICCV 2025 • arXiv:2508.00649 • 1 citation
#1719

AIM: Amending Inherent Interpretability via Self-Supervised Masking

Eyad Alshami, Shashank Agnihotri, Bernt Schiele et al.

ICCV 2025 • highlight • arXiv:2508.11502 • 1 citation
#1720

One Last Attention for Your Vision-Language Model

Liang Chen, Ghazi Shazan Ahmad, Tianjun Yao et al.

ICCV 2025 • arXiv:2507.15480 • 1 citation
#1721

Text-IRSTD: Leveraging Semantic Text to Promote Infrared Small Target Detection in Complex Scenes

Feng Huang, Shuyuan Zheng, Zhaobing Qiu et al.

ICCV 2025 • arXiv:2503.07249 • 1 citation
#1722

Balancing Conservatism and Aggressiveness: Prototype-Affinity Hybrid Network for Few-Shot Segmentation

Tianyu Zou, Shengwu Xiong, Ruilin Yao et al.

ICCV 2025 • arXiv:2507.19140 • 1 citation
#1723

MCOP: Multi-UAV Collaborative Occupancy Prediction

Zefu Lin, Wenbo Chen, Xiaojuan Jin et al.

ICCV 2025 • arXiv:2510.12679 • 1 citation
#1724

Serialization based Point Cloud Oversegmentation

Chenghui Lu, Dilong Li, Jianlong Kwan et al.

ICCV 2025 • 1 citation
#1725

Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen et al.

ICCV 2025 • arXiv:2506.21037 • 1 citation
#1726

Recognizing Actions from Robotic View for Natural Human-Robot Interaction

Ziyi Wang, Peiming Li, Hong Liu et al.

ICCV 2025 • arXiv:2507.22522 • 1 citation
#1727

DDB: Diffusion Driven Balancing to Address Spurious Correlations

Aryan Yazdan Parast, Basim Azam, Naveed Akhtar

ICCV 2025 • arXiv:2503.17226 • 1 citation
#1728

TurboVSR: Fantastic Video Upscalers and Where to Find Them

Zhongdao Wang, Guodongfang Zhao, Jingjing Ren et al.

ICCV 2025 • highlight • arXiv:2506.23618 • 1 citation
#1729

FRET: Feature Redundancy Elimination for Test Time Adaptation

Linjing You, Jiabao Lu, Xiayuan Huang et al.

ICCV 2025 • arXiv:2505.10641 • 1 citation
#1730

SPA: Efficient User-Preference Alignment against Uncertainty in Medical Image Segmentation

Jiayuan Zhu, Junde Wu, Cheng Ouyang et al.

ICCV 2025 • arXiv:2411.15513 • 1 citation
#1731

Controllable and Expressive One-Shot Video Head Swapping

Chaonan Ji, Jinwei Qi, Peng Zhang et al.

ICCV 2025 • arXiv:2506.16852 • 1 citation
#1732

Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space

Yingping Liang, Yutao Hu, Wenqi Shao et al.

ICCV 2025 • arXiv:2507.00392 • 1 citation
#1733

Seeing 3D Through 2D Lenses: 3D Few-Shot Class-Incremental Learning via Cross-Modal Geometric Rectification

Tuo Xiang, Xuemiao Xu, Bangzhen Liu et al.

ICCV 2025 • arXiv:2509.14958 • 1 citation
#1734

RayGaussX: Accelerating Gaussian-Based Ray Marching for Real-Time and High-Quality Novel View Synthesis

Hugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic

ICCV 2025 • arXiv:2509.07782 • 1 citation
#1735

Learning Pixel-adaptive Multi-layer Perceptrons for Real-time Image Enhancement

Junyu Lou, Xiaorui Zhao, Kexuan Shi et al.

ICCV 2025 • arXiv:2507.12135 • 1 citation
#1736

CULTURE3D: A Large-Scale and Diverse Dataset of Cultural Landmarks and Terrains for Gaussian-Based Scene Rendering

Xinyi Zheng, Steve Zhang, Weizhe Lin et al.

ICCV 2025 • arXiv:2501.06927 • 1 citation
#1737

Information-Bottleneck Driven Binary Neural Network for Change Detection

Kaijie Yin, Zhiyuan Zhang, Shu Kong et al.

ICCV 2025 • arXiv:2507.03504 • 1 citation
#1738

VisHall3D: Monocular Semantic Scene Completion from Reconstructing the Visible Regions to Hallucinating the Invisible Regions

Haoang Lu, Yuanqi Su, Xiaoning Zhang et al.

ICCV 2025 • arXiv:2507.19188 • 1 citation
#1739

Evidential Knowledge Distillation

Liangyu Xiang, Junyu Gao, Changsheng Xu

ICCV 2025 • arXiv:2507.18366 • 1 citation
#1740

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Wei Suo, Ji Ma, Mengyang Sun et al.

ICCV 2025 • arXiv:2412.06458 • 1 citation
#1741

DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions

Hengyuan Zhang, Zhe Li, Xingqun Qi et al.

ICCV 2025 • arXiv:2508.17342 • 1 citation
#1742

TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity

Yuzhuo Chen, Zehua Ma, Han Fang et al.

ICCV 2025 • arXiv:2506.23484 • 1 citation
#1743

Diffusion-based 3D Hand Motion Recovery with Intuitive Physics

Yufei Zhang, Zijun Cui, Jeffrey Kephart et al.

ICCV 2025 • arXiv:2508.01835 • 1 citation
#1744

Language Decoupling with Fine-grained Knowledge Guidance for Referring Multi-object Tracking

Guangyao Li, Siping Zhuang, Yajun Jian et al.

ICCV 2025 • 1 citation
#1745

Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy

Yaxin Xiao, Qingqing Ye, Li Hu et al.

ICCV 2025 • arXiv:2507.20573 • 1 citation
#1746

Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration

Shihao Zhou, Dayu Li, Jinshan Pan et al.

ICCV 2025 • arXiv:2503.20174 • 1 citation
#1747

EMatch: A Unified Framework for Event-based Optical Flow and Stereo Matching

Pengjie Zhang, Lin Zhu, Xiao Wang et al.

ICCV 2025 • arXiv:2407.21735 • 1 citation
#1748

CanFields: Consolidating Diffeomorphic Flows for Non-Rigid 4D Interpolation from Arbitrary-Length Sequences

Miaowei Wang, Changjian Li, Amir Vaxman

ICCV 2025 • arXiv:2406.18582 • 1 citation
#1749

QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

Jiahui Yang, Yongjia Ma, Donglin Di et al.

ICCV 2025 • arXiv:2507.04599 • 1 citation
#1750

Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation

Tim Elsner, Paula Usinger, Julius Nehring-Wirxel et al.

ICCV 2025 • arXiv:2411.10281 • 1 citation
#1751

PoseAnchor: Robust Root Position Estimation for 3D Human Pose Estimation

Jun-Hee Kim, Jumin Han, Seong-Whan Lee

ICCV 2025 • 1 citation
#1752

Self-Supervised Sparse Sensor Fusion for Long Range Perception

Edoardo Palladin, Samuel Brucker, Filippo Ghilotti et al.

ICCV 2025 • arXiv:2508.13995 • 1 citation
#1753

Implicit Counterfactual Learning for Audio-Visual Segmentation

Mingfeng Zha, Tianyu Li, Guoqing Wang et al.

ICCV 2025 • arXiv:2507.20740 • 1 citation
#1754

Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification

Daqian Shi, Xiaolei Diao, Xu Chen et al.

ICCV 2025 • arXiv:2506.23285 • 1 citation
#1755

AIComposer: Any Style and Content Image Composition via Feature Integration

Haowen Li, Zhenfeng Fan, Zhang Wen et al.

ICCV 2025 • arXiv:2507.20721 • 1 citation
#1756

Rethink Sparse Signals for Pose-guided Text-to-image Generation

Wenjie Xuan, Jing Zhang, Juhua Liu et al.

ICCV 2025 • arXiv:2506.20983 • 1 citation
#1757

Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions

Tommaso Galliena, Tommaso Apicella, Stefano Rosa et al.

ICCV 2025 • highlight • arXiv:2504.08531 • 1 citation
#1758

Single-Scanline Relative Pose Estimation for Rolling Shutter Cameras

Petr Hruby, Marc Pollefeys

ICCV 2025 • arXiv:2506.22069 • 1 citation
#1759

OURO: A Self-Bootstrapped Framework for Enhancing Multimodal Scene Understanding

Tianrun Xu, Guanyu Chen, Ye Li et al.

ICCV 2025 • 1 citation
#1760

ResidualViT for Efficient Temporally Dense Video Encoding

Mattia Soldan, Fabian Caba Heilbron, Bernard Ghanem et al.

ICCV 2025 • highlight • arXiv:2509.13255 • 1 citation
#1761

Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes

Chuyan Zhang, Kefan Wang, Yun Gu

ICCV 2025 • arXiv:2507.00327 • 1 citation
#1762

MUG: Pseudo Labeling Augmented Audio-Visual Mamba Network for Audio-Visual Video Parsing

Langyu Wang, Yingying Chen et al.

ICCV 2025 • arXiv:2507.01384 • 1 citation
#1763

Progressive Homeostatic and Plastic Prompt Tuning for Audio-Visual Multi-Task Incremental Learning

Jiong Yin, Liang Li, Jiehua Zhang et al.

ICCV 2025 • arXiv:2507.21588 • 1 citation
#1764

AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts

Yufan Liu, Wanqian Zhang, Huashan Chen et al.

ICCV 2025 • arXiv:2510.24034 • 1 citation
#1765

FreeDance: Towards Harmonic Free-Number Group Dance Generation via a Unified Framework

Yiwen Zhao, Yang Wang, Liting Wen et al.

ICCV 2025 • 1 citation
#1766

LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection

Wei Liao, Chunyan Xu, Chenxu Wang et al.

ICCV 2025 • arXiv:2509.16970 • 1 citation
#1767

BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation

Yuanhong Yu, Xingyi He, Chen Zhao et al.

ICCV 2025 • arXiv:2504.07955 • 1 citation
#1768

SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement

Liwen Xiao, Zhiyu Pan, Zhicheng Wang et al.

ICCV 2025 • highlight • arXiv:2507.04263 • 1 citation
#1769

Seam360GS: Seamless 360° Gaussian Splatting from Real-World Omnidirectional Images

Changha Shin, Woong Oh Cho, Seon Joo Kim

ICCV 2025 • arXiv:2508.20080 • 1 citation
#1770

SpikePack: Enhanced Information Flow in Spiking Neural Networks with High Hardware Compatibility

Guobin Shen, Jindong Li, Tenglong Li et al.

ICCV 2025 • arXiv:2501.14484 • 1 citation
#1771

FA: Forced Prompt Learning of Vision-Language Models for Out-of-Distribution Detection

Xinhua Lu, Runhe Lai, Yanqi Wu et al.

ICCV 2025 • arXiv:2507.04511 • 1 citation
#1772

ARMO: Autoregressive Rigging for Multi-Category Objects

Mingze Sun, Shiwei Mao, Keyi Chen et al.

ICCV 2025 • arXiv:2503.20663 • 1 citation
#1773

SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM

Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger

ICCV 2025 • highlight • arXiv:2504.00139 • 1 citation
#1774

Mind the Cost of Scaffold! Benign Clients May Even Become Accomplices of Backdoor Attack

Xingshuo Han, Xuanye Zhang, Xiang Lan et al.

ICCV 2025 • arXiv:2411.16167 • 1 citation
#1775

BlinkTrack: Feature Tracking over 80 FPS via Events and Images

Yichen Shen, Yijin Li, Shuo Chen et al.

ICCV 2025 • arXiv:2409.17981 • 1 citation
#1776

DICE: Staleness-Centric Optimizations for Parallel Diffusion MoE Inference

Jiajun Luo, Lizhuo Luo, Jianru Xu et al.

ICCV 2025 • 1 citation
#1777

Measuring the Impact of Rotation Equivariance on Aerial Object Detection

Xiuyu Wu, Xinhao Wang, Xiubin Zhu et al.

ICCV 2025 • arXiv:2507.09896 • 1 citation
#1778

Wasserstein Style Distribution Analysis and Transform for Stylized Image Generation

Xi Yu, Xiang Gu, Zhihao Shi et al.

ICCV 2025 • highlight • 1 citation
#1779

Visual Intention Grounding for Egocentric Assistants

Pengzhan Sun, Junbin Xiao, Tze Ho Elden Tse et al.

ICCV 2025 • arXiv:2504.13621 • 1 citation
#1780

MVQA: Mamba with Unified Sampling for Efficient Video Quality Assessment

Yachun Mi, Yu Li, Weicheng Meng et al.

ICCV 2025 • highlight • arXiv:2504.16003 • 1 citation
#1781

Breaking Grid Constraints: Dynamic Graph Reconstruction Network for Multi-organ Segmentation

Junhao Xiao, Yang Wei, Jingyu Wang et al.

ICCV 2025
#1782

Prototype-based Contrastive Learning with Stage-wise Progressive Augmentation for Self-Supervised Fine-Grained Learning

BaoFeng Tan, Xiu-Shen Wei, Lin Zhao

ICCV 2025
#1783

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.

ICCV 2025 • highlight • arXiv:2510.17023
#1784

Region-aware Anchoring Mechanism for Efficient Referring Visual Grounding

Shuyi Ouyang, Ziwei Niu, Hongyi Wang et al.

ICCV 2025
#1785

CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Feixiang Wang, Shuang Yang, Shiguang Shan et al.

ICCV 2025
#1786

Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal

Yitong Jiang, Jinwei Gu, Tianfan Xue et al.

ICCV 2025 • highlight
#1787

Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior

Young Seok Jeon, Hongfei Yang, Huazhu Fu et al.

ICCV 2025 • arXiv:2403.18878
#1788

EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

Haokai Zhu, Bo Qu, Si-Yuan Cao et al.

ICCV 2025 • arXiv:2509.07662
#1789

Enhancing Mamba Decoder with Bidirectional Interaction in Multi-Task Dense Prediction

Mang Cao, Sanping Zhou, Yizhe Li et al.

ICCV 2025 • arXiv:2508.20376
#1790

Leveraging Debiased Cross-modal Attention Maps and Code-based Reasoning for Zero-shot Referring Expression Comprehension

Juntao Chen, Wen Shen, Zhihua Wei et al.

ICCV 2025
#1791

UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling

Peiming Li, Ziyi Wang, Yulin Yuan et al.

ICCV 2025 • arXiv:2508.14604
#1792

Vision-Language Neural Graph Featurization for Extracting Retinal Lesions

Taimur Hassan, Anabia Sohail, Muzammal Naseer et al.

ICCV 2025
#1793

SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models

Sudong Wang, Yunjian Zhang, Yao Zhu et al.

ICCV 2025
#1794

Flow-MIL: Constructing Highly-expressive Latent Feature Space For Whole Slide Image Classification Using Normalizing Flow

Yingfan Ma, Bohan An, Ao Shen et al.

ICCV 2025
#1795

Towards Robustness of Person Search against Corruptions

Woojung Son, Yoonki Cho, Guoyuan An et al.

ICCV 2025
#1796

VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification

Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.

ICCV 2025
#1797

Engage for All: Making Ordinary Image Descriptions Appealing Again!

Yuyan Chen, Yifan Jiang, Li Zhou et al.

ICCV 2025
#1798

Automated Red Teaming for Text-to-Image Models through Feedback-Guided Prompt Iteration with Vision-Language Models

Wei Xu, Kangjie Chen, Jiawei Qiu et al.

ICCV 2025
#1799

Omni-scene Perception-oriented Point Cloud Geometry Enhancement for Coordinate Quantization

Wang Liu, Wei Gao

ICCV 2025
#1800

Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation

Zhenhua Ning, Zhuotao Tian, Shaoshuai Shi et al.

ICCV 2025 • arXiv:2506.23120