Most Cited ICCV "training trajectory" Papers

2,701 papers found • Page 9 of 14

Filters:Most Cited ICCV training trajectory Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#1601

UniGS: Modeling Unitary 3D Gaussians for Novel View Synthesis from Sparse-view Images

Jiamin WU, Kenkun Liu, Xiaoke Jiang et al.

ICCV 2025arXiv:2410.13195

citations

#1602

UnMix-NeRF: Spectral Unmixing Meets Neural Radiance Fields

Fabian Perez, Sara Rojas Martinez, Carlos Hinojosa et al.

ICCV 2025arXiv:2506.21884

citations

#1603

Efficient Spiking Point Mamba for Point Cloud Analysis

Peixi Wu, Bosong Chai, Menghua Zheng et al.

ICCV 2025arXiv:2504.14371

citations

#1604

Visual Surface Wave Elastography: Revealing Subsurface Physical Properties via Visible Surface Waves

Alexander Ogren, Berthy Feng, Jihoon Ahn et al.

ICCV 2025arXiv:2507.09207

citations

#1605

PolarAnything: Diffusion-based Polarimetric Image Synthesis

Kailong Zhang, Youwei Lyu, Heng Guo et al.

ICCV 2025highlightarXiv:2507.17268

citations

#1606

MergeOcc: Bridge the Domain Gap between Different LiDARs for Robust Occupancy Prediction

Zikun Xu, Shaobing Xu

ICCV 2025arXiv:2403.08512

citations

#1607

LoD-Loc v2: Aerial Visual Localization over Low Level-of-Detail City Models using Explicit Silhouette Alignment

Juelin Zhu, Shuaibang Peng, Long Wang et al.

ICCV 2025arXiv:2507.00659

citations

#1608

LANGTRAJ: Diffusion Model and Dataset for Language-Conditioned Trajectory Simulation

WEI-JER Chang, Masayoshi Tomizuka, Wei Zhan et al.

ICCV 2025arXiv:2504.11521

citations

#1609

ACE-G: Improving Generalization of Scene Coordinate Regression Through Query Pre-Training

Leonard Bruns, Axel Barroso-Laguna, Tommaso Cavallari et al.

ICCV 2025arXiv:2510.11605

citations

#1610

Adversarial Exploitation of Data Diversity Improves Visual Localization

Sihang Li, Siqi Tan, Bowen Chang et al.

ICCV 2025arXiv:2412.00138

citations

#1611

AlignDiff: Learning Physically-Grounded Camera Alignment via Diffusion

Liuyue Xie, Jiancong Guo, Ozan Cakmakci et al.

ICCV 2025arXiv:2503.21581

citations

#1612

SGAD: Semantic and Geometric-aware Descriptor for Local Feature Matching

Xiangzeng Liu, CHI WANG, Guanglu Shi et al.

ICCV 2025highlightarXiv:2508.02278

citations

#1613

Egocentric Action-aware Inertial Localization in Point Clouds with Vision-Language Guidance

Mingfang Zhang, Ryo Yonetani, Yifei Huang et al.

ICCV 2025arXiv:2505.14346

citations

#1614

Coordinate-based Speed of Sound Recovery for Aberration-Corrected Photoacoustic Computed Tomography

Tianao Li, Manxiu Cui, Cheng Ma et al.

ICCV 2025arXiv:2409.10876

citations

#1615

Purge-Gate: Efficient Backpropagation-Free Test-Time Adaptation for Point Clouds via Token purging

Moslem Yazdanpanah, Ali Bahri, Mehrdad Noori et al.

ICCV 2025

citations

#1616

CF3: Compact and Fast 3D Feature Fields

Hyunjoon Lee, Joonkyu Min, Jaesik Park

ICCV 2025arXiv:2508.05254

citations

#1617

ToF-Splatting: Dense SLAM using Sparse Time-of-Flight Depth and Multi-Frame Integration

Andrea Conti, Matteo Poggi, Valerio Cambareri et al.

ICCV 2025arXiv:2504.16545

citations

#1618

Unsupervised Imaging Inverse Problems with Diffusion Distribution Matching

Giacomo Meanti, Thomas Ryckeboer, Michael Arbel et al.

ICCV 2025arXiv:2506.14605

citations

#1619

Perspective-aware 3D Gaussian Inpainting with Multi-view Consistency

Yuxin CHENG, Binxiao Huang, Taiqiang Wu et al.

ICCV 2025arXiv:2510.10993

citations

#1620

Interaction-Merged Motion Planning: Effectively Leveraging Diverse Motion Datasets for Robust Planning

Giwon Lee, Wooseong Jeong, Daehee Park et al.

ICCV 2025highlightarXiv:2507.04790

citations

#1621

DONUT: A Decoder-Only Model for Trajectory Prediction

Markus Knoche, Daan de Geus, Bastian Leibe

ICCV 2025arXiv:2506.06854

citations

#1622

PanoSplatt3R: Leveraging Perspective Pretraining for Generalized Unposed Wide-Baseline Panorama Reconstruction

Jiahui Ren, Mochu Xiang, Jiajun Zhu et al.

ICCV 2025arXiv:2507.21960

citations

#1623

Variance-Based Pruning for Accelerating and Compressing Trained Networks

Uranik Berisha, Jens Mehnert, Alexandru Condurache

ICCV 2025arXiv:2507.12988

citations

#1624

Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)

Lennart Bastian, Mohammad Rashed, Nassir Navab et al.

ICCV 2025arXiv:2508.07775

citations

#1625

Certifiably Optimal Anisotropic Rotation Averaging

Carl Olsson, Yaroslava Lochman, Johan Malmport et al.

ICCV 2025arXiv:2503.07353

citations

#1626

MIORe & VAR-MIORe: Benchmarks to Push the Boundaries of Restoration

George Ciubotariu, Zhuyun Zhou, Zongwei Wu et al.

ICCV 2025arXiv:2509.06803

citations

#1627

E-SAM: Training-Free Segment Every Entity Model

WEIMING ZHANG, Dingwen Xiao, Lei Chen et al.

ICCV 2025arXiv:2503.12094

citations

#1628

Towards Foundational Models for Single-Chip Radar

Tianshu Huang, Akarsh Prabhakara, Chuhan Chen et al.

ICCV 2025arXiv:2509.12482

citations

#1629

Understanding Museum Exhibits using Vision-Language Reasoning

Ada-Astrid Balauca, Sanjana Garai, Stefan Balauca et al.

ICCV 2025arXiv:2412.01370

citations

#1630

Gradient Extrapolation for Debiased Representation Learning

Ihab Asaad, Maha Shadaydeh, Joachim Denzler

ICCV 2025arXiv:2503.13236

citations

#1631

InstantEdit: Text-Guided Few-Step Image Editing with Piecewise Rectified Flow

Yiming Gong, Zhen Zhu, Minjia Zhang

ICCV 2025arXiv:2508.06033

citations

#1632

MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization

Hyung Kyu Kim, Sangmin Lee, HAK GU KIM

ICCV 2025arXiv:2507.20562

citations

#1633

PARTE: Part-Guided Texturing for 3D Human Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Gyeongsik Moon et al.

ICCV 2025arXiv:2507.17332

citations

#1634

LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching

Meng Tian, Shuo Yang, Xinxiao Wu

ICCV 2025arXiv:2506.23502

citations

#1635

ShadowHack: Hacking Shadows via Luminance-Color Divide and Conquer

Jin Hu, Mingjia Li, Xiaojie Guo

ICCV 2025arXiv:2412.02545

citations

#1636

Easy3D: A Simple Yet Effective Method for 3D Interactive Segmentation

Andrea Simonelli, Norman Müller, Peter Kontschieder

ICCV 2025arXiv:2504.11024

citations

#1637

GeoProg3D: Compositional Visual Reasoning for City-Scale 3D Language Fields

Shunsuke Yasuki, Taiki Miyanishi, Nakamasa Inoue et al.

ICCV 2025arXiv:2506.23352

citations

#1638

Depth AnyEvent: A Cross-Modal Distillation Paradigm for Event-Based Monocular Depth Estimation

Luca Bartolomei, Enrico Mannocci, Fabio Tosi et al.

ICCV 2025arXiv:2509.15224

citations

#1639

WIPES: Wavelet-based Visual Primitives

Wenhao Zhang, Hao Zhu, Delong Wu et al.

ICCV 2025arXiv:2508.12615

citations

#1640

Fuse Before Transfer: Knowledge Fusion for Heterogeneous Distillation

Guopeng Li, Qiang Wang, Ke Yan et al.

ICCV 2025arXiv:2410.12342

citations

#1641

Memory-Efficient Generative Models via Product Quantization

Jie Shao, Hanxiao Zhang, Hao Yu et al.

ICCV 2025

citations

#1642

Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation

Jiahua Dong, Hui Yin, Wenqi Liang et al.

ICCV 2025arXiv:2508.08612

citations

#1643

RAGD: Regional-Aware Diffusion Model for Text-to-Image Generation

Chen Zhennan, Yajie Li, Haofan Wang et al.

ICCV 2025

citations

#1644

Domain Generalizable Portrait Style Transfer

Xinbo Wang, Wenju Xu, Qing Zhang et al.

ICCV 2025arXiv:2507.04243

citations

#1645

VQ-SGen: A Vector Quantized Stroke Representation for Creative Sketch Generation

Jiawei Wang, Zhiming Cui, Changjian Li

ICCV 2025arXiv:2411.16446

citations

#1646

G2PDiffusion: Cross-species Genotype-to-Phenotype Prediction via Evolutionary Diffusion

Mengdi Liu, Zhangyang Gao, Hong Chang et al.

ICCV 2025arXiv:2502.04684

citations

#1647

Task-Specific Zero-shot Quantization-Aware Training for Object Detection

Changhao Li, Xinrui Chen, Ji Wang et al.

ICCV 2025arXiv:2507.16782

citations

#1648

DexH2R: A Benchmark for Dynamic Dexterous Grasping in Human-to-Robot Handover

Youzhuo Wang, jiayi ye, Chuyang Xiao et al.

ICCV 2025arXiv:2506.23152

citations

#1649

Latent Expression Generation for Referring Image Segmentation and Grounding

Seonghoon Yu, Junbeom Hong, Joonseok Lee et al.

ICCV 2025arXiv:2508.05123

citations

#1650

BASIC: Boosting Visual Alignment with Intrinsic Refined Embeddings in Multimodal Large Language Models

Jianting Tang, Yubo Wang, Haoyu Cao et al.

ICCV 2025arXiv:2508.06895

citations

#1651

Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework

Yi-Ting Chen, Ting-Hsuan Liao, Pengsheng Guo et al.

ICCV 2025arXiv:2508.04090

citations

#1652

Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation

Shengfang ZHAI, Jiajun Li, Yue Liu et al.

ICCV 2025highlightarXiv:2503.06453

citations

#1653

MMAT-1M: A Large Reasoning Dataset for Multimodal Agent Tuning

Tianhong Gao, Yannian Fu, Weiqun Wu et al.

ICCV 2025arXiv:2507.21924

citations

#1654

SALAD -- Semantics-Aware Logical Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ICCV 2025arXiv:2509.02101

citations

#1655

Visual Relation Diffusion for Human-Object Interaction Detection

Ping Cao, Yepeng Tang, Chunjie Zhang et al.

ICCV 2025

citations

#1656

WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction

Richard Liu, Daniel Fu, Noah Tan et al.

ICCV 2025arXiv:2505.04813

citations

#1657

GSV3D: Gaussian Splatting-based Geometric Distillation with Stable Video Diffusion for Single-Image 3D Object Generation

Ye Tao, jiawei zhang, Yahao Shi et al.

ICCV 2025arXiv:2503.06136

citations

#1658

METEOR: Multi-Encoder Collaborative Token Pruning for Efficient Vision Language Models

Yuchen Liu, Yaoming Wang, Bowen Shi et al.

ICCV 2025arXiv:2507.20842

citations

#1659

PixelStitch: Structure-Preserving Pixel-Wise Bidirectional Warps for Unsupervised Image Stitching

Hengzhe Jin, Lang Nie, Chunyu Lin et al.

ICCV 2025

citations

#1660

S$^3$E: Self-Supervised State Estimation for Radar-Inertial System

Shengpeng Wang, Yulong Xie, Qing Liao et al.

ICCV 2025arXiv:2509.25984

citations

#1661

Video Color Grading via Look-Up Table Generation

Seunghyun Shin, Dongmin Shin, Jisu Shin et al.

ICCV 2025arXiv:2508.00548

citations

#1662

Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting

Yuekun Dai, Haitian Li, Shangchen Zhou et al.

ICCV 2025arXiv:2508.01098

citations

#1663

IGD: Instructional Graphic Design with Multimodal Layer Generation

Yadong Qu, Shancheng Fang, Yuxin Wang et al.

ICCV 2025arXiv:2507.09910

citations

#1664

Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

Imad Eddine MAROUF, Enzo Tartaglione, Stéphane Lathuilière et al.

ICCV 2025arXiv:2502.04469

citations

#1665

Benefit From Seen: Enhancing Open-Vocabulary Object Detection by Bridging Visual and Textual Co-Occurrence Knowledge

Yanqi Li, Jianwei Niu, Tao Ren

ICCV 2025

citations

#1666

Bridging the Gap Between Ideal and Real-world Evaluation: Benchmarking AI-Generated Image Detection in Challenging Scenarios

Chunxiao Li, Xiaoxiao Wang, Meiling Li et al.

ICCV 2025arXiv:2509.09172

citations

#1667

You Share Beliefs, I Adapt: Progressive Heterogeneous Collaborative Perception

hao si, Ehsan Javanmardi, Manabu Tsukada

ICCV 2025arXiv:2509.09310

citations

#1668

Robust Unfolding Network for HDR Imaging with Modulo Cameras

Zhile Chen, Hui Ji

ICCV 2025

citations

#1669

Embodied Navigation with Auxiliary Task of Action Description Prediction

Haru Kondoh, Asako Kanezaki

ICCV 2025arXiv:2510.21809

citations

#1670

IAP: Invisible Adversarial Patch Attack through Perceptibility-Aware Localization and Perturbation Optimization

Subrat Kishore Dutta, Xiao Zhang

ICCV 2025arXiv:2507.06856

citations

#1671

OcRFDet: Object-Centric Radiance Fields for Multi-View 3D Object Detection in Autonomous Driving

Mingqian Ji, Jian Yang, Shanshan Zhang

ICCV 2025arXiv:2506.23565

citations

#1672

Neural Compression for 3D Geometry Sets

Siyu Ren, Junhui Hou, Weiyao Lin et al.

ICCV 2025arXiv:2405.15034

citations

#1673

DAP-MAE: Domain-Adaptive Point Cloud Masked Autoencoder for Effective Cross-Domain Learning

Ziqi Gao, Qiufu Li, Linlin Shen

ICCV 2025highlightarXiv:2510.21635

citations

#1674

UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Dayong Su, Yafei Zhang, Huafeng Li et al.

ICCV 2025arXiv:2506.22736

citations

#1675

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025arXiv:2507.15249

citations

#1676

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

Zhixiang Guo, Siyuan Liang, Aishan Liu et al.

ICCV 2025arXiv:2412.01528

citations

#1677

Dataset Ownership Verification for Pre-trained Masked Models

Yuechen Xie, Jie Song, Yicheng Shan et al.

ICCV 2025arXiv:2507.12022

citations

#1678

From Holistic to Localized: Local Enhanced Adapters for Efficient Visual Instruction Fine-Tuning

Pengkun Jiao, Bin Zhu, Jingjing Chen et al.

ICCV 2025arXiv:2411.12787

citations

#1679

ClearSight: Human Vision-Inspired Solutions for Event-Based Motion Deblurring

Xiaopeng LIN, Yulong Huang, Hongwei Ren et al.

ICCV 2025arXiv:2501.15808

citations

#1680

UMDATrack: Unified Multi-Domain Adaptive Tracking Under Adverse Weather Conditions

Siyuan Yao, Rui Zhu, Ziqi Wang et al.

ICCV 2025arXiv:2507.00648

citations

#1681

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Xiwei Xuan, Ziquan Deng, Kwan-Liu Ma

ICCV 2025highlightarXiv:2506.21233

citations

#1682

PCR-GS: COLMAP-Free 3D Gaussian Splatting via Pose Co-Regularizations

YU WEI, Jiahui Zhang, Xiaoqin Zhang et al.

ICCV 2025arXiv:2507.13891

citations

#1683

Membership Inference Attacks with False Discovery Rate Control

Chenxu Zhao, Wei Qian, Aobo Chen et al.

ICCV 2025arXiv:2508.07066

citations

#1684

Blind Video Super-Resolution based on Implicit Kernels

Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.

ICCV 2025arXiv:2503.07856

citations

#1685

OmniDiff: A Comprehensive Benchmark for Fine-grained Image Difference Captioning

Yuan Liu, Saihui Hou, Saijie Hou et al.

ICCV 2025arXiv:2503.11093

citations

#1686

PLMP - Point-Line Minimal Problems for Projective SfM

Kim Kiehn, Albin Ahlbäck, Kathlén Kohn

ICCV 2025highlightarXiv:2503.04351

citations

#1687

SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition

Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.

ICCV 2025arXiv:2503.15986

citations

#1688

SparseVILA: Decoupling Visual Sparsity for Efficient VLM Inference

Samir Khaki, Junxian Guo, Jiaming Tang et al.

ICCV 2025arXiv:2510.17777

citations

#1689

Decoding Correlation-Induced Misalignment in the Stable Diffusion Workflow for Text-to-Image Generation

Yunze Tong, Fengda Zhang, Didi Zhu et al.

ICCV 2025

citations

#1690

Steering Guidance for Personalized Text-to-Image Diffusion Models

Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.

ICCV 2025arXiv:2508.00319

citations

#1691

M-SpecGene: Generalized Foundation Model for RGBT Multispectral Vision

Kailai Zhou, Fuqiang Yang, Shixian Wang et al.

ICCV 2025arXiv:2507.16318

citations

#1692

MeshMamba: State Space Models for Articulated 3D Mesh Generation and Reconstruction

Yusuke Yoshiyasu, Leyuan Sun, Ryusuke Sagawa

ICCV 2025arXiv:2507.15212

citations

#1693

Learning Robust Image Watermarking with Lossless Cover Recovery

jiale chen, Wei Wang, Chongyang Shi et al.

ICCV 2025

citations

#1694

Seeing Through Deepfakes: A Human-Inspired Framework for Multi-Face Detection

Juan Hu, Shaojing Fan, Terence Sim

ICCV 2025arXiv:2507.14807

citations

#1695

Deep Space Weather Model: Long-Range Solar Flare Prediction from Multi-Wavelength Images

Shunya Nagashima, Komei Sugiura

ICCV 2025arXiv:2508.07847

citations

#1696

Inference-Time Diffusion Model Distillation

Geon Yeong Park, Sang Wan Lee, Jong Ye

ICCV 2025arXiv:2412.08871

citations

#1697

Fusion Meets Diverse Conditions: A High-diversity Benchmark and Baseline for UAV-based Multimodal Object Detection with Condition Cues

Chen Chen, Kangcheng Bin, Hu Ting et al.

ICCV 2025arXiv:2510.13620

citations

#1698

Rethinking Detecting Salient and Camouflaged Objects in Unconstrained Scenes

Zhangjun Zhou, Yiping Li, Chunlin Zhong et al.

ICCV 2025arXiv:2412.10943

citations

#1699

Guiding Noisy Label Conditional Diffusion Models with Score-based Discriminator Correction

Dat Cong, Hieu Tran, Hoang Thanh-Tung

ICCV 2025arXiv:2508.19581

citations

#1700

TrackAny3D: Transferring Pretrained 3D Models for Category-unified 3D Point Cloud Tracking

Mengmeng Wang, Haonan Wang, Yulong Li et al.

ICCV 2025arXiv:2507.19908

citations

#1701

FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers

Junjie Zhang, Haisheng Su, Feixiang Song et al.

ICCV 2025arXiv:2510.15385

citations

#1702

Enhancing Numerical Prediction of MLLMs with Soft Labeling

Pei Wang, Zhaowei Cai, Hao Yang et al.

ICCV 2025

citations

#1703

MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Yikun Ma, Yiqing Li, Jiawei Wu et al.

ICCV 2025arXiv:2503.17695

citations

#1704

FPEM: Face Prior Enhanced Facial Attractiveness Prediction for Live Videos with Face Retouching

Hui Li, Xiaoyu Ren, Hongjiu Yu et al.

ICCV 2025highlight

citations

#1705

STD-GS: Exploring Frame-Event Interaction for SpatioTemporal-Disentangled Gaussian Splatting to Reconstruct High-Dynamic Scene

Hanyu Zhou, Haonan Wang, Haoyue Liu et al.

ICCV 2025arXiv:2506.23157

citations

#1706

RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning

Chengyu Zheng, Honghua Chen, Jin Huang et al.

ICCV 2025arXiv:2507.19950

citations

#1707

Zero-Shot Vision Encoder Grafting via LLM Surrogates

Kaiyu Yue, Vasu Singla, Menglin Jia et al.

ICCV 2025arXiv:2505.22664

citations

#1708

OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection

Adrian Chow, Evelien Riddell, Yimu Wang et al.

ICCV 2025arXiv:2503.06435

citations

#1709

ULTHO: Ultra-Lightweight yet Efficient Hyperparameter Optimization in Deep Reinforcement Learning

Mingqi Yuan, Bo Li, Xin Jin et al.

ICCV 2025arXiv:2503.06101

citations

#1710

ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction

Soonwoo Cha, Jiwoo Song, Juan Yeo et al.

ICCV 2025arXiv:2506.08678

citations

#1711

A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions

Youliang Zhang, Ronghui Li, Yachao Zhang et al.

ICCV 2025highlightarXiv:2412.17377

citations

#1712

ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads

Yifan Li, Xin Li, Tianqin Li et al.

ICCV 2025arXiv:2506.03433

citations

#1713

When Confidence Fails: Revisiting Pseudo-Label Selection in Semi-supervised Semantic Segmentation

Pan Liu, Jinshi Liu

ICCV 2025highlightarXiv:2509.16704

citations

#1714

Unlearning the Noisy Correspondence Makes CLIP More Robust

Haochen Han, Alex Jinpeng Wang, Peijun Ye et al.

ICCV 2025arXiv:2507.03434

citations

#1715

Global-Aware Monocular Semantic Scene Completion with State Space Models

Shijie Li, Zhongyao Cheng, Rong Li et al.

ICCV 2025arXiv:2503.06569

citations

#1716

DIMO: Diverse 3D Motion Generation for Arbitrary Objects

Linzhan Mou, Jiahui Lei, Chen Wang et al.

ICCV 2025highlightarXiv:2511.07409

citations

#1717

Beyond Blur: A Fluid Perspective on Generative Diffusion Models

Grzegorz Gruszczynski, Jakub Meixner, Michał Włodarczyk et al.

ICCV 2025arXiv:2506.16827

citations

#1718

Revisiting Adversarial Patch Defenses on Object Detectors: Unified Evaluation, Large-Scale Dataset, and New Insights

Junhao Zheng, Jiahao Sun, Chenhao Lin et al.

ICCV 2025arXiv:2508.00649

citations

#1719

AIM: Amending Inherent Interpretability via Self-Supervised Masking

Eyad Alshami, Shashank Agnihotri, Bernt Schiele et al.

ICCV 2025highlightarXiv:2508.11502

citations

#1720

One Last Attention for Your Vision-Language Model

Liang Chen, Ghazi Shazan Ahmad, Tianjun Yao et al.

ICCV 2025arXiv:2507.15480

citations

#1721

Text-IRSTD: Leveraging Semantic Text to Promote Infrared Small Target Detection in Complex Scenes

Feng Huang, Shuyuan Zheng, Zhaobing Qiu et al.

ICCV 2025arXiv:2503.07249

citations

#1722

Balancing Conservatism and Aggressiveness: Prototype-Affinity Hybrid Network for Few-Shot Segmentation

Tianyu Zou, Shengwu Xiong, Ruilin Yao et al.

ICCV 2025arXiv:2507.19140

citations

#1723

MCOP: Multi-UAV Collaborative Occupancy Prediction

Zefu Lin, Wenbo Chen, Xiaojuan Jin et al.

ICCV 2025arXiv:2510.12679

citations

#1724

Serialization based Point Cloud Oversegmentation

chenghui Lu, Dilong Li, Jianlong Kwan et al.

ICCV 2025

citations

#1725

Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen et al.

ICCV 2025arXiv:2506.21037

citations

#1726

Recognizing Actions from Robotic View for Natural Human-Robot Interaction

Ziyi Wang, Peiming Li, Hong Liu et al.

ICCV 2025arXiv:2507.22522

citations

#1727

DDB: Diffusion Driven Balancing to Address Spurious Correlations

Aryan Yazdan Parast, Basim Azam, Naveed Akhtar

ICCV 2025arXiv:2503.17226

citations

#1728

TurboVSR: Fantastic Video Upscalers and Where to Find Them

Zhongdao Wang, Guodongfang Zhao, Jingjing Ren et al.

ICCV 2025highlightarXiv:2506.23618

citations

#1729

FRET: Feature Redundancy Elimination for Test Time Adaptation

Linjing You, Jiabao Lu, Xiayuan Huang et al.

ICCV 2025arXiv:2505.10641

citations

#1730

SPA: Efficient User-Preference Alignment against Uncertainty in Medical Image Segmentation

Jiayuan Zhu, Junde Wu, Cheng Ouyang et al.

ICCV 2025arXiv:2411.15513

citations

#1731

Controllable and Expressive One-Shot Video Head Swapping

Chaonan Ji, Jinwei Qi, Peng Zhang et al.

ICCV 2025arXiv:2506.16852

citations

#1732

Learning Dense Feature Matching via Lifting Single 2D Image to 3D Space

Yingping Liang, Yutao Hu, Wenqi Shao et al.

ICCV 2025arXiv:2507.00392

citations

#1733

Seeing 3D Through 2D Lenses: 3D Few-Shot Class-Incremental Learning via Cross-Modal Geometric Rectification

Tuo Xiang, Xuemiao Xu, Bangzhen Liu et al.

ICCV 2025arXiv:2509.14958

citations

#1734

RayGaussX: Accelerating Gaussian-Based Ray Marching for Real-Time and High-Quality Novel View Synthesis

Hugo Blanc, Jean-Emmanuel Deschaud, Alexis Paljic

ICCV 2025arXiv:2509.07782

citations

#1735

Learning Pixel-adaptive Multi-layer Perceptrons for Real-time Image Enhancement

Junyu Lou, Xiaorui Zhao, Kexuan Shi et al.

ICCV 2025arXiv:2507.12135

citations

#1736

CULTURE3D: A Large-Scale and Diverse Dataset of Cultural Landmarks and Terrains for Gaussian-Based Scene Rendering

xinyi zheng, Steve Zhang, Weizhe Lin et al.

ICCV 2025arXiv:2501.06927

citations

#1737

Information-Bottleneck Driven Binary Neural Network for Change Detection

Kaijie Yin, Zhiyuan Zhang, Shu Kong et al.

ICCV 2025arXiv:2507.03504

citations

#1738

VisHall3D: Monocular Semantic Scene Completion from Reconstructing the Visible Regions to Hallucinating the Invisible Regions

Haoang Lu, Yuanqi Su, Xiaoning Zhang et al.

ICCV 2025arXiv:2507.19188

citations

#1739

Evidential Knowledge Distillation

Liangyu Xiang, Junyu Gao, Changsheng Xu

ICCV 2025arXiv:2507.18366

citations

#1740

Pruning All-Rounder: Rethinking and Improving Inference Efficiency for Large Vision Language Models

Wei Suo, Ji Ma, Mengyang Sun et al.

ICCV 2025arXiv:2412.06458

citations

#1741

DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions

Hengyuan Zhang, Zhe Li, Xingqun Qi et al.

ICCV 2025arXiv:2508.17342

citations

#1742

TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity

Yuzhuo Chen, Zehua Ma, Han Fang et al.

ICCV 2025arXiv:2506.23484

citations

#1743

Diffusion-based 3D Hand Motion Recovery with Intuitive Physics

Yufei Zhang, Zijun Cui, Jeffrey Kephart et al.

ICCV 2025arXiv:2508.01835

citations

#1744

Language Decoupling with Fine-grained Knowledge Guidance for Referring Multi-object Tracking

guangyao Li, Siping Zhuang, Yajun Jian et al.

ICCV 2025

citations

#1745

Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy

Yaxin Xiao, Qingqing Ye, Li Hu et al.

ICCV 2025arXiv:2507.20573

citations

#1746

Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration

Shihao Zhou, Dayu Li, Jinshan Pan et al.

ICCV 2025arXiv:2503.20174

citations

#1747

EMatch: A Unified Framework for Event-based Optical Flow and Stereo Matching

Pengjie Zhang, Lin Zhu, Xiao Wang et al.

ICCV 2025arXiv:2407.21735

citations

#1748

CanFields: Consolidating Diffeomorphic Flows for Non-Rigid 4D Interpolation from Arbitrary-Length Sequences

Miaowei Wang, Changjian Li, Amir Vaxman

ICCV 2025arXiv:2406.18582

citations

#1749

QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

Jiahui Yang, Yongjia Ma, Donglin Di et al.

ICCV 2025arXiv:2507.04599

citations

#1750

Multidimensional Byte Pair Encoding: Shortened Sequences for Improved Visual Data Generation

Tim Elsner, Paula Usinger, Julius Nehring-Wirxel et al.

ICCV 2025arXiv:2411.10281

citations

#1751

PoseAnchor: Robust Root Position Estimation for 3D Human Pose Estimation

Jun-Hee Kim, Jumin Han, Seong-Whan Lee

ICCV 2025

citations

#1752

Self-Supervised Sparse Sensor Fusion for Long Range Perception

Edoardo Palladin, Samuel Brucker, Filippo Ghilotti et al.

ICCV 2025arXiv:2508.13995

citations

#1753

Implicit Counterfactual Learning for Audio-Visual Segmentation

Mingfeng Zha, Tianyu Li, Guoqing Wang et al.

ICCV 2025arXiv:2507.20740

citations

#1754

Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification

Daqian Shi, Xiaolei Diao, Xu Chen et al.

ICCV 2025arXiv:2506.23285

citations

#1755

AIComposer: Any Style and Content Image Composition via Feature Integration

Haowen Li, Zhenfeng Fan, Zhang Wen et al.

ICCV 2025arXiv:2507.20721

citations

#1756

Rethink Sparse Signals for Pose-guided Text-to-image Generation

Wenjie Xuan, Jing Zhang, Juhua Liu et al.

ICCV 2025arXiv:2506.20983

citations

#1757

Embodied Image Captioning: Self-supervised Learning Agents for Spatially Coherent Image Descriptions

Tommaso Galliena, Tommaso Apicella, Stefano Rosa et al.

ICCV 2025highlightarXiv:2504.08531

citations

#1758

Single-Scanline Relative Pose Estimation for Rolling Shutter Cameras

Petr Hruby, Marc Pollefeys

ICCV 2025arXiv:2506.22069

citations

#1759

OURO: A Self-Bootstrapped Framework for Enhancing Multimodal Scene Understanding

Tianrun Xu, Guanyu Chen, Ye Li et al.

ICCV 2025

citations

#1760

ResidualViT for Efficient Temporally Dense Video Encoding

Mattia Soldan, Fabian Caba Heilbron, Bernard Ghanem et al.

ICCV 2025highlightarXiv:2509.13255

citations

#1761

Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes.

Chuyan Zhang, Kefan Wang, Yun Gu

ICCV 2025arXiv:2507.00327

citations

#1762

MUG: Pseudo Labeling Augmented Audio-Visual Mamba Network for Audio-Visual Video Parsing

Langyu Wang, Langyu Wang, Yingying Chen et al.

ICCV 2025arXiv:2507.01384

citations

#1763

Progressive Homeostatic and Plastic Prompt Tuning for Audio-Visual Multi-Task Incremental Learning

Jiong Yin, Liang Li, Jiehua Zhang et al.

ICCV 2025arXiv:2507.21588

citations

#1764

AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts

Yufan Liu, Wanqian Zhang, Huashan Chen et al.

ICCV 2025arXiv:2510.24034

citations

#1765

FreeDance: Towards Harmonic Free-Number Group Dance Generation via a Unified Framework

Yiwen Zhao, Yang Wang, Liting Wen et al.

ICCV 2025

citations

#1766

LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection

Wei Liao, Chunyan Xu, Chenxu Wang et al.

ICCV 2025arXiv:2509.16970

citations

#1767

BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation

Yuanhong Yu, Xingyi He, Chen Zhao et al.

ICCV 2025arXiv:2504.07955

citations

#1768

SRefiner: Soft-Braid Attention for Multi-Agent Trajectory Refinement

Liwen Xiao, Zhiyu Pan, Zhicheng Wang et al.

ICCV 2025highlightarXiv:2507.04263

citations

#1769

Seam360GS: Seamless 360° Gaussian Splatting from Real-World Omnidirectional Images

Changha Shin, Woong Oh Cho, Seon Joo Kim

ICCV 2025arXiv:2508.20080

citations

#1770

SpikePack: Enhanced Information Flow in Spiking Neural Networks with High Hardware Compatibility

Guobin Shen, Jindong Li, Tenglong Li et al.

ICCV 2025arXiv:2501.14484

citations

#1771

FA: Forced Prompt Learning of Vision-Language Models for Out-of-Distribution Detection

Xinhua Lu, Runhe Lai, Yanqi Wu et al.

ICCV 2025arXiv:2507.04511

citations

#1772

ARMO: Autoregressive Rigging for Multi-Category Objects

mingze sun, Shiwei Mao, Keyi Chen et al.

ICCV 2025arXiv:2503.20663

citations

#1773

SuperEvent: Cross-Modal Learning of Event-based Keypoint Detection for SLAM

Yannick Burkhardt, Simon Schaefer, Stefan Leutenegger

ICCV 2025highlightarXiv:2504.00139

citations

#1774

Mind the Cost of Scaffold! Benign Clients May Even Become Accomplices of Backdoor Attack

Xingshuo Han, Xuanye Zhang, Xiang Lan et al.

ICCV 2025arXiv:2411.16167

citations

#1775

BlinkTrack: Feature Tracking over 80 FPS via Events and Images

Yichen Shen, Yijin Li, Shuo Chen et al.

ICCV 2025arXiv:2409.17981

citations

#1776

DICE: Staleness-Centric Optimizations for Parallel Diffusion MoE Inference

Jiajun Luo, Lizhuo Luo, Jianru Xu et al.

ICCV 2025

citations

#1777

Measuring the Impact of Rotation Equivariance on Aerial Object Detection

Xiuyu Wu, Xinhao Wang, Xiubin Zhu et al.

ICCV 2025arXiv:2507.09896

citations

#1778

Wasserstein Style Distribution Analysis and Transform for Stylized Image Generation

Xi Yu, Xiang Gu, Zhihao Shi et al.

ICCV 2025highlight

citations

#1779

Visual Intention Grounding for Egocentric Assistants

Pengzhan Sun, Junbin Xiao, Tze Ho Elden Tse et al.

ICCV 2025arXiv:2504.13621

citations

#1780

MVQA: Mamba with Unified Sampling for Efficient Video Quality Assessment

Yachun Mi, Yu Li, Weicheng Meng et al.

ICCV 2025highlightarXiv:2504.16003

citations

#1781

Breaking Grid Constraints: Dynamic Graph Reconstruction Network for Multi-organ Segmentation

Junhao Xiao, Yang Wei, Jingyu Wang et al.

ICCV 2025

#1782

Prototype-based Contrastive Learning with Stage-wise Progressive Augmentation for Self-Supervised Fine-Grained Learning

BaoFeng Tan, Xiu-Shen Wei, Lin Zhao

ICCV 2025

#1783

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.

ICCV 2025highlightarXiv:2510.17023

#1784

Region-aware Anchoring Mechanism for Efficient Referring Visual Grounding

Shuyi Ouyang, Ziwei Niu, Hongyi Wang et al.

ICCV 2025

#1785

CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Feixiang Wang, Shuang Yang, Shiguang Shan et al.

ICCV 2025

#1786

Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal

Yitong Jiang, Jinwei Gu, Tianfan Xue et al.

ICCV 2025highlight

#1787

Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior

Young Seok Jeon, Hongfei Yang, Huazhu Fu et al.

ICCV 2025arXiv:2403.18878

#1788

EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

Haokai Zhu, Bo Qu, Si-Yuan Cao et al.

ICCV 2025arXiv:2509.07662

#1789

Enhancing Mamba Decoder with Bidirectional Interaction in Multi-Task Dense Prediction

Mang Cao, Sanping Zhou, Yizhe Li et al.

ICCV 2025arXiv:2508.20376

#1790

Leveraging Debiased Cross-modal Attention Maps and Code-based Reasoning for Zero-shot Referring Expression Comprehension

Juntao Chen, Wen Shen, Zhihua Wei et al.

ICCV 2025

#1791

UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling

Peiming Li, Ziyi Wang, Yulin Yuan et al.

ICCV 2025arXiv:2508.14604

#1792

Vision-Language Neural Graph Featurization for Extracting Retinal Lesions

Taimur Hassan, Anabia Sohail, Muzammal Naseer et al.

ICCV 2025

#1793

SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models

Sudong Wang, Yunjian Zhang, Yao Zhu et al.

ICCV 2025

#1794

Flow-MIL: Constructing Highly-expressive Latent Feature Space For Whole Slide Image Classification Using Normalizing Flow

Yingfan MA, Bohan An, Ao Shen et al.

ICCV 2025

#1795

Towards Robustness of Person Search against Corruptions

Woojung Son, Yoonki Cho, Guoyuan An et al.

ICCV 2025

#1796

VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification

Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.

ICCV 2025

#1797

Engage for All: Making Ordinary Image Descriptions Appealing Again!

Yuyan Chen, Yifan Jiang, Li Zhou et al.

ICCV 2025

#1798

Automated Red Teaming for Text-to-Image Models through Feedback-Guided Prompt Iteration with Vision-Language Models

Wei Xu, Kangjie Chen, Jiawei Qiu et al.

ICCV 2025

#1799

Omni-scene Perception-oriented Point Cloud Geometry Enhancement for Coordinate Quantization

Wang Liu, Wei Gao

ICCV 2025

#1800

Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation

Zhenhua Ning, Zhuotao Tian, Shaoshuai Shi et al.

ICCV 2025arXiv:2506.23120

← Previous

1...7 8 9 10 11...14