Most Cited ICCV "higher-order models" Papers

2,701 papers found • Page 4 of 14

#601

Collaborative Instance Object Navigation: Leveraging Uncertainty-Awareness to Minimize Human-Agent Dialogues

Francesco Taioli, Edoardo Zorzi, Gianni Franchi et al.

ICCV 2025 • poster • arXiv:2412.01250

#602

RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction

Yufeng Zhong, Chengjian Feng, Feng Yan et al.

ICCV 2025 • poster • arXiv:2503.18525

#603

Sparse Fine-Tuning of Transformers for Generative Tasks

Wei Chen, Jingxi Yu, Zichen Miao et al.

ICCV 2025 • poster • arXiv:2507.10855

#604

VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions

Marko Mihajlovic, Siwei Zhang, Gen Li et al.

ICCV 2025 • highlight • arXiv:2506.23236

#605

Jigsaw++: Imagining Complete Shape Priors for Object Reassembly

Jiaxin Lu, Gang Hua, Qixing Huang

ICCV 2025 • poster • arXiv:2410.11816

#606

ADIEE: Automatic Dataset Creation and Scorer for Instruction-Guided Image Editing Evaluation

Sherry Chen, Yi Wei, Luowei Zhou et al.

ICCV 2025 • poster • arXiv:2507.07317

#607

Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery

Xiao Liu, Nan Pu, Haiyang Zheng et al.

ICCV 2025 • poster • arXiv:2507.04051

#608

Kestrel: 3D Multimodal LLM for Part-Aware Grounded Description

Mahmoud Ahmed, Junjie Fei, Jian Ding et al.

ICCV 2025 • poster • arXiv:2405.18937

#609

Visual Modality Prompt for Adapting Vision-Language Object Detectors

Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.

ICCV 2025 • poster • arXiv:2412.00622

#610

SMoLoRA: Exploring and Defying Dual Catastrophic Forgetting in Continual Visual Instruction Tuning

Ziqi Wang, Chang Che, Qi Wang et al.

ICCV 2025 • poster • arXiv:2411.13949

#611

Predict-Optimize-Distill: A Self-Improving Cycle for 4D Object Understanding

Mingxuan Wu, Huang Huang, Justin Kerr et al.

ICCV 2025 • poster • arXiv:2504.17441

#612

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025 • poster • arXiv:2405.13337

#613

ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling

Jinhyung Park, Javier Romero, Shunsuke Saito et al.

ICCV 2025 • poster • arXiv:2508.15767

#614

On the Generalization of Representation Uncertainty in Earth Observation

Spyros Kondylatos, Nikolaos Ioannis Bountos, Dimitrios Michail et al.

ICCV 2025 • poster • arXiv:2503.07082

#615

LayerD: Decomposing Raster Graphic Designs into Layers

Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue et al.

ICCV 2025 • poster • arXiv:2509.25134

#616

PriOr-Flow: Enhancing Primitive Panoramic Optical Flow with Orthogonal View

Longliang Liu, Miaojie Feng, Junda Cheng et al.

ICCV 2025 • highlight • arXiv:2506.23897

#617

CAP: Evaluation of Persuasive and Creative Image Generation

Aysan Aghazadeh, Adriana Kovashka

ICCV 2025 • poster • arXiv:2412.10426

#618

VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding

Minchao Jiang, Shunyu Jia, Jiaming Gu et al.

ICCV 2025 • poster • arXiv:2506.22799

#619

TokensGen: Harnessing Condensed Tokens for Long Video Generation

Wenqi Ouyang, Zeqi Xiao, Danni Yang et al.

ICCV 2025 • poster • arXiv:2507.15728

#620

From Image to Video: An Empirical Study of Diffusion Representations

Pedro Vélez, Luisa Polania Cabrera, Yi Yang et al.

ICCV 2025 • highlight • arXiv:2502.07001

#621

PlugMark: A Plug-in Zero-Watermarking Framework for Diffusion Models

Pengzhen Chen, Yanwei Liu, Xiaoyan Gu et al.

ICCV 2025 • poster

#622

Joint Diffusion Models in Continual Learning

Paweł Skierś, Kamil Deja

ICCV 2025 • poster • arXiv:2411.08224

#623

Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement

Priyank Pathak, Yogesh Rawat

ICCV 2025 • poster • arXiv:2507.07230

#624

DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization

Aniket Roy, Shubhankar Borse, Shreya Kadambi et al.

ICCV 2025 • poster • arXiv:2504.13206

#625

Leveraging the Power of MLLMs for Gloss-Free Sign Language Translation

Jungeun Kim, Hyeongwoo Jeon, Jongseong Bae et al.

ICCV 2025 • poster • arXiv:2411.16789

#626

DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction

Rui Wang, Quentin Lohmeyer, Mirko Meboldt et al.

ICCV 2025 • poster • arXiv:2503.13176

#627

SCAN: Bootstrapping Contrastive Pre-training for Data Efficiency

Yangyang Guo, Mohan Kankanhalli

ICCV 2025 • poster • arXiv:2411.09126

#628

O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views

Lorenzo Mur-Labadia, Maria Santos-Villafranca, Jesus Bermudez-Cameo et al.

ICCV 2025 • poster • arXiv:2506.06026

#629

SceneMI: Motion In-betweening for Modeling Human-Scene Interaction

Inwoo Hwang, Bing Zhou, Young Min Kim et al.

ICCV 2025 • highlight • arXiv:2503.16289

#630

Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation

Junyu Xie, Tengda Han, Max Bain et al.

ICCV 2025 • poster • arXiv:2504.01020

#631

Pinco: Position-induced Consistent Adapter for Diffusion Transformer in Foreground-conditioned Inpainting

Guangben Lu, Yuzhen N/A, Zhimin Sun et al.

ICCV 2025 • poster • arXiv:2412.03812

#632

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu, Kai Han

ICCV 2025 • poster • arXiv:2509.04582

#633

VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges

Yuxuan Wang, Yiqi Song, Cihang Xie et al.

ICCV 2025 • poster • arXiv:2409.01071

#634

Task Vector Quantization for Memory-Efficient Model Merging

Youngeun Kim, Seunghwan Lee, Aecheon Jung et al.

ICCV 2025 • poster • arXiv:2503.06921

#635

From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection

Zexi Jia, Chuanwei Huang, Hongyan Fei et al.

ICCV 2025 • poster • arXiv:2507.04769

#636

Enhancing Few-Shot Vision-Language Classification with Large Multimodal Model Features

Chancharik Mitra, Brandon Huang, Tianning Chai et al.

ICCV 2025 • poster • arXiv:2412.00142

#637

Fair Generation without Unfair Distortions: Debiasing Text-to-Image Generation with Entanglement-Free Attention

Jeonghoon Park, Juyoung Lee, Chaeyeon Chung et al.

ICCV 2025 • poster • arXiv:2506.13298

#638

GRAB: A Challenging GRaph Analysis Benchmark for Large Multimodal Models

Jonathan Roberts, Kai Han, Samuel Albanie

ICCV 2025 • poster • arXiv:2408.11817

#639

Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations

Hai Huang, Yan Xia, Sashuai Zhou et al.

ICCV 2025 • poster • arXiv:2507.03304

#640

SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

Jiahui Geng, Qing Li

ICCV 2025 • poster • arXiv:2503.14530

#641

MP-HSIR: A Multi-Prompt Framework for Universal Hyperspectral Image Restoration

Zhehui Wu, Yong Chen, Naoto Yokoya et al.

ICCV 2025 • poster • arXiv:2503.09131

#642

PropVG: End-to-End Proposal-Driven Visual Grounding with Multi-Granularity Discrimination

Ming Dai, Wenxuan Cheng, Jiedong Zhuang et al.

ICCV 2025 • poster • arXiv:2509.04833

#643

Joint Self-Supervised Video Alignment and Action Segmentation

Ali Shah Ali, Syed Ahmed Mahmood, Mubin Saeed et al.

ICCV 2025 • poster • arXiv:2503.16832

#644

Open-ended Hierarchical Streaming Video Understanding with Vision Language Models

Hyolim Kang, Yunsu Park, Youngbeom Yoo et al.

ICCV 2025 • poster • arXiv:2509.12145

#645

Weakly Supervised Visible-Infrared Person Re-Identification via Heterogeneous Expert Collaborative Consistency Learning

Yafei Zhang, Lingqi Kong, Huafeng Li et al.

ICCV 2025 • poster • arXiv:2507.12942

#646

Monocular Semantic Scene Completion via Masked Recurrent Networks

Xuzhi Wang, Xinran Wu, Song Wang et al.

ICCV 2025 • poster • arXiv:2507.17661

#647

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving

Zixian Guo, Ming Liu, Qilong Wang et al.

ICCV 2025 • poster

#648

MVGBench: a Comprehensive Benchmark for Multi-view Generation Models

Xianghui Xie, Jan Lenssen, Gerard Pons-Moll

ICCV 2025 • poster

#649

Towards Stabilized and Efficient Diffusion Transformers through Long-Skip-Connections with Spectral Constraints

Guanjie Chen, Xinyu Zhao, Yucheng Zhou et al.

ICCV 2025 • poster • arXiv:2411.17616

#650

You Think, You ACT: The New Task of Arbitrary Text to Motion Generation

Runqi Wang, Caoyuan Ma, Guopeng Li et al.

ICCV 2025 • poster • arXiv:2404.14745

#651

INTER: Mitigating Hallucination in Large Vision-Language Models by Interaction Guidance Sampling

Xin Dong, Shichao Dong, Jin Wang et al.

ICCV 2025 • poster • arXiv:2507.05056

#652

TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions

Ilya A. Petrov, Riccardo Marin, Julian Chibane et al.

ICCV 2025 • poster • arXiv:2412.06334

#653

TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation

Zonglin Lyu, Chen Chen

ICCV 2025 • poster • arXiv:2507.04984

#654

FREE-Merging: Fourier Transform for Efficient Model Merging

Shenghe Zheng, Hongzhi Wang

ICCV 2025 • poster • arXiv:2411.16815

#655

Cross-Subject Mind Decoding from Inaccurate Representations

Yangyang Xu, Bangzhen Liu, Wenqi Shao et al.

ICCV 2025 • poster • arXiv:2507.19071

#656

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

Lukas Kuhn, Sari Sadiya, Jörg Schlötterer et al.

ICCV 2025 • poster • arXiv:2501.00942

#657

MPG-SAM 2: Adapting SAM 2 with Mask Priors and Global Context for Referring Video Object Segmentation

Fu Rong, Meng Lan, Qian Zhang et al.

ICCV 2025 • poster • arXiv:2501.13667

#658

TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis

Tri Ton, Ji Woo Hong, Chang Yoo

ICCV 2025 • poster • arXiv:2504.05684

#659

Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting

Jiaxin Huang, Sheng Miao, Bangbang Yang et al.

ICCV 2025 • poster • arXiv:2504.11092

#660

SHeaP: Self-supervised Head Geometry Predictor Learned via 2D Gaussians

Liam Schoneveld, Zhe Chen, Davide Davoli et al.

ICCV 2025 • poster • arXiv:2504.12292

#661

Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

Miroslav Purkrabek, Jiri Matas

ICCV 2025 • poster • arXiv:2412.01562

#662

Learning Streaming Video Representation via Multitask Training

Yibin Yan, Jilan Xu, Shangzhe Di et al.

ICCV 2025 • poster • arXiv:2504.20041

#663

FiffDepth: Feed-forward Transformation of Diffusion-Based Generators for Detailed Depth Estimation

Yunpeng Bai, Qixing Huang

ICCV 2025 • poster • arXiv:2412.00671

#664

Forgetting Through Transforming: Enabling Federated Unlearning via Class-Aware Representation Transformation

Qi Guo, Zhen Tian, Minghao Yao et al.

ICCV 2025 • poster • arXiv:2410.06848

#665

DMesh++: An Efficient Differentiable Mesh for Complex Shapes

Sanghyun Son, Matheus Gadelha, Yang Zhou et al.

ICCV 2025 • poster • arXiv:2412.16776

#666

What You Have is What You Track: Adaptive and Robust Multimodal Tracking

Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.

ICCV 2025 • poster • arXiv:2507.05899

#667

A Visual Leap in CLIP Compositionality Reasoning through Generation of Counterfactual Sets

Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.

ICCV 2025 • poster • arXiv:2507.04699

#668

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

ICCV 2025 • poster • arXiv:2506.23581

#669

LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling

Li Huaqiu, Yong Wang, Tongwen Huang et al.

ICCV 2025 • poster • arXiv:2507.00790

#670

Spatial-Temporal Aware Visuomotor Diffusion Policy Learning

Zhenyang Liu, Yikai Wang, Kuanning Wang et al.

ICCV 2025 • poster • arXiv:2507.06710

#671

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars

Byungjun Kim, Shunsuke Saito, Giljoo Nam et al.

ICCV 2025 • poster • arXiv:2507.19481

#672

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

Quang-Binh Nguyen, Minh Luu, Quang Nguyen et al.

ICCV 2025 • poster • arXiv:2507.13984

#673

ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

Chong Xia, Shengjun Zhang, Fangfu Liu et al.

ICCV 2025 • poster • arXiv:2507.19058

#674

Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

Xiangdong Zhang, Shaofeng Zhang, Junchi Yan

ICCV 2025 • poster • arXiv:2509.01250

#675

LINR-PCGC: Lossless Implicit Neural Representations for Point Cloud Geometry Compression

Wenjie Huang, Qi Yang, Shuting Xia et al.

ICCV 2025 • poster • arXiv:2507.15686

#676

What Changed and What Could Have Changed? State-Change Counterfactuals for Procedure-Aware Video Representation Learning

Chi-Hsi Kung, Frangil Ramirez, Juhyung Ha et al.

ICCV 2025 • poster • arXiv:2503.21055

#677

An Inversion-based Measure of Memorization for Diffusion Models

Zhe Ma, Qingming Li, Xuhong Zhang et al.

ICCV 2025 • poster • arXiv:2405.05846

#678

Demeter: A Parametric Model of Crop Plant Morphology from the Real World

Tianhang Cheng, Albert Zhai, Evan Chen et al.

ICCV 2025 • poster • arXiv:2510.16377

#679

RTMap: Real-Time Recursive Mapping with Change Detection and Localization

Yuheng Du, Sheng Yang, Lingxuan Wang et al.

ICCV 2025 • poster • arXiv:2507.00980

#680

Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads

Yingjie Zhou, Jiezhang Cao, Zicheng Zhang et al.

ICCV 2025 • poster • arXiv:2507.23343

#681

SMGDiff: Soccer Motion Generation using Diffusion Probabilistic Models

Hongdi Yang, Chengyang Li, Zhenxuan Wu et al.

ICCV 2025 • poster • arXiv:2411.16216

#682

EgoMusic-driven Human Dance Motion Estimation with Skeleton Mamba

Quang Nguyen, Nhat Le, Baoru Huang et al.

ICCV 2025 • poster • arXiv:2508.10522

#683

Efficient Autoregressive Shape Generation via Octree-Based Adaptive Tokenization

Kangle Deng, Hsueh-Ti Derek Liu, Yiheng Zhu et al.

ICCV 2025 • poster • arXiv:2504.02817

#684

You Are Your Own Best Teacher: Achieving Centralized-level Performance in Federated Learning under Heterogeneous and Long-tailed Data

Shanshan Yan, Zexi Li, Chao Wu et al.

ICCV 2025 • poster • arXiv:2503.06916

#685

Supercharging Floorplan Localization with Semantic Rays

Yuval Grader, Hadar Averbuch-Elor

ICCV 2025 • poster • arXiv:2507.09291

#686

PseudoMapTrainer: Learning Online Mapping without HD Maps

Christian Löwens, Thorben Funke, Jingchao Xie et al.

ICCV 2025 • poster • arXiv:2508.18788

#687

MOSAIC: Generating Consistent, Privacy-Preserving Scenes from Multiple Depth Views in Multi-Room Environments

Zhixuan Liu, Haokun Zhu, Rui Chen et al.

ICCV 2025 • poster • arXiv:2503.13816

#688

DIP: Unsupervised Dense In-Context Post-training of Visual Representations

Sophia Sirko-Galouchenko, Spyros Gidaris, Antonin Vobecky et al.

ICCV 2025 • poster • arXiv:2506.18463

#689

Towards Open-World Generation of Stereo Images and Unsupervised Matching

Feng Qiao, Zhexiao Xiong, Eric Xing et al.

ICCV 2025 • poster • arXiv:2503.12720

#690

Leveraging Local Patch Alignment to Seam-cutting for Large Parallax Image Stitching

Tianli Liao, Chenyang Zhao, Lei Li et al.

ICCV 2025 • poster • arXiv:2311.18564

#691

Diffusion Image Prior

Hamadi Chihaoui, Paolo Favaro

ICCV 2025 • poster • arXiv:2503.21410

#692

Improving Rectified Flow with Boundary Conditions

Xixi Hu, Runlong Liao, Bo Liu et al.

ICCV 2025 • poster • arXiv:2506.15864

#693

InstaScene: Towards Complete 3D Instance Decomposition and Reconstruction from Cluttered Scenes

Zesong Yang, Bangbang Yang, Wenqi Dong et al.

ICCV 2025 • poster • arXiv:2507.08416

#694

F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration

Lu Liu, Huiyu Duan, Qiang Hu et al.

ICCV 2025 • highlight • arXiv:2412.13155

#695

Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

Jianing Zhang, Jiayi Zhu, Feiyu Ji et al.

ICCV 2025 • highlight • arXiv:2506.22753

#696

EgoAdapt: Adaptive Multisensory Distillation and Policy Learning for Efficient Egocentric Perception

Sanjoy Chowdhury, Subrata Biswas, Sayan Nag et al.

ICCV 2025 • poster • arXiv:2506.21080

#697

FB-Diff: Fourier Basis-guided Diffusion for Temporal Interpolation of 4D Medical Imaging

Xin You, Runze Yang, Chuyan Zhang et al.

ICCV 2025 • poster • arXiv:2507.04547

#698

Noise2Score3D: Tweedie's Approach for Unsupervised Point Cloud Denoising

Xiangbin Wei, Yuanfeng Wang, Ao Xu et al.

ICCV 2025 • poster • arXiv:2503.09283

#699

OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization

Saihui Hou, Panjian Huang, Zengbin Wang et al.

ICCV 2025 • poster • arXiv:2410.00204

#700

Consensus-Driven Active Model Selection

Justin Kay, Grant Horn, Subhransu Maji et al.

ICCV 2025 • highlight • arXiv:2507.23771

#701

LookOut: Real-World Humanoid Egocentric Navigation

Boxiao Pan, Adam Harley, Francis Engelmann et al.

ICCV 2025 • poster • arXiv:2508.14466

#702

Generative Active Learning for Long-tail Trajectory Prediction via Controllable Diffusion Model

Daehee Park, Monu Surana, Pranav Desai et al.

ICCV 2025 • poster • arXiv:2507.22615

#703

Prior2Former - Evidential Modeling of Mask Transformers for Assumption-Free Open-World Panoptic Segmentation

Sebastian Schmidt, Julius Koerner, Dominik Fuchsgruber et al.

ICCV 2025 • highlight • arXiv:2504.04841

#704

CoMoGaussian: Continuous Motion-Aware Gaussian Splatting from Motion-Blurred Images

Jungho Lee, DongHyeong Kim, Dogyoon Lee et al.

ICCV 2025 • poster • arXiv:2503.05332

#705

SketchSplat: 3D Edge Reconstruction via Differentiable Multi-view Sketch Splatting

Haiyang Ying, Matthias Zwicker

ICCV 2025 • poster • arXiv:2503.14786

#706

SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation

Hao Ban, Gokul Ram Subramani, Kaiyi Ji

ICCV 2025 • poster • arXiv:2507.07883

#707

Object-level Correlation for Few-Shot Segmentation

Chunlin Wen, Yu Zhang, Jie Fan et al.

ICCV 2025 • poster • arXiv:2509.07917

#708

GCRayDiffusion: Pose-Free Surface Reconstruction via Geometric Consistent Ray Diffusion

Li-Heng Chen, Zi-Xin Zou, Chang Liu et al.

ICCV 2025 • poster • arXiv:2503.22349

#709

TAViS: Text-bridged Audio-Visual Segmentation with Foundation Models

Ziyang Luo, Nian Liu, Xuguang Yang et al.

ICCV 2025 • poster • arXiv:2506.11436

#710

Teaching VLMs to Localize Specific Objects from In-context Examples

Sivan Doveh, Nimrod Shabtay, Eli Schwartz et al.

ICCV 2025 • poster • arXiv:2411.13317

#711

Generative Adversarial Diffusion

U-Chae Jun, Jaeeun Ko, Jiwoo Kang

ICCV 2025 • poster

#712

Cross-Architecture Distillation Made Simple with Redundancy Suppression

Weijia Zhang, Yuehao Liu, Wu Ran et al.

ICCV 2025 • highlight • arXiv:2507.21844

#713

Boosting Vision Semantic Density with Anatomy Normality Modeling for Medical Vision-language Pre-training

Weiwei Cao, Jianpeng Zhang, Zhongyi Shui et al.

ICCV 2025 • poster • arXiv:2508.03742

#714

VideoAds for Fast-Paced Video Understanding

Zheyuan Zhang, Wanying Dou, Linkai Peng et al.

ICCV 2025 • poster • arXiv:2504.09282

#715

Sim-DETR: Unlock DETR for Temporal Sentence Grounding

Jiajin Tang, Zhengxuan Wei, Yuchen Zhu et al.

ICCV 2025 • poster • arXiv:2509.23867

#716

COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation

Sanghyun Jo, Seo Lee, Seungwoo Lee et al.

ICCV 2025 • poster • arXiv:2503.11439

#717

ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models

Guoyizhe Wei, Rama Chellappa

ICCV 2025 • poster • arXiv:2504.00037

#718

Stable-Sim2Real: Exploring Simulation of Real-Captured 3D Data with Two-Stage Depth Diffusion

Mutian Xu, Chongjie Ye, Haolin Liu et al.

ICCV 2025 • highlight • arXiv:2507.23483

#719

ViCTr: Vital Consistency Transfer for Pathology Aware Image Synthesis

Onkar Susladkar, Gayatri Deshmukh, Yalcin Tur et al.

ICCV 2025 • poster • arXiv:2505.04963

#720

Timestep-Aware Diffusion Model for Extreme Image Rescaling

Ce Wang, Zhenyu Hu, Wanjie Sun et al.

ICCV 2025 • poster • arXiv:2408.09151

#721

TRACE: Learning 3D Gaussian Physical Dynamics from Multi-view Videos

Jinxi Li, Ziyang Song, Bo Yang

ICCV 2025 • poster • arXiv:2508.09811

#722

Exploiting Domain Properties in Language-Driven Domain Generalization for Semantic Segmentation

Seogkyu Jeon, Kibeom Hong, Hyeran Byun

ICCV 2025 • poster • arXiv:2512.03508

#723

Alleviating Textual Reliance in Medical Language-guided Segmentation via Prototype-driven Semantic Approximation

Shuchang Ye, Usman Naseem, Mingyuan Meng et al.

ICCV 2025 • poster • arXiv:2507.11055

#724

Quantifying and Narrowing the Unknown: Interactive Text-to-Video Retrieval via Uncertainty Minimization

Bingqing Zhang, Zhuo Cao, Heming Du et al.

ICCV 2025 • poster • arXiv:2507.15504

#725

AMD: Adaptive Momentum and Decoupled Contrastive Learning Framework for Robust Long-Tail Trajectory Prediction

Bin Rao, Haicheng Liao, Yanchen Guan et al.

ICCV 2025 • poster • arXiv:2507.01801

#726

PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior

Seunggwan Lee, Hwanhee Jung, ByoungSoo Koh et al.

ICCV 2025 • poster • arXiv:2503.12834

#727

Everything is a Video: Unifying Modalities through Next-Frame Prediction

G Thomas Hudson, Dean Slack, Thomas Winterbottom et al.

ICCV 2025 • poster • arXiv:2411.10503

#728

CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation

Xiao Lin, Yun Peng, Liuyi Wang et al.

ICCV 2025 • poster • arXiv:2502.01312

#729

CAFA: a Controllable Automatic Foley Artist

Roi Benita, Michael Finkelson, Tavi Halperin et al.

ICCV 2025 • poster • arXiv:2504.06778

#730

SAGI: Semantically Aligned and Uncertainty Guided AI Image Inpainting

Paschalis Giakoumoglou, Dimitrios Karageorgiou, Symeon Papadopoulos et al.

ICCV 2025 • poster • arXiv:2502.06593

#731

MixA-Q: Revisiting Activation Sparsity for Vision Transformers from a Mixed-Precision Quantization Perspective

Weitian Wang, Shubham Rai, Cecilia De la Parra et al.

ICCV 2025 • poster • arXiv:2507.19131

#732

Towards a Universal 3D Medical Multi-modality Generalization via Learning Personalized Invariant Representation

Zhaorui Tan, Xi Yang, Tan Pan et al.

ICCV 2025 • poster • arXiv:2411.06106

#733

Refer to Any Segmentation Mask Group With Vision-Language Prompts

Shengcao Cao, Zijun Wei, Jason Kuen et al.

ICCV 2025 • poster • arXiv:2506.05342

#734

Denoising Token Prediction in Masked Autoregressive Models

Ting Yao, Yehao Li, Yingwei Pan et al.

ICCV 2025 • poster

#735

MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation

Vladislav Bargatin, Egor Chistov, Alexander Yakovenko et al.

ICCV 2025 • highlight • arXiv:2506.23151

#736

Understanding Co-speech Gestures in-the-wild

Sindhu Hegde, K R Prajwal, Taein Kwon et al.

ICCV 2025 • poster • arXiv:2503.22668

#737

Subjective Camera 1.0: Bridging Human Cognition and Visual Reconstruction through Sequence-Aware Sketch-Guided Diffusion

Haoyang Chen, Dongfang Sun, Caoyuan Ma et al.

ICCV 2025 • poster • arXiv:2506.23711

#738

Global Regulation and Excitation via Attention Tuning for Stereo Matching

Jiahao Li, Xinhong Chen, Zhengmin Jiang et al.

ICCV 2025 • poster • arXiv:2509.15891

#739

Training-Free Generation of Temporally Consistent Rewards from VLMs

Yinuo Zhao, Jiale Yuan, Zhiyuan Xu et al.

ICCV 2025 • poster • arXiv:2507.04789

#740

Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models

Xiao Liang, Di Wang, Zhicheng Jiao et al.

ICCV 2025 • poster • arXiv:2507.09209

#741

IDF: Iterative Dynamic Filtering Networks for Generalizable Image Denoising

Dongjin Kim, Jaekyun Ko, Muhammad Kashif Ali et al.

ICCV 2025 • poster • arXiv:2508.19649

#742

A Hidden Stumbling Block in Generalized Category Discovery: Distracted Attention

Qiyu Xu, Zhanxuan Hu, Yu Duan et al.

ICCV 2025 • poster • arXiv:2507.14315

#743

SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing

Heyi Sun, Cong Wang, Tian-Xing Xu et al.

ICCV 2025 • poster • arXiv:2508.09597

#744

SDMatte: Grafting Diffusion Models for Interactive Matting

Longfei Huang, Yu Liang, Hao Zhang et al.

ICCV 2025 • poster • arXiv:2508.00443

#745

UniConvNet: Expanding Effective Receptive Field while Maintaining Asymptotically Gaussian Distribution for ConvNets of Any Scale

Yuhao Wang, Wei Xi

ICCV 2025 • poster • arXiv:2508.09000

#746

Identity Preserving 3D Head Stylization with Multiview Score Distillation

Bahri Batuhan Bilecen, Ahmet Berke Gokmen, Furkan Güzelant et al.

ICCV 2025 • poster • arXiv:2411.13536

#747

Trokens: Semantic-Aware Relational Trajectory Tokens for Few-Shot Action Recognition

Pulkit Kumar, Shuaiyi Huang, Matthew Walmer et al.

ICCV 2025 • poster • arXiv:2508.03695

#748

MaTVLM: Hybrid Mamba-Transformer for Efficient Vision-Language Modeling

Yingyue Li, Bencheng Liao, Wenyu Liu et al.

ICCV 2025 • poster • arXiv:2503.13440

#749

G2SF: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection

Chengyu Tao, Xuanming Cao, Juan Du

ICCV 2025 • poster

#750

Structure-aware Semantic Discrepancy and Consistency for 3D Medical Image Self-supervised Learning

Tan Pan, Zhaorui Tan, Kaiyu Guo et al.

ICCV 2025 • poster • arXiv:2507.02581

#751

RefEdit: A Benchmark and Method for Improving Instruction-based Image Editing Model on Referring Expressions

Bimsara Pathiraja, Maitreya Patel, Shivam Singh et al.

ICCV 2025 • poster • arXiv:2506.03448

#752

VA-MoE: Variables-Adaptive Mixture of Experts for Incremental Weather Forecasting

Hao Chen, Tao Han, Song Guo et al.

ICCV 2025 • poster • arXiv:2412.02503

#753

FaceCraft4D: Animated 3D Facial Avatar Generation from a Single Image

Fei Yin, Mallikarjun Reddy, Chun-Han Yao et al.

ICCV 2025 • poster • arXiv:2504.15179

#754

ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers

Nicholas DiBrita, Jason Han, Tirthak Patel

ICCV 2025 • poster • arXiv:2506.21537

#755

Efficient Multi-Person Motion Prediction by Lightweight Spatial and Temporal Interactions

Yuanhong Zheng, Ruixuan Yu, Jian Sun

ICCV 2025 • poster • arXiv:2507.09446

#756

EvolvingGrasp: Evolutionary Grasp Generation via Efficient Preference Alignment

Yufei Zhu, Yiming Zhong, Zemin Yang et al.

ICCV 2025 • poster • arXiv:2503.14329

#757

Ponimator: Unfolding Interactive Pose for Versatile Human-human Interaction Animation

Shaowei Liu, Chuan Guo, Bing Zhou et al.

ICCV 2025 • poster • arXiv:2510.14976

#758

Sequential Gaussian Avatars with Hierarchical Motion Context

Wangze Xu, Yifan Zhan, Zhihang Zhong et al.

ICCV 2025 • poster • arXiv:2411.16768

#759

Boosting Adversarial Transferability via Residual Perturbation Attack

Jinjia Peng, Zeze Tao, Huibing Wang et al.

ICCV 2025 • poster • arXiv:2508.05689

#760

Sequential keypoint density estimator: an overlooked baseline of skeleton-based video anomaly detection

Anja Delić, Matej Grcic, Siniša Šegvić

ICCV 2025 • highlight • arXiv:2506.18368

#761

GausSim: Foreseeing Reality by Gaussian Simulator for Elastic Objects

Yidi Shao, Mu Huang, Chen Change Loy et al.

ICCV 2025 • poster • arXiv:2412.17804

#762

Beyond Spatial Frequency: Pixel-wise Temporal Frequency-based Deepfake Video Detection

Taehoon Kim, Jongwook Choi, Yonghyun Jeong et al.

ICCV 2025 • highlight • arXiv:2507.02398

#763

Signs as Tokens: A Retrieval-Enhanced Multilingual Sign Language Generator

Ronglai Zuo, Rolandos Alexandros Potamias, Evangelos Ververas et al.

ICCV 2025 • poster • arXiv:2411.17799

#764

Learnable Feature Patches and Vectors for Boosting Low-light Image Enhancement without External Knowledge

Xiaogang Xu, Jiafei Wu, Qingsen Yan et al.

ICCV 2025 • poster

#765

Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models

Xudong Li, Zihao Huang, Yan Zhang et al.

ICCV 2025 • poster • arXiv:2409.05381

#766

Multi-modal Multi-platform Person Re-Identification: Benchmark and Method

Ruiyang Ha, Songyi Jiang, Bin Li et al.

ICCV 2025 • poster • arXiv:2503.17096

#767

PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement

Tewodros W. Ayalew, Xiao Zhang, Kevin Y Wu et al.

ICCV 2025 • poster • arXiv:2411.17764

#768

Adaptive Routing of Text-to-Image Generation Requests Between Large Cloud Model and Light-Weight Edge Model

Zewei Xin, Qinya Li, Chaoyue Niu et al.

ICCV 2025 • poster • arXiv:2411.13787

#769

Can3Tok: Canonical 3D Tokenization and Latent Modeling of Scene-Level 3D Gaussians

Quankai Gao, Iliyan Georgiev, Tuanfeng Wang et al.

ICCV 2025 • poster • arXiv:2508.01464

#770

RoboPearls: Editable Video Simulation for Robot Manipulation

Tao Tang, Likui Zhang, Youpeng Wen et al.

ICCV 2025 • poster • arXiv:2506.22756

#771

VSC: Visual Search Compositional Text-to-Image Diffusion Model

Do Dat, Nam Hyeon-Woo, Po-Yuan Mao et al.

ICCV 2025 • poster • arXiv:2505.01104

#772

Exploiting Diffusion Prior for Task-driven Image Restoration

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

ICCV 2025 • poster • arXiv:2507.22459

#773

Unlocking Constraints: Source-Free Occlusion-Aware Seamless Segmentation

Yihong Cao, Jiaming Zhang, Xu Zheng et al.

ICCV 2025 • poster • arXiv:2506.21198

#774

A Unified Framework for Motion Reasoning and Generation in Human Interaction

Jeongeun Park, Sungjoon Choi, Sangdoo Yun

ICCV 2025 • poster • arXiv:2410.05628

#775

Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling

Qirui Wu, Denys Iliash, Daniel Ritchie et al.

ICCV 2025 • highlight • arXiv:2411.19492

#776

GraspCoT: Integrating Physical Property Reasoning for 6-DoF Grasping under Flexible Language Instructions

Xiaomeng Chu, Jiajun Deng, Guoliang You et al.

ICCV 2025 • poster • arXiv:2503.16013

#777

Multi-Object Sketch Animation by Scene Decomposition and Motion Planning

Jingyu Liu, Zijie Xin, Yuhan Fu et al.

ICCV 2025 • poster • arXiv:2503.19351

#778

TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation

Yinda Chen, Haoyuan Shi, Xiaoyu Liu et al.

ICCV 2025 • poster • arXiv:2405.16847

#779

From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning

Yuhui Zeng, Haoxiang Wu, Wenjie Nie et al.

ICCV 2025 • poster • arXiv:2502.05843

#780

AdvDreamer Unveils: Are Vision-Language Models Truly Ready for Real-World 3D Variations?

Shouwei Ruan, Hanqing Liu, Yao Huang et al.

ICCV 2025 • highlight • arXiv:2412.03002

#781

Physics Context Builders: A Modular Framework for Physical Reasoning in Vision-Language Models

Vahid Balazadeh, Mohammadmehdi Ataei, Hyunmin Cheong et al.

ICCV 2025 • poster • arXiv:2412.08619

#782

DCT-Shield: A Robust Frequency Domain Defense against Malicious Image Editing

Aniruddha Bala, Rohit Chowdhury, Rohan Jaiswal et al.

ICCV 2025 • highlight • arXiv:2504.17894

#783

A Structure-aware and Motion-adaptive Framework for 3D Human Pose Estimation with Mamba

Ye Lu, Jie Wang, Jianjun Gao et al.

ICCV 2025 • poster • arXiv:2507.19852

#784

Color Matching Using Hypernetwork-Based Kolmogorov-Arnold Networks

Artem Nikonorov, Georgy Perevozchikov, Andrei Korepanov et al.

ICCV 2025 • poster • arXiv:2503.11781

#785

ReasonVQA: A Multi-hop Reasoning Benchmark with Structural Knowledge for Visual Question Answering

Duong T. Tran, Trung-Kien Tran, Manfred Hauswirth et al.

ICCV 2025 • poster • arXiv:2507.16403

#786

LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering

Xiaohang Zhan, Dingming Liu

ICCV 2025 • poster • arXiv:2508.07647

#787

DEPTHOR: Depth Enhancement from a Practical Light-Weight dToF Sensor and RGB Image

Jijun Xiang, Xuan Zhu, Xianqi Wang et al.

ICCV 2025 • poster • arXiv:2504.01596

#788

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Xingsong Ye, Yongkun Du, Yunbo Tao et al.

ICCV 2025 • poster • arXiv:2412.01137

#789

CMT: A Cascade MAR with Topology Predictor for Multimodal Conditional CAD Generation

Jianyu Wu, Yizhou Wang, Xiangyu Yue et al.

ICCV 2025 • poster • arXiv:2504.20830

#790

Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion

Enyu Liu, En Yu, Sijia Chen et al.

ICCV 2025 • poster • arXiv:2507.08555

#791

GT-Loc: Unifying When and Where in Images through a Joint Embedding Space

David G. Shatwell, Ishan Rajendrakumar Dave, Swetha Sirnam et al.

ICCV 2025 • poster • arXiv:2507.10473

#792

ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition

Sanjoy Kundu, Shanmukha Vellamcheti, Sathyanarayanan Aakur

ICCV 2025 • poster • arXiv:2504.03948

#793

Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation

Wenyao Zhang, Hongsi Liu, Bohan Li et al.

ICCV 2025 • poster

#794

Trust but Verify: Programmatic VLM Evaluation in the Wild

Viraj Prabhu, Senthil Purushwalkam, An Yan et al.

ICCV 2025 • poster • arXiv:2410.13121

#795

ETA: Energy-based Test-time Adaptation for Depth Completion

Younjoon Chung, Hyoungseob Park, Patrick Rim et al.

ICCV 2025 • poster • arXiv:2508.05989

#796

Supercharged One-step Text-to-Image Diffusion Models with Negative Prompts

Viet Nguyen, Anh Nguyen, Trung Dao et al.

ICCV 2025 • poster • arXiv:2412.02687

#797

InfGen: A Resolution-Agnostic Paradigm for Scalable Image Synthesis

Tao Han, Wanghan Xu, Junchao Gong et al.

ICCV 2025 • poster • arXiv:2509.10441

#798

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs

Liwei Che, Qingze T Liu, Jing Jia et al.

ICCV 2025 • poster • arXiv:2503.07772

#799

Generate, Transduct, Adapt: Iterative Transduction with VLMs

Oindrila Saha, Logan Lawrence, Grant Horn et al.

ICCV 2025 • poster • arXiv:2501.06031

#800

Differentiable Room Acoustic Rendering with Multi-View Vision Priors

Derong Jin, Ruohan Gao

ICCV 2025 • poster • arXiv:2504.21847
