Most Cited 2025 "open-ended benchmarks" Papers

22,274 papers found


#21801

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

Zhuqiang Lu, Zhenfei Yin, Mengwei He et al.

ICCV 2025 • poster • arXiv:2412.09919

#21802

Pretend Benign: A Stealthy Adversarial Attack by Exploiting Vulnerabilities in Cooperative Perception

Hongwei Lin, Dongyu Pan, Qiming Xia et al.

ICCV 2025 • poster

#21803

What we need is explicit controllability: Training 3D gaze estimator using only facial images

Tingwei Li, Jun Bao, Zhenzhong Kuang et al.

ICCV 2025 • poster

#21804

SemiVisBooster: Boosting Semi-Supervised Learning for Fine-Grained Classification through Pseudo-Label Semantic Guidance

Wenjin Zhang, Xinyu Li, Chenyang Gao et al.

ICCV 2025 • poster

#21805

ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

ICCV 2025 • poster • arXiv:2503.09509

#21806

Enhancing Prompt Generation with Adaptive Refinement for Camouflaged Object Detection

Xuehan Chen, Guangyu Ren, Tianhong Dai et al.

ICCV 2025 • poster

#21807

Hypergraph Clustering Network with Partial Attribute Imputation

Qianqian Wang, Bowen Zhao, Zhengming Ding et al.

ICCV 2025 • poster

#21808

Dual-Path Temporal Decoder for End-to-End Multi-Object Tracking

Hyunseop Kim, Juheon Jeong, Hanul Kim et al.

NEURIPS 2025 • oral

#21809

SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

Jing Wang, Rui Zhao, Ruiqin Xiong et al.

ICCV 2025 • poster

#21810

Learning Null Geodesics for Gravitational Lensing Rendering in General Relativity

Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.

ICCV 2025 • poster • arXiv:2507.15775

#21811

Object-centric Video Question Answering with Visual Grounding and Referring

Haochen Wang, Qirui Chen, Cilin Yan et al.

ICCV 2025 • poster • arXiv:2507.19599

#21812

DSO: Aligning 3D Generators with Simulation Feedback for Physical Soundness

Ruining Li, Chuanxia Zheng, Christian Rupprecht et al.

ICCV 2025 • highlight • arXiv:2503.22677

#21813

EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds

Lu Chen, Yizhou Wang, Shixiang Tang et al.

ICCV 2025 • poster • arXiv:2502.05857

#21814

Unbiased Missing-modality Multimodal Learning

Ruiting Dai, Chenxi Li, Yandong Yan et al.

ICCV 2025 • poster

#21815

Hybrid-Tower: Fine-grained Pseudo-query Interaction and Generation for Text-to-Video Retrieval

Bangxiang Lan, Ruobing Xie, Ruixiang Zhao et al.

ICCV 2025 • poster • arXiv:2509.04773

#21816

LIRA: Reasoning Reconstruction via Multimodal Large Language Models

Zhen Zhou, Tong Wang, Yunkai Ma et al.

ICCV 2025 • poster

#21817

MEH: A Multi-Style Dataset and Toolkit for Advancing Egyptian Hieroglyph Recognition

Maksim Golyadkin, Rubanova Alexandrovna, Aleksandr Utkov et al.

ICCV 2025 • poster

#21818

MaskSAM: Auto-prompt SAM with Mask Classification for Volumetric Medical Image Segmentation

Bin Xie, Hao Tang, Bin Duan et al.

ICCV 2025 • poster

#21819

Breaking Grid Constraints: Dynamic Graph Reconstruction Network for Multi-organ Segmentation

Junhao Xiao, Yang Wei, Jingyu Wang et al.

ICCV 2025 • poster

#21820

Synchronizing Task Behavior: Aligning Multiple Tasks during Test-Time Training

Wooseong Jeong, Jegyeong Cho, Youngho Yoon et al.

ICCV 2025 • poster • arXiv:2507.07778

#21821

Learning an Implicit Physics Model for Image-based Fluid Simulation

Emily Jia, Jiageng Mao, Zhiyuan Gao et al.

ICCV 2025 • poster • arXiv:2508.08254

#21822

Player-Centric Multimodal Prompt Generation for Large Language Model Based Identity-Aware Basketball Video Captioning

Zeyu Xi, Haoying Sun, Yaofei Wu et al.

ICCV 2025 • poster • arXiv:2507.20163

#21823

Exploiting Frequency Dynamics for Enhanced Multimodal Event-based Action Recognition

Meiqi Cao, Xiangbo Shu, Xin Jiang et al.

ICCV 2025 • poster

#21824

Enrich and Detect: Video Temporal Grounding with Multimodal LLMs

Shraman Pramanick, Effrosyni Mavroudi, Yale Song et al.

ICCV 2025 • highlight • arXiv:2510.17023

#21825

First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training

Gyudong Kim, Hyukju Na, Jin Kim et al.

NEURIPS 2025 • poster

#21826

Region-aware Anchoring Mechanism for Efficient Referring Visual Grounding

Shuyi Ouyang, Ziwei Niu, Hongyi Wang et al.

ICCV 2025 • poster

#21827

Token-Efficient VLM: High-Resolution Image Understanding via Dynamic Region Proposal

Yitong Jiang, Jinwei Gu, Tianfan Xue et al.

ICCV 2025 • highlight

#21828

How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach

Chirui Chang, Jiahui Liu, Zhengzhe Liu et al.

ICCV 2025 • poster • arXiv:2406.19568

#21829

SIC: Similarity-Based Interpretable Image Classification with Neural Networks

Tom Nuno Wolf, Emre Kavak, Fabian Bongratz et al.

ICCV 2025 • poster • arXiv:2501.17328

#21830

Teaching AI the Anatomy Behind the Scan: Addressing Anatomical Flaws in Medical Image Segmentation with Learnable Prior

Young Seok Jeon, Hongfei Yang, Huazhu Fu et al.

ICCV 2025 • poster • arXiv:2403.18878

#21831

Scaling Tumor Segmentation: Best Lessons from Real and Synthetic Data

Qi Chen, Xinze Zhou, Chen Liu et al.

ICCV 2025 • poster • arXiv:2510.14831

#21832

Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration

Ting Lei, Shaofeng Yin, Qingchao Chen et al.

ICCV 2025 • poster • arXiv:2508.03207

#21833

What Moves the Eyes: Doubling Mechanistic Model Performance Using Deep Networks to Discover and Test Cognitive Hypotheses

Federico D'Agostino, Lisa Schwetlick, Matthias Bethge et al.

NEURIPS 2025 • oral

#21834

LawDIS: Language-Window-based Controllable Dichotomous Image Segmentation

Xinyu Yan, Meijun Sun, Ge-Peng Ji et al.

ICCV 2025 • poster • arXiv:2508.01152

#21835

VideoMiner: Iteratively Grounding Key Frames of Hour-Long Videos via Tree-based Group Relative Policy Optimization

Xinye Cao, Hongcan Guo, Jiawen Qian et al.

ICCV 2025 • poster • arXiv:2510.06040

#21836

WIPES: Wavelet-based Visual Primitives

Wenhao Zhang, Hao Zhu, Delong Wu et al.

ICCV 2025 • poster • arXiv:2508.12615

#21837

MambaML: Exploring State Space Models for Multi-Label Image Classification

Xuelin Zhu, Jian Liu, Jiuxin Cao et al.

ICCV 2025 • poster

#21838

SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting

Shuaiting Li, Juncan Deng, Chengxuan Wang et al.

ICCV 2025 • poster • arXiv:2503.08668

#21839

Vision-Language Neural Graph Featurization for Extracting Retinal Lesions

Taimur Hassan, Anabia Sohail, Muzammal Naseer et al.

ICCV 2025 • poster

#21840

Flow-MIL: Constructing Highly-expressive Latent Feature Space For Whole Slide Image Classification Using Normalizing Flow

Yingfan Ma, Bohan An, Ao Shen et al.

ICCV 2025 • poster

#21841

MotionBind: Multi-Modal Human Motion Alignment for Retrieval, Recognition, and Generation

Kaleab Kinfu, Rene Vidal

NEURIPS 2025 • oral

#21842

CoSMIC: Continual Self-supervised Learning for Multi-Domain Medical Imaging via Conditional Mutual Information Maximization

Yihang Liu, Ying Wen, Longzhen Yang et al.

ICCV 2025 • poster

#21843

Towards Robustness of Person Search against Corruptions

Woojung Son, Yoonki Cho, Guoyuan An et al.

ICCV 2025 • poster

#21844

VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification

Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.

ICCV 2025 • poster

#21845

SEAL: Semantic Aware Image Watermarking

Kasra Arabi, R. Teal Witter, Chinmay Hegde et al.

ICCV 2025 • poster • arXiv:2503.12172

#21846

ArchiSet: Benchmarking Editable and Consistent Single-View 3D Reconstruction of Buildings with Specific Window-to-Wall Ratios

Jun Yin, Pengyu Zeng, Licheng Shen et al.

ICCV 2025 • poster

#21847

UINavBench: A Framework for Comprehensive Evaluation of Interactive Digital Agents

Harsh Agrawal, Eldon Schoop, Xinlei Pan et al.

ICCV 2025 • poster

#21848

Unsupervised Identification of Protein Compositions and Conformations via Implicit Content-Transformation Disentanglement

Mostofa Rafid Uddin, Jana Armouti, Min Xu

ICCV 2025 • poster

#21849

How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation?

Yujian Lee, Peng Gao, Yongqi Xu et al.

ICCV 2025 • poster • arXiv:2601.08133

#21850

Splat-based 3D Scene Reconstruction with Extreme Motion-blur

Hyeonjoong Jang, Dongyoung Choi, Donggun Kim et al.

ICCV 2025 • poster

#21851

Diffusion Curriculum: Synthetic-to-Real Data Curriculum via Image-Guided Diffusion

Yijun Liang, Shweta Bhardwaj, Tianyi Zhou

ICCV 2025 • poster • arXiv:2410.13674

#21852

Unsupervised Histopathological Image Semantic Segmentation with Overlapping Patches Consistency Constraint

Wentian Cai, Weizhao Weng, Zihao Huang et al.

ICCV 2025 • poster

#21853

VISO: Accelerating In-orbit Object Detection with Language-Guided Mask Learning and Sparse Inference

Meiqi Wang, Han Qiu

ICCV 2025 • poster

#21854

Advancing Textual Prompt Learning with Anchored Attributes

Zheng Li, Yibing Song, Ming-Ming Cheng et al.

ICCV 2025 • poster • arXiv:2412.09442

#21855

FIND: Few-Shot Anomaly Inspection with Normal-Only Multi-Modal Data

Yiting Li, Fayao Liu, Jingyi Liao et al.

ICCV 2025 • poster

#21856

AR-1-to-3: Single Image to Consistent 3D Object via Next-View Prediction

Xuying Zhang, Yupeng Zhou, Kai Wang et al.

ICCV 2025 • poster

#21857

DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation

Jihun Kim, Hoyong Kwon, Hyeokjun Kweon et al.

ICCV 2025 • poster • arXiv:2506.23104

#21858

Dual-Rate Dynamic Teacher for Source-Free Domain Adaptive Object Detection

Qi He, Xiao Wu, Jun-Yan He et al.

ICCV 2025 • poster

#21859

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

Yucheng Suo, Fan Ma, Linchao Zhu et al.

ICCV 2025 • poster • arXiv:2503.20472

#21860

OV3D-CG: Open-vocabulary 3D Instance Segmentation with Contextual Guidance

Mingquan Zhou, Chen He, Ruiping Wang et al.

ICCV 2025 • poster

#21861

ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang et al.

ICCV 2025 • poster • arXiv:2410.23287

#21862

Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data

Hancheng Min, Zhihui Zhu, Rene Vidal

NEURIPS 2025 • poster • arXiv:2510.21078

#21863

Rethinking Discrete Tokens: Treating Them as Conditions for Continuous Autoregressive Image Synthesis

Peng Zheng, Junke Wang, Yi Chang et al.

ICCV 2025 • poster • arXiv:2507.01756

#21864

Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning

Lizhen Xu, Xiuxiu Bai, Xiaojun Jia et al.

ICCV 2025 • poster • arXiv:2503.08101

#21865

CogCM: Cognition-Inspired Contextual Modeling for Audio-Visual Speech Enhancement

Feixiang Wang, Shuang Yang, Shiguang Shan et al.

ICCV 2025 • poster

#21866

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Zhisheng Zhong, Chengyao Wang, Yuqi Liu et al.

ICCV 2025 • poster • arXiv:2412.09501

#21867

CalliReader: Contextualizing Chinese Calligraphy via an Embedding-Aligned Vision-Language Model

Yuxuan Luo, Jiaqi Tang, Chenyi Huang et al.

ICCV 2025 • poster • arXiv:2503.06472

#21868

EDFFDNet: Towards Accurate and Efficient Unsupervised Multi-Grid Image Registration

Haokai Zhu, Bo Qu, Si-Yuan Cao et al.

ICCV 2025 • poster • arXiv:2509.07662

#21869

Enhancing Mamba Decoder with Bidirectional Interaction in Multi-Task Dense Prediction

Mang Cao, Sanping Zhou, Yizhe Li et al.

ICCV 2025 • poster • arXiv:2508.20376

#21870

Leveraging Debiased Cross-modal Attention Maps and Code-based Reasoning for Zero-shot Referring Expression Comprehension

Juntao Chen, Wen Shen, Zhihua Wei et al.

ICCV 2025 • poster

#21871

UST-SSM: Unified Spatio-Temporal State Space Models for Point Cloud Video Modeling

Peiming Li, Ziyi Wang, Yulin Yuan et al.

ICCV 2025 • poster • arXiv:2508.14604

#21872

SITE: towards Spatial Intelligence Thorough Evaluation

Wenqi Wang, Reuben Tan, Pengyue Zhu et al.

ICCV 2025 • poster • arXiv:2505.05456

#21873

SHIFT: Smoothing Hallucinations by Information Flow Tuning for Multimodal Large Language Models

Sudong Wang, Yunjian Zhang, Yao Zhu et al.

ICCV 2025 • poster

#21874

Similarity Memory Prior is All You Need for Medical Image Segmentation

Hao Tang, Zhiqing Guo, Liejun Wang et al.

ICCV 2025 • highlight • arXiv:2507.00585

#21875

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches

Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif et al.

ICCV 2025 • poster • arXiv:2311.12084

#21876

Debiasing Trace Guidance: Top-down Trace Distillation and Bottom-up Velocity Alignment for Unsupervised Anomaly Detection

Xingjian Wang, Li Chai, Jiming Chen

ICCV 2025

#21877

Conformal Prediction for Zero-Shot Models

Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz

CVPR 2025 • poster • arXiv:2505.24693

#21878

Automated Red Teaming for Text-to-Image Models through Feedback-Guided Prompt Iteration with Vision-Language Models

Wei Xu, Kangjie Chen, Jiawei Qiu et al.

ICCV 2025 • poster

#21879

Convergence Rates for Gradient Descent on the Edge of Stability for Overparametrised Least Squares

Lachlan MacDonald, Hancheng Min, Leandro Palma et al.

NEURIPS 2025 • poster • arXiv:2510.17506

#21880

Enhancing Spatial Reasoning in Multimodal Large Language Models through Reasoning-based Segmentation

Zhenhua Ning, Zhuotao Tian, Shaoshuai Shi et al.

ICCV 2025 • poster • arXiv:2506.23120

#21881

OVG-HQ: Online Video Grounding with Hybrid-modal Queries

Runhao Zeng, Jiaqi Mao, Minghao Lai et al.

ICCV 2025 • poster • arXiv:2508.11903

#21882

Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring

Yufei Zhan, Shurong Zheng, Yousong Zhu et al.

ICCV 2025 • poster • arXiv:2403.09333

#21883

BézierGS: Dynamic Urban Scene Reconstruction with Bézier Curve Gaussian Splatting

Zipei Ma, Junzhe Jiang, Yurui Chen et al.

ICCV 2025 • poster • arXiv:2506.22099

#21884

SECA: Semantically Equivalent and Coherent Attacks for Eliciting LLM Hallucinations

Buyun Liang, Liangzu Peng, Jinqi Luo et al.

NEURIPS 2025 • poster • arXiv:2510.04398

#21885

CLIPSym: Delving into Symmetry Detection with CLIP

Tinghan Yang, Md Ashiqur Rahman, Raymond A. Yeh

ICCV 2025 • poster • arXiv:2508.14197

#21886

HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding?

Yusen Zhang, Wenliang Zheng, Aashrith Madasu et al.

ICCV 2025 • poster • arXiv:2504.18406

#21887

HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics

Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen et al.

ICCV 2025 • poster • arXiv:2408.17443

#21888

Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting

Hengyu Meng, Duotun Wang, Zhijing Shao et al.

ICCV 2025 • poster • arXiv:2502.20045

#21889

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

Yuzhang Shang, Mu Cai, Bingxin Xu et al.

ICCV 2025 • poster • arXiv:2403.15388

#21890

Towards Comprehensive Lecture Slides Understanding: Large-scale Dataset and Effective Method

Enming Zhang, Yuzhe Li, Yuliang Liu et al.

ICCV 2025 • poster

#21891

A Unified Interpretation of Training-Time Out-of-Distribution Detection

Xu Cheng, Xin Jiang, Zechao Li

ICCV 2025 • highlight

#21892

Attention on the Sphere

Boris Bonev, Max Rietmann, Andrea Paris et al.

NEURIPS 2025 • poster • arXiv:2505.11157

#21893

Federated Domain Generalization with Domain-specific Soft Prompts Generation

Jianhan Wu, Xiaoyang Qu, Zhangcheng Huang et al.

ICCV 2025 • poster • arXiv:2509.20807

#21894

Removing Out-of-Focus Reflective Flares via Color Alignment

Fengbo Lan, Chang Wen Chen

ICCV 2025 • poster

#21895

ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection

Yingjian Chen, Lei Zhang, Yakun Niu

ICCV 2025 • poster • arXiv:2408.13697

#21896

Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration

Mark Endo, Xiaohan Wang, Serena Yeung-Levy

ICCV 2025 • poster • arXiv:2412.13180

#21897

Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures

Guoxing Sun, Rishabh Dabral, Heming Zhu et al.

CVPR 2025 • highlight • arXiv:2412.13183

#21898

Mamba-3VL: Taming State Space Model for 3D Vision Language Learning

Yuan Wang, Yuxin Chen, Zhongang Qi et al.

ICCV 2025 • poster

#21899

Embodied Representation Alignment with Mirror Neurons

Wentao Zhu, Zhining Zhang, Yuwei Ren et al.

ICCV 2025 • poster • arXiv:2509.21136

#21900

DIH-CLIP: Unleashing the Diversity of Multi-Head Self-Attention for Training-Free Open-Vocabulary Semantic Segmentation

Songsong Duan, Xi Yang, Nannan Wang

ICCV 2025 • poster

#21901

Selective Contrastive Learning for Weakly Supervised Affordance Grounding

WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

ICCV 2025 • poster • arXiv:2508.07877

#21902

DASH: Detection and Assessment of Systematic Hallucinations of VLMs

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

ICCV 2025 • poster • arXiv:2503.23573

#21903

M2EIT: Multi-Domain Mixture of Experts for Robust Neural Inertial Tracking

Yan Li, Yang Xu, Changhao Chen et al.

ICCV 2025 • poster

#21904

MobileViCLIP: An Efficient Video-Text Model for Mobile Devices

Min Yang, Zihan Jia, Zhilin Dai et al.

ICCV 2025 • poster • arXiv:2508.07312

#21905

scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling

Joel Dapello, Marcel Nassar, Ridvan Eksi et al.

NEURIPS 2025 • poster

#21906

No More Sibling Rivalry: Debiasing Human-Object Interaction Detection

Bin Yang, Yulin Zhang, Hong-Yu Zhou et al.

ICCV 2025 • poster • arXiv:2509.00760

#21907

Memory-Efficient 4-bit Preconditioned Stochastic Optimization

Jingyang Li, Kuangyu Ding, Kim-chuan Toh et al.

ICCV 2025 • poster • arXiv:2412.10663

#21908

Prompt-driven Transferable Adversarial Attack on Person Re-Identification with Attribute-aware Textual Inversion

Yuan Bian, Min Liu, Yunqi Yi et al.

ICCV 2025 • poster • arXiv:2502.19697

#21909

EVOLVE: Event-Guided Deformable Feature Transfer and Dual-Memory Refinement for Low-Light Video Object Segmentation

Jong Hyeon Baek, Jiwon Oh, Yeong Jun Koh

ICCV 2025 • poster

#21910

MATE: Motion-Augmented Temporal Consistency for Event-based Point Tracking

Han Han, Wei Zhai, Yang Cao et al.

ICCV 2025 • poster • arXiv:2412.01300

#21911

Asynchronous Event Error-Minimizing Noise for Safeguarding Event Dataset

Ruofei Wang, Peiqi Duan, Boxin Shi et al.

ICCV 2025 • highlight • arXiv:2507.05728

#21912

AG2aussian: Anchor-Graph Structured Gaussian Splatting for Instance-Level 3D Scene Understanding and Editing

Zhaonan Wang, Manyi Li, Changhe Tu

ICCV 2025 • poster

#21913

Vector Contrastive Learning For Pixel-Wise Pretraining In Medical Vision

Yuting He, Shuo Li

ICCV 2025 • poster • arXiv:2506.20850

#21914

InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior

Minghao Wen, Shengjie Wu, Kangkan Wang et al.

ICCV 2025 • poster • arXiv:2507.04961

#21915

Anomaly Detection of Integrated Circuits Package Substrates Using the Large Vision Model SAIC: Dataset Construction, Methodology, and Application

Ruiyun Yu, Bingyang Guo, Haoyuan Li

ICCV 2025 • poster

#21916

Learnable Retrieval Enhanced Visual-Text Alignment and Fusion for Radiology Report Generation

Qin Zhou, Guoyan Liang, Xindi Li et al.

ICCV 2025 • poster • arXiv:2507.07568

#21917

Benchmarking Multimodal Large Language Models Against Image Corruptions

Xinkuan Qiu, Meina Kan, Yongbin Zhou et al.

ICCV 2025 • poster

#21918

Temporal-aware Query Routing for Real-time Video Instance Segmentation

Zesen Cheng, Kehan Li, Yian Zhao et al.

ICCV 2025 • poster

#21919

Weak-to-Strong Generalization under Distribution Shifts

Myeongho Jeon, Jan Sobotka, Suhwan Choi et al.

NEURIPS 2025 • poster • arXiv:2510.21332

#21920

RvLLM: LLM Runtime Verification with Domain Knowledge

Yedi Zhang, Sun Emma, Annabelle En et al.

NEURIPS 2025 • poster • arXiv:2505.18585

#21921

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Xingyu Liu, Gu Wang, Ruida Zhang et al.

CVPR 2025 • poster • arXiv:2411.16106

#21922

Dynamic Dictionary Learning for Remote Sensing Image Segmentation

Xuechao Zou, Yue Li, Shun Zhang et al.

ICCV 2025 • poster • arXiv:2503.06683

#21923

Efficient Fine-Tuning of Large Models via Nested Low-Rank Adaptation

Lujun Li, Cheng Lin, Dezhi Li et al.

ICCV 2025 • poster

#21924

Dual-level Prototype Learning for Composite Degraded Image Restoration

Zhongze Wang, Haitao Zhao, Lujian Yao et al.

ICCV 2025 • poster

#21925

HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets and CLIP Models

Zhixiang Wei, Guangting Wang, Xiaoxiao Ma et al.

ICCV 2025 • poster • arXiv:2507.22431

#21926

Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals

Linda Zeng, Rithwik Gupta, Divij Motwani et al.

NEURIPS 2025 • poster • arXiv:2502.16101

#21927

AuraFusion360: Augmented Unseen Region Alignment for Reference-based 360° Unbounded Scene Inpainting

Chung-Ho Wu, Yang-Jung Chen, Ying-Huan Chen et al.

CVPR 2025 • poster • arXiv:2502.05176

#21928

Deterministic Object Pose Confidence Region Estimation

Jinghao Wang, Zhang Li, Zi Wang et al.

ICCV 2025 • poster • arXiv:2506.22720

#21929

Is CLIP ideal? No. Can we fix it? Yes!

Raphaela Kang, Yue Song, Georgia Gkioxari et al.

ICCV 2025 • poster • arXiv:2503.08723

#21930

Learning Beyond Still Frames: Scaling Vision-Language Models with Video

Yiyuan Zhang, Handong Li, Jing Liu et al.

ICCV 2025 • poster

#21931

Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation

Shengfang Zhai, Jiajun Li, Yue Liu et al.

ICCV 2025 • highlight • arXiv:2503.06453

#21932

Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning

Liwei Luo, Shuaitengyuan Li, Dongwei Ren et al.

ICCV 2025 • poster • arXiv:2511.03245

#21933

Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment

Shi-Chen Zhang, Yunheng Li, Yu-Huan Wu et al.

ICCV 2025 • poster • arXiv:2508.08811

#21934

ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation

Qizhen Lan, Qing Tian

ICCV 2025 • poster • arXiv:2503.06307

#21935

GReg: Geometry-Aware Region Refinement for Sign Language Video Generation

Tongkai Shi, Lianyu Hu, Fanhua Shang et al.

ICCV 2025 • poster

#21936

Motion4D: Learning 3D-Consistent Motion and Semantics for 4D Scene Understanding

Haoran Zhou, Gim Hee Lee

NEURIPS 2025 • oral • arXiv:2512.03601

#21937

Unsupervised Part Discovery via Descriptor-Based Masked Image Restoration with Optimized Constraints

Jiahao Xia, Yike Wu, Wenjian Huang et al.

ICCV 2025 • poster • arXiv:2507.11985

#21938

NETracer: A Topology-Aware Iterative Tracing Approach for Tubular Structure Extraction

Chao Liu, Yangbo Jiang, Nenggan Zheng

ICCV 2025 • poster

#21939

Interpretable point cloud classification using multiple instance learning

Matt De Vries, Reed Naidoo, Olga Fourkioti et al.

ICCV 2025 • highlight

#21940

MotionCtrl: A Real-time Controllable Vision-Language-Motion Model

Bin Cao, Sipeng Zheng, Ye Wang et al.

ICCV 2025 • poster

#21941

UIPro: Unleashing Superior Interaction Capability For GUI Agents

Hongxin Li, Jingran Su, Jingfan Chen et al.

ICCV 2025 • poster • arXiv:2509.17328

#21942

SALAD -- Semantics-Aware Logical Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ICCV 2025 • poster • arXiv:2509.02101

#21943

FineMotion: A Dataset and Benchmark with both Spatial and Temporal Annotation for Fine-grained Motion Generation and Editing

Bizhu Wu, Jinheng Xie, Meidan Ding et al.

ICCV 2025 • poster • arXiv:2507.19850

#21944

Controllable Latent Space Augmentation for Digital Pathology

Sofiène Boutaj, Marin Scalbert, Pierre Marza et al.

ICCV 2025 • poster • arXiv:2508.14588

#21945

Advancing Visual Large Language Model for Multi-granular Versatile Perception

Wentao Xiang, Haoxian Tan, Cong Wei et al.

ICCV 2025 • poster • arXiv:2507.16213

#21946

Beyond Single Images: Retrieval Self-Augmented Unsupervised Camouflaged Object Detection

Ji Du, Xin Wang, Fangwei Hao et al.

ICCV 2025 • poster • arXiv:2510.18437

#21947

Pseudo-Riemannian Graph Transformer

Viet Quan Le, Cuong Viet Ta

NEURIPS 2025 • poster

#21948

Modeling Saliency Dataset Bias

Matthias Kümmerer, Harneet Singh Khanuja, Matthias Bethge

ICCV 2025 • highlight • arXiv:2505.10169

#21949

VLR-Driver: Large Vision-Language-Reasoning Models for Embodied Autonomous Driving

Fanjie Kong, Yitong Li, Weihuang Chen et al.

ICCV 2025 • poster

#21950

Vid-Group: Temporal Video Grounding Pretraining from Unlabeled Videos in the Wild

Peijun Bao, Chenqi Kong, Siyuan Yang et al.

ICCV 2025 • poster

#21951

CARIM: Caption-Based Autonomous Driving Scene Retrieval via Inclusive Text Matching

Minjoo Ki, Dae Jung Kim, Kisung Kim et al.

ICCV 2025 • poster

#21952

Knowledge Transfer from Interaction Learning

Yilin Gao, Kangyi Chen, Zhongxing Peng et al.

ICCV 2025 • poster • arXiv:2509.18733

#21953

WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction

Richard Liu, Daniel Fu, Noah Tan et al.

ICCV 2025 • poster • arXiv:2505.04813

#21954

Temperature in Cosine-based Softmax Loss

Takumi Kobayashi

ICCV 2025 • poster

#21955

Transformer Key-Value Memories Are Nearly as Interpretable as Sparse Autoencoders

Mengyu Ye, Jun Suzuki, Tatsuro Inaba et al.

NEURIPS 2025 • poster • arXiv:2510.22332

#21956

Multi-modal Segment Anything Model for Camouflaged Scene Segmentation

Guangyu Ren, Hengyan Liu, Michalis Lazarou et al.

ICCV 2025 • poster

#21957

WeaveSeg: Iterative Contrast-weaving and Spectral Feature-refining for Nuclei Instance Segmentation

Jiajia Li, Huisi Wu, Jing Qin

ICCV 2025 • highlight

#21958

DisTime: Distribution-based Time Representation for Video Large Language Models

Yingsen Zeng, Zepeng Huang, Yujie Zhong et al.

ICCV 2025 • poster • arXiv:2505.24329

#21959

Synthesizing Near-Boundary OOD Samples for Out-of-Distribution Detection

Jinglun Li, Kaixun Jiang, Zhaoyu Chen et al.

ICCV 2025 • highlight • arXiv:2507.10225

#21960

Cassic: Towards Content-Adaptive State-Space Models for Learned Image Compression

Shiyu Qin, Jinpeng Wang, Yimin Zhou et al.

ICCV 2025 • poster

#21961

SpectralAR: Spectral Autoregressive Visual Generation

Yuanhui Huang, Weiliang Chen, Wenzhao Zheng et al.

ICCV 2025 • poster • arXiv:2506.10962

#21962

Bridging the Gap between Brain and Machine in Interpreting Visual Semantics: Towards Self-adaptive Brain-to-Text Decoding

Jiaxuan Chen, Yu Qi, Yueming Wang et al.

ICCV 2025 • poster

#21963

Boosting Adversarial Transferability via Negative Hessian Trace Regularization

Yunfei Long, Zilin Tian, Liguo Zhang et al.

ICCV 2025 • poster

#21964

AcZeroTS: Active Learning for Zero-shot Tissue Segmentation in Pathology Images

Jiao Tang, Junjie Zhou, Bo Qian et al.

ICCV 2025 • poster

#21965

OneGT: One-Shot Geometry-Texture Neural Rendering for Head Avatars

Jinshu Chen, Bingchuan Li, Fan Zhang et al.

ICCV 2025 • poster

#21966

On the sample complexity of semi-supervised multi-objective learning

Tobias Wegel, Geelon So, Junhyung Park et al.

NEURIPS 2025 • spotlight • arXiv:2508.17152

#21967

Unsupervised Visible-Infrared Person Re-identification under Unpaired Settings

Haoyu Yao, Bin Yang, Wenke Huang et al.

ICCV 2025 • poster

#21968

Adaptive Prompt Learning via Gaussian Outlier Synthesis for Out-of-distribution Detection

Yongkang Zhang, Dongyu She, Zhong Zhou

ICCV 2025 • poster

#21969

Auto-Controlled Image Perception in MLLMs via Visual Perception Tokens

Runpeng Yu, Xinyin Ma, Xinchao Wang

ICCV 2025 • poster

#21970

Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions

ZiYi Dong, Chengxing Zhou, Weijian Deng et al.

ICCV 2025 • poster • arXiv:2504.21292

#21971

Ultra-Precision 6DoF Pose Estimation Using 2-D Interpolated Discrete Fourier Transform

Guowei Shi, Zian Mao, Peisen Huang

ICCV 2025 • poster

#21972

Prototypes are Balanced Units for Efficient and Effective Partially Relevant Video Retrieval

WonJun Moon, Cheol-Ho Cho, Woojin Jun et al.

ICCV 2025 • poster • arXiv:2504.13035

#21973

Information-theoretic Generalization Analysis for VQ-VAEs: A Role of Latent Variables

Futoshi Futami, Masahiro Fujisawa

NEURIPS 2025 • poster • arXiv:2505.19470

#21974

A Differentiable Wave Optics Model for End-to-End Computational Imaging System Optimization

Chi-Jui Ho, Yash Belhe, Steve Rotenberg et al.

ICCV 2025 • poster • arXiv:2412.09774

#21975

Exploring Probabilistic Modeling Beyond Domain Generalization for Semantic Segmentation

I-Hsiang Chen, Hua-En Chang, Wei-Ting Chen et al.

ICCV 2025 • poster • arXiv:2507.21367

#21976

AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation

Haifeng Zhong, Fan Tang, Zhuo Chen et al.

ICCV 2025 • poster

#21977

DisCo: Towards Distinct and Coherent Visual Encapsulation in Video MLLMs

Jiahe Zhao, Rongkun Zheng, Yi Wang et al.

ICCV 2025 • poster • arXiv:2507.10302

#21978

OCK: Unsupervised Dynamic Video Prediction with Object-Centric Kinematics

YeonJi Song, Jaein Kim, Suhyung Choi et al.

ICCV 2025 • poster • arXiv:2404.18423

#21979

Contextual Dynamic Pricing with Heterogeneous Buyers

Thodoris Lykouris, Sloan Nietert, Princewill Okoroafor et al.

NEURIPS 2025 • poster • arXiv:2512.09513

#21980

Prompt Guidance and Human Proximal Perception for HOT Prediction with Regional Joint Loss

Yuxiao Wang, Yu Lei, Zhenao Wei et al.

ICCV 2025 • poster • arXiv:2507.01630

#21981

RA-BUSSeg: Relation-aware Semi-supervised Breast Ultrasound Image Segmentation via Adjacent Propagation and Cross-layer Alignment

Wanting Zhang, Zhenhui Ding, Guilian Chen et al.

ICCV 2025 • poster

#21982

Hierarchical Event Memory for Accurate and Low-latency Online Video Temporal Grounding

Minghang Zheng, Yuxin Peng, Benyuan Sun et al.

ICCV 2025 • poster • arXiv:2508.04546

#21983

Coupling the Generator with Teacher for Effective Data-Free Knowledge Distillation

Xu Chen, Yang Li, Yahong Han et al.

ICCV 2025 • poster

#21984

Towards a Universal Image Degradation Model via Content-Degradation Disentanglement

Wenbo Yang, Zhongling Wang, Zhou Wang

ICCV 2025 • poster • arXiv:2505.12860

#21985

Intra-view and Inter-view Correlation Guided Multi-view Novel Class Discovery

Xinhang Wan, Jiyuan Liu, Qian Qu et al.

ICCV 2025 • poster • arXiv:2507.12029

#21986

HUST: High-Fidelity Unbiased Skin Tone Estimation via Texture Quantization

Zimin Ran, Xingyu Ren, Xiang An et al.

ICCV 2025 • poster

#21987

Few-Shot Pattern Detection via Template Matching and Regression

Eunchan Jo, Dahyun Kang, Sanghyun Kim et al.

ICCV 2025 • highlight • arXiv:2508.17636

#21988

Know Your Attention Maps: Class-specific Token Masking for Weakly Supervised Semantic Segmentation

Joëlle Hanna, Damian Borth

ICCV 2025 • poster • arXiv:2507.06848

#21989

Provably Efficient RL under Episode-Wise Safety in Constrained MDPs with Linear Function Approximation

Toshinori Kitamura, Arnob Ghosh, Tadashi Kozuno et al.

NEURIPS 2025 • spotlight • arXiv:2502.10138

#21990

Structure-Guided Diffusion Models for High-Fidelity Portrait Shadow Removal

Wanchang Yu, Qing Zhang, Rongjia Zheng et al.

ICCV 2025 • poster • arXiv:2507.04692

#21991

FreeDNA: Endowing Domain Adaptation of Diffusion-Based Dense Prediction with Training-Free Domain Noise Alignment

Hang Xu, Jie Huang, Linjiang Huang et al.

ICCV 2025 • poster • arXiv:2506.22509

#21992

ProbMED: A Probabilistic Framework for Medical Multimodal Binding

Yuan Gao, Sangwook Kim, Jianzhong You et al.

ICCV 2025 • poster • arXiv:2509.25711

#21993

DecAD: Decoupling Anomalies in Latent Space for Multi-Class Unsupervised Anomaly Detection

Xiaolei Wang, Xiaoyang Wang, Huihui Bai et al.

ICCV 2025 • poster

#21994

Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective

Yingyu Liang, Zhizhou Sha, Zhenmei Shi et al.

ICCV 2025 • poster • arXiv:2405.16418

#21995

SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction

Enrico Pallotta, Sina Mokhtarzadeh Azar, Shuai Li et al.

CVPR 2025 • poster • arXiv:2503.18933

#21996

MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs

Jiawei Mao, Yuhan Wang, Yucheng Tang et al.

ICCV 2025 • poster • arXiv:2504.06897

#21997

FDPT: Federated Discrete Prompt Tuning for Black-Box Visual-Language Models

Jiaqi Wu, Simin Chen, Jing Tang et al.

ICCV 2025 • poster

#21998

STDDNet: Harnessing Mamba for Video Polyp Segmentation via Spatial-aligned Temporal Modeling and Discriminative Dynamic Representation Learning

Guilian Chen, Huisi Wu, Jing Qin

ICCV 2025 • poster

#21999

CATP-LLM: Empowering Large Language Models for Cost-Aware Tool Planning

Duo Wu, Jinghe Wang, Yuan Meng et al.

ICCV 2025 • poster • arXiv:2411.16313

#22000

Dynamic Group Detection using VLM-augmented Temporal Groupness Graph

Kaname Yokoyama, Chihiro Nakatani, Norimichi Ukita

ICCV 2025 • poster • arXiv:2509.04758
