2025 Papers
21,856 papers found • Page 425 of 438
Visual-Instructed Degradation Diffusion for All-in-One Image Restoration
Haina Qin, Wenyang Luo, Zewen Chen et al.
Visual Instruction Bottleneck Tuning
Changdae Oh, Jiatong Li, Shawn Im et al.
Visual Intention Grounding for Egocentric Assistants
Pengzhan Sun, Junbin Xiao, Tze Ho Elden Tse et al.
Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests
Fitim Abdullahu, Helmut Grabner
Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
Anand Bhattad, Konpat Preechakul, Alexei Efros
VisualLens: Personalization through Task-Agnostic Visual History
Wang Bill Zhu, Deqing Fu, Kai Sun et al.
Visual Lexicon: Rich Image Features in Language Space
XuDong Wang, Xingyi Zhou, Alireza Fathi et al.
Visually Consistent Hierarchical Image Classification
Seulki Park, Youren Zhang, Stella Yu et al.
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models
Donghoon Kim, Minji Bae, Kyuhong Shim et al.
Visual Modality Prompt for Adapting Vision-Language Object Detectors
Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning
Minheng Ni, YuTao Fan, Lei Zhang et al.
Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models
Zhen Zeng, Leijiang Gu, Xun Yang et al.
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam, Soowon Son, Zhan Xu et al.
Visual Perturbation for Text-Based Person Search
Pengcheng Zhang, Xiaohan Yu, Xiao Bai et al.
VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning
Yichao Liang, Nishanth Kumar, Hao Tang et al.
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang, Yuxi Zhou, Duo Peng et al.
Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective
Can Jin, Tianjin Huang, Yihua Zhang et al.
VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank
Tianhe Wu, Jian Zou, Jie Liang et al.
Visual Reinforcement Learning with Residual Action
Zhenxian Liu, Peixi Peng, Yonghong Tian
Visual Relation Diffusion for Human-Object Interaction Detection
Ping Cao, Yepeng Tang, Chunjie Zhang et al.
Visual Representation Learning through Causal Intervention for Controllable Image Editing
Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.
Visual-RFT: Visual Reinforcement Fine-Tuning
Ziyu Liu, Zeyi Sun, Yuhang Zang et al.
Visual Structures Help Visual Reasoning: Addressing the Binding Problem in LVLMs
Amirmohammad Izadi, Mohammadali Banayeeanzade, Fatemeh Askari et al.
Visual Surface Wave Elastography: Revealing Subsurface Physical Properties via Visible Surface Waves
Alexander Ogren, Berthy Feng, Jihoon Ahn et al.
Visual Sync: Multi-Camera Synchronization via Cross-View Object Motion
Shaowei Liu, David Yao, Saurabh Gupta et al.
Visual Test-time Scaling for GUI Agent Grounding
Tiange Luo, Lajanugen Logeswaran, Justin Johnson et al.
Visual Textualization for Image Prompted Object Detection
Yongjian Wu, Yang Zhou, Jiya Saiyin et al.
Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought
Zihui Cheng, Qiguang Chen, Xiao Xu et al.
Visuo-Tactile Feedback with Hand Outline Styles for Modulating Affective Roughness Perception
Minju Baeck, Yoonseok Shin, Dooyoung Kim et al.
VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
Chaoyou Fu, Haojia Lin, Xiong Wang et al.
VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model
Zuwei Long, Yunhang Shen, Chaoyou Fu et al.
ViTally Consistent: Scaling Biological Representation Learning for Cell Microscopy
Kian Kenyon-Dean, Zitong Jerry Wang, John Urbanik et al.
VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow
Ada Görgün, Bernt Schiele, Jonas Fischer
VITED: Video Temporal Evidence Distillation
Yujie Lu, Yale Song, Lorenzo Torresani et al.
ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers
Hanwen Cao, Haobo Lu, Xiaosen Wang et al.
ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models
Guoyizhe Wei, Rama Chellappa
VITRIX-CLIPIN: Enhancing Fine-Grained Visual Understanding in CLIP via Instruction-Editing Data and Long Captions
Ziteng Wang, Siqi Yang, Limeng Qiao et al.
VITRIX-UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao, Yiyang Gan, Bairui Wang et al.
ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads
Yifan Li, Xin Li, Tianqin Li et al.
ViUniT: Visual Unit Tests for More Robust Visual Programming
Artemis Panagopoulou, Honglu Zhou, Silvio Savarese et al.
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang, Sheng Miao, Bangbang Yang et al.
VividFace: A Robust and High-Fidelity Video Face Swapping Framework
Hao Shao, Shulun Wang, Yang Zhou et al.
VL2Lite: Task-Specific Knowledge Distillation from Large Vision-Language Models to Lightweight Networks
Jinseong Jang, Chunfei Ma, Byeongwon Lee
VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks
Shiduo Zhang, Zhe Xu, Peiju Liu et al.
VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching
Siyu Xu, Yunke Wang, Chenghao Xia et al.
VladVA: Discriminative Fine-tuning of LVLMs
Yassine Ouali, Adrian Bulat, Alexandros Xenos et al.
VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models
Chongkai Gao, Zixuan Liu, Zhenghao Chi et al.
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
Wei Zhao, Pengxiang Ding, Zhang Min et al.
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration
Dezhan Tu, Danylo Vashchilenko, Yuzhe Lu et al.
VLDrive: Vision-Augmented Lightweight MLLMs for Efficient Language-grounded Autonomous Driving
Ruifei Zhang, Wei Zhang, Xiao Tan et al.