2025 Papers

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.

AAAI 2025 • paper • arXiv:2405.18425
8 citations

ViiNeuS: Volumetric Initialization for Implicit Neural Surface Reconstruction of Urban Scenes with Limited Image Overlap

Hala Djeghim, Nathan Piasco, Moussab Bennehar et al.

CVPR 2025 • poster • arXiv:2403.10344
4 citations

ViKIENet: Towards Efficient 3D Object Detection with Virtual Key Instance Enhanced Network

Zhuochen Yu, Bijie Qiu, Andy W. H. Khong

CVPR 2025 • poster

VIKING: Deep variational inference with stochastic projections

Samuel Matthiesen, Hrittik Roy, Nicholas Krämer et al.

NeurIPS 2025 • poster • arXiv:2510.23684

VIKI‑R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

Li Kang, Xiufeng Song, Heng Zhou et al.

NeurIPS 2025 • poster • arXiv:2506.09049
8 citations

VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge

Vishwesh Nath, Wenqi Li, Dong Yang et al.

CVPR 2025 • highlight • arXiv:2411.12915
29 citations

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Yecheng Wu, Zhuoyang Zhang, Junyu Chen et al.

ICLR 2025 • poster • arXiv:2409.04429

ViLLa: Video Reasoning Segmentation with Large Language Model

Rongkun Zheng, Lu Qi, Xi Chen et al.

ICCV 2025 • poster • arXiv:2407.14500
16 citations

ViLU: Learning Vision-Language Uncertainties for Failure Prediction

Marc Lafon, Yannis Karmim, Julio Silva-Rodríguez et al.

ICCV 2025 • poster • arXiv:2507.07620
2 citations

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

Haidong Xu, Guangwei Xu, Zhedong Zheng et al.

NeurIPS 2025 • poster • arXiv:2508.12081
1 citation

ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba

Juncan Deng, Shuaiting Li, Zeyu Wang et al.

ICCV 2025 • poster • arXiv:2503.09509

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Silin Gao, Sheryl Mathew, Li Mi et al.

CVPR 2025 • poster • arXiv:2503.20871

Vinci: Deep Thinking in Text-to-Image Generation using Unified Model with Reinforcement Learning

Wang Lin, Wentao Hu, Liyu Jia et al.

NeurIPS 2025 • poster

VinePPO: Refining Credit Assignment in RL Training of LLMs

Amirhossein Kazemnejad, Milad Aghajohari, Eva Portelance et al.

ICML 2025 • poster • arXiv:2410.01679
48 citations

VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation

Saksham Singh Kushwaha, Yapeng Tian

CVPR 2025 • poster • arXiv:2412.10768
12 citations

Vintix: Action Model via In-Context Reinforcement Learning

Andrei Polubarov, Nikita Lyubaykin, Alexander Derevyagin et al.

ICML 2025 • poster • arXiv:2501.19400

VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things

Yaoyao Zhong, Mengshi Qi, Rui Wang et al.

AAAI 2025 • paper

VIPAMIN: Visual Prompt Initialization via Embedding Selection and Subspace Expansion

Jaekyun Park, Hye Won Chung

NeurIPS 2025 • poster • arXiv:2510.16446

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning

Taewhan Kim, Soeun Lee, Si-Woo Kim et al.

AAAI 2025 • paper • arXiv:2412.19289

VIPerson: Flexibly Generating Virtual Identity for Person Re-Identification

Xiao-Wen Zhang, Delong Zhang, Yi-Xing Peng et al.

ICCV 2025 • poster

V.I.P.: Iterative Online Preference Distillation for Efficient Video Diffusion Models

Jisoo Kim, Wooseok Seo, Junwan Kim et al.

ICCV 2025 • poster • arXiv:2508.03254

ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction

Yi Feng, Yu Han, Xijing Zhang et al.

AAAI 2025 • paper • arXiv:2412.11210
5 citations

VIP: Vision Instructed Pre-training for Robotic Manipulation

Zhuoling Li, LiangLiang Ren, Jinrong Yang et al.

ICML 2025 • poster • arXiv:2410.07169

VIRES: Video Instance Repainting via Sketch and Text Guided Generation

Shuchen Weng, Haojie Zheng, Peixuan Zhang et al.

CVPR 2025 • poster • arXiv:2411.16199
1 citation

Virtual Fitting Room: Generating Arbitrarily Long Videos of Virtual Try-On from a Single Image

Junkun Chen, Aayush Bansal, Minh Vo et al.

NeurIPS 2025 • oral • arXiv:2509.04450

Virtual Museum Tour Agent: Effects of Responsiveness and Awareness

Anant Upadhyay, Fu-Chia Yang, Christos Mousas

ISMAR 2025 • paper

Virtual Nodes Can Help: Tackling Distribution Shifts in Federated Graph Learning

Xingbo Fu, Zihan Chen, Yinhan He et al.

AAAI 2025 • paper • arXiv:2412.19229

Virtual Pass-through: Evaluating 3D Gaussian Splatting as an Alternative to Conventional Video Pass-through in Static Environments

Andy Schleising, Christian Kunert, Tobias Schwandt et al.

ISMAR 2025 • paper
1 citation

Virtual Roomie: Immersive Layout Co-design with a Virtual Agent

Angela L. Jimenez, Pedro Acevedo, Christos Mousas

ISMAR 2025 • paper

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Zi Liang, Qingqing Ye, Xuan Liu et al.

NeurIPS 2025 • spotlight

ViSAGe: Video-to-Spatial Audio Generation

Jaeyeon Kim, Heeseung Yun, Gunhee Kim

ICLR 2025 • oral • arXiv:2506.12199

Visceral Notices and Privacy Mechanisms for Eye Tracking in Augmented Reality

Nissi Otoo, Kailon Blue, G. Nikki Ramirez et al.

ISMAR 2025 • paper
1 citation

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Xueqing Wu, Yuheng Ding, Bingxuan Li et al.

CVPR 2025 • poster • arXiv:2412.02172
13 citations

VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction, Characterization and Recognition

Rahul Moorthy Mahesh, Jun-Jee Chao, Volkan Isler

NeurIPS 2025 • poster
2 citations

VisHall3D: Monocular Semantic Scene Completion from Reconstructing the Visible Regions to Hallucinating the Invisible Regions

Haoang Lu, Yuanqi Su, Xiaoning Zhang et al.

ICCV 2025 • poster • arXiv:2507.19188
1 citation

Vision and Language Synergy for Rehearsal Free Continual Learning

Muhammad Anwar Masum, Mahardhika Pratama, Savitha Ramasamy et al.

ICLR 2025 • poster

Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It

Yulu Qin, Dheeraj Varghese, Adam Dahlgren Lindström et al.

NeurIPS 2025 • oral • arXiv:2507.13328

VisionArena: 230k Real World User-VLM Conversations with Preference Labels

Christopher Chou, Lisa Dunlap, Wei-Lin Chiang et al.

CVPR 2025 • poster • arXiv:2412.08687
12 citations

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Jiaming Han, Hao Chen, Yang Zhao et al.

NeurIPS 2025 • poster • arXiv:2506.18898

Vision-aware Multimodal Prompt Tuning for Uploadable Multi-source Few-shot Domain Adaptation

Kuanghong Liu, Jin Wang, Kangjian He et al.

AAAI 2025 • paper • arXiv:2503.06106
2 citations

Vision-Based Generic Potential Function for Policy Alignment in Multi-Agent Reinforcement Learning

Hao Ma, Shijie Wang, Zhiqiang Pu et al.

AAAI 2025 • paper • arXiv:2502.13430

Vision-centric Token Compression in Large Language Model

Ling Xing, Alex Jinpeng Wang, Rui Yan et al.

NeurIPS 2025 • spotlight • arXiv:2502.00791
7 citations

Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations

Yudi Xie, Weichen Huang, Esther Alter et al.

ICLR 2025 • poster • arXiv:2412.09115
3 citations

Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation

Zheng Anlin, Xin Wen, Xuanyang Zhang et al.

NeurIPS 2025 • poster
8 citations

Vision Function Layer in Multimodal LLMs

Cheng Shi, Yizhou Yu, Sibei Yang

NeurIPS 2025 • poster • arXiv:2509.24791
3 citations

Vision Graph Prompting via Semantic Low-Rank Decomposition

Zixiang Ai, Zichen Liu, Jiahuan Zhou

ICML 2025 • poster • arXiv:2505.04121

Vision-Guided Action: Enhancing 3D Human Motion Prediction with Gaze-informed Affordance in 3D Scenes

Ting Yu, Yi Lin, Jun Yu et al.

CVPR 2025 • poster
1 citation

Vision-guided Text Mining for Unsupervised Cross-modal Hashing with Community Similarity Quantization

Haozhi Fan, Yuan Cao

AAAI 2025 • paper

Vision-Language Embodiment for Monocular Depth Estimation

Jinchang Zhang, Guoyu Lu

CVPR 2025 • poster • arXiv:2503.16535
3 citations

Vision-Language Gradient Descent-driven All-in-One Deep Unfolding Networks

Haijin Zeng, Xiangming Wang, Yongyong Chen et al.

CVPR 2025 • poster • arXiv:2503.16930
14 citations