Most Cited 2025 "linear rewards" Papers

22,274 papers found • Page 34 of 112

#6601

Constant Bit-size Transformers Are Turing Complete

Qian Li, Yuyi Wang

NEURIPS 2025 • oral • arXiv:2506.12027

citations

#6602

Towards foundational LiDAR world models with efficient latent flow matching

Tianran Liu, Shengwen Zhao, Nicholas Rhinehart

NEURIPS 2025 • arXiv:2506.23434

citations

#6603

Towards Doctor-Like Reasoning: Medical RAG Fusing Knowledge with Patient Analogy through Textual Gradients

Yuxing Lu, Gecheng Fu, Wei Wu et al.

NEURIPS 2025

citations

#6604

RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes

Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.

ICCV 2025 • arXiv:2506.01379

citations

#6605

Logits DeConfusion with CLIP for Few-Shot Learning

Shuo Li, Fang Liu, Zehua Hao et al.

CVPR 2025 • arXiv:2504.12104

citations

#6606

ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback

Litao Guo, Xinli Xu, Luozhou Wang et al.

NEURIPS 2025 • arXiv:2505.17908

citations

#6607

Doubly Robust Alignment for Large Language Models

Erhan Xu, Kai Ye, Hongyi Zhou et al.

NEURIPS 2025 • arXiv:2506.01183

citations

#6608

UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting

Ziyi Wang, Yanran Zhang, Jie Zhou et al.

CVPR 2025 • arXiv:2506.09952

citations

#6609

Encoder-Decoder Diffusion Language Models for Efficient Training and Inference

Marianne Arriola, Yair Schiff, Hao Phung et al.

NEURIPS 2025 • arXiv:2510.22852

citations

#6610

EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data

Ryan Punamiya, Dhruv Patel, Patcharapong Aphiwetsa et al.

NEURIPS 2025 • arXiv:2509.19626

citations

#6611

Golden Cudgel Network for Real-Time Semantic Segmentation

Guoyu Yang, Yuan Wang, Daming Shi et al.

CVPR 2025 • arXiv:2503.03325

citations

#6612

PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring

Wang, Xiao Yang, Qingyong Hu et al.

NEURIPS 2025 • arXiv:2507.19172

citations

#6613

From Elements to Design: A Layered Approach for Automatic Graphic Design Composition

Jiawei Lin, Shizhao Sun, Danqing Huang et al.

CVPR 2025 • arXiv:2412.19712

citations

#6614

Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation

Edward Loo, Jiacheng Deng

CVPR 2025 • arXiv:2506.17891

citations

#6615

CAT: Content-Adaptive Image Tokenization

Junhong Shen, Kushal Tirumala, Michihiro Yasunaga et al.

NEURIPS 2025 • arXiv:2501.03120

citations

#6616

ZPressor: Bottleneck-Aware Compression for Scalable Feed-Forward 3DGS

Weijie Wang, Donny Y. Chen, Zeyu Zhang et al.

NEURIPS 2025 • arXiv:2505.23734

citations

#6617

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Bingquan Dai, Luo Li, Qihong Tang et al.

NEURIPS 2025 • arXiv:2508.14879

citations

#6618

CoMatcher: Multi-View Collaborative Feature Matching

Jintao Zhang, Zimin Xia, Mingyue Dong et al.

CVPR 2025 • arXiv:2504.01872

citations

#6619

GoRA: Gradient-driven Adaptive Low Rank Adaptation

Haonan He, Peng Ye, Yuchen Ren et al.

NEURIPS 2025 • arXiv:2502.12171

citations

#6620

SEEA-R1: Tree-Structured Reinforcement Fine-Tuning for Self-Evolving Embodied Agents

Wanxin Tian, Shijie Zhang, Kevin Zhang et al.

NEURIPS 2025 • arXiv:2506.21669

citations

#6621

VALLR: Visual ASR Language Model for Lip Reading

Marshall Thomas, Edward Fish, Richard Bowden

ICCV 2025 • arXiv:2503.21408

citations

#6622

RePerformer: Immersive Human-centric Volumetric Videos from Playback to Photoreal Reperformance

Yuheng Jiang, Zhehao Shen, Chengcheng Guo et al.

CVPR 2025 • arXiv:2503.12242

citations

#6623

Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers

Kazuki Irie, Morris Yau, Samuel J Gershman

NEURIPS 2025 • arXiv:2506.00744

citations

#6624

IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement

Zhihao Shi, Dong Huo, Yuhongze Zhou et al.

CVPR 2025 • arXiv:2503.04501

citations

#6625

7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting

Zhongpai Gao, Benjamin Planche, Meng Zheng et al.

ICCV 2025 • arXiv:2503.07946

citations

#6626

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Ho Kei Cheng, Alex Schwing

ICCV 2025 • arXiv:2503.10636

citations

#6627

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025 • arXiv:2411.12858

citations

#6628

AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation

Qingqiu Li, Zihang Cui, Seongsu Bae et al.

NEURIPS 2025 • arXiv:2505.02830

citations

#6629

Scaling Speculative Decoding with Lookahead Reasoning

Yichao Fu, Rui Ge, Zelei Shao et al.

NEURIPS 2025 • arXiv:2506.19830

citations

#6630

DMWM: Dual-Mind World Model with Long-Term Imagination

Lingyi Wang, Rashed Shelim, Walid Saad et al.

NEURIPS 2025 • spotlight • arXiv:2502.07591

citations

#6631

Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning

Xingjian Ran, Yixuan Li, Linning Xu et al.

NEURIPS 2025 • arXiv:2506.05341

citations

#6632

AD-GS: Object-Aware B-Spline Gaussian Splatting for Self-Supervised Autonomous Driving

Jiawei Xu, Kai Deng, Zexin Fan et al.

ICCV 2025 • arXiv:2507.12137

citations

#6633

TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

Jialin Chen, Ziyu Zhao, Gaukhar Nurbek et al.

NEURIPS 2025 • oral • arXiv:2506.09114

citations

#6634

LayerAnimate: Layer-level Control for Animation

Yuxue Yang, Lue Fan, Zuzeng Lin et al.

ICCV 2025 • arXiv:2501.08295

citations

#6635

Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps

Chong Cheng, Sicheng Yu, Zijian Wang et al.

ICCV 2025 • arXiv:2507.03737

citations

#6636

GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation

Ruihai Wu, Ziyu Zhu, Yuran Wang et al.

CVPR 2025 • arXiv:2503.09243

citations

#6637

Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification

S P Sharan, Minkyu Choi, Sahil Shah et al.

CVPR 2025 • arXiv:2411.16718

citations

#6638

FAIR Universe HiggsML Uncertainty Dataset and Competition

Wahid Bhimji, Ragansu Chakkappai, Po-Wen Chang et al.

NEURIPS 2025 • arXiv:2410.02867

citations

#6639

Where, What, Why: Towards Explainable Driver Attention Prediction

Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.

ICCV 2025 • highlight • arXiv:2506.23088

citations

#6640

ROSE: Remove Objects with Side Effects in Videos

Chenxuan Miao, Yutong Feng, Jianshu Zeng et al.

NEURIPS 2025 • arXiv:2508.18633

citations

#6641

Scaling Physical Reasoning with the PHYSICS Dataset

Shenghe Zheng, Qianjia Cheng, Junchi Yao et al.

NEURIPS 2025 • arXiv:2506.00022

citations

#6642

Generative Graph Pattern Machine

Zehong Wang, Zheyuan Zhang, Tianyi Ma et al.

NEURIPS 2025 • arXiv:2505.16130

citations

#6643

Distillation Robustifies Unlearning

Bruce W. Lee, Addie Foote, Alex Infanger et al.

NEURIPS 2025 • spotlight • arXiv:2506.06278

citations

#6644

Good, Cheap, and Fast: Overfitted Image Compression with Wasserstein Distortion

Jona Ballé, Luca Versari, Emilien Dupont et al.

CVPR 2025 • highlight • arXiv:2412.00505

citations

#6645

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Qizhou Chen, Chengyu Wang, Dakan Wang et al.

CVPR 2025 • arXiv:2411.15432

citations

#6646

LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions

Hadi Askari, Shivanshu Gupta, Fei Wang et al.

NEURIPS 2025 • arXiv:2505.23811

citations

#6647

CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving

Rui Song, Chenwei Liang, Yan Xia et al.

ICCV 2025 • arXiv:2503.06744

citations

#6648

TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation

Ruineng Li, Daitao Xing, Huiming Sun et al.

CVPR 2025 • arXiv:2504.08181

citations

#6649

Augmented Deep Contexts for Spatially Embedded Video Coding

Yifan Bian, Chuanbo Tang, Li Li et al.

CVPR 2025 • highlight • arXiv:2505.05309

citations

#6650

LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition

Jinghan You, Shanglin Li, Yuanrui Sun et al.

ICCV 2025 • highlight • arXiv:2501.13420

citations

#6651

Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection

Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander

CVPR 2025 • arXiv:2503.23220

citations

#6652

CODA: Repurposing Continuous VAEs for Discrete Tokenization

Zeyu Liu, Zanlin Ni, Yeguo Hua et al.

ICCV 2025 • arXiv:2503.17760

citations

#6653

Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search

Haoran Sun, Yankai Jiang, Wenjie Lou et al.

NEURIPS 2025 • arXiv:2506.16962

citations

#6654

4DTAM: Non-Rigid Tracking and Mapping via Dynamic Surface Gaussians

Hidenobu Matsuki, Gwangbin Bae, Andrew J. Davison

CVPR 2025 • arXiv:2505.22859

citations

#6655

Probing Equivariance and Symmetry Breaking in Convolutional Networks

Sharvaree Vadgama, Mohammad Islam, Domas Buracas et al.

NEURIPS 2025 • arXiv:2501.01999

citations

#6656

SymRTLO: Enhancing RTL Code Optimization with LLMs and Neuron-Inspired Symbolic Reasoning

Yiting Wang, Wanghao Ye, Ping Guo et al.

NEURIPS 2025 • arXiv:2504.10369

citations

#6657

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Fucai Ke, Vijay Kumar B G, Xingjian Leng et al.

ICCV 2025 • arXiv:2503.19263

citations

#6658

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025 • arXiv:2505.19618

citations

#6659

Uncertain Multimodal Intention and Emotion Understanding in the Wild

Qu Yang, QingHongYa Shi, Tongxin Wang et al.

CVPR 2025

citations

#6660

FinMMR: Make Financial Numerical Reasoning More Multimodal, Comprehensive, and Challenging

Zichen Tang, Haihong E, Jiacheng Liu et al.

ICCV 2025 • arXiv:2508.04625

citations

#6661

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Zhiqi Ge, Juncheng Li, Xinglei Pang et al.

ICCV 2025 • arXiv:2412.10342

citations

#6662

TaDiCodec: Text-aware Diffusion Speech Tokenizer for Speech Language Modeling

Yuancheng Wang, Dekun Chen, Xueyao Zhang et al.

NEURIPS 2025 • arXiv:2508.16790

citations

#6663

DISCOVER: Automated Curricula for Sparse-Reward Reinforcement Learning

Leander Diaz-Bone, Marco Bagatella, Jonas Hübotter et al.

NEURIPS 2025 • arXiv:2505.19850

citations

#6664

DeRIS: Decoupling Perception and Cognition for Enhanced Referring Image Segmentation through Loopback Synergy

Ming Dai, Wenxuan Cheng, Jiang-Jiang Liu et al.

ICCV 2025 • arXiv:2507.01738

citations

#6665

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 • highlight • arXiv:2412.00782

citations

#6666

COAP: Memory-Efficient Training with Correlation-Aware Gradient Projection

Jinqi Xiao, Shen Sang, Tiancheng Zhi et al.

CVPR 2025 • arXiv:2412.00071

citations

#6667

UIBDiffusion: Universal Imperceptible Backdoor Attack for Diffusion Models

Yuning Han, Bingyin Zhao, Rui Chu et al.

CVPR 2025 • highlight • arXiv:2412.11441

citations

#6668

ReWind: Understanding Long Videos with Instructed Learnable Memory

Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.

CVPR 2025 • arXiv:2411.15556

citations

#6669

Differentially Private Fine-Tuning of Diffusion Models

Yu-Lin Tsai, Yizhe Li, Zekai Chen et al.

ICCV 2025 • arXiv:2406.01355

citations

#6670

Breaking the Frozen Subspace: Importance Sampling for Low-Rank Optimization in LLM Pretraining

Haochen Zhang, Junze Yin, Guanchu Wang et al.

NEURIPS 2025 • arXiv:2502.05790

citations

#6671

Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models

Zhen Zeng, Leijiang Gu, Xun Yang et al.

ICCV 2025 • arXiv:2411.12790

citations

#6672

Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method

Han Wang, Shengyang Li, Jian Yang et al.

ICCV 2025 • arXiv:2506.22027

citations

#6673

Angular Steering: Behavior Control via Rotation in Activation Space

Minh Hieu Vu, Tan Nguyen

NEURIPS 2025 • oral • arXiv:2510.26243

citations

#6674

Reducing Unimodal Bias in Multi-Modal Semantic Segmentation with Multi-Scale Functional Entropy Regularization

Xu Zheng, Yuanhuiyi Lyu, Lutao Jiang et al.

ICCV 2025 • arXiv:2505.06635

citations

#6675

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025 • arXiv:2504.13065

citations

#6676

Differentiable Generalized Sliced Wasserstein Plans

Laetitia Chapel, Romain Tavenard, Samuel Vaiter

NEURIPS 2025 • arXiv:2505.22049

citations

#6677

Future-Aware End-to-End Driving: Bidirectional Modeling of Trajectory Planning and Scene Evolution

Bozhou Zhang, Nan Song, Jingyu Li et al.

NEURIPS 2025 • oral • arXiv:2510.11092

citations

#6678

BF-STVSR: B-Splines and Fourier---Best Friends for High Fidelity Spatial-Temporal Video Super-Resolution

Eunjin Kim, Hyeonjin Kim, Kyong Hwan Jin et al.

CVPR 2025 • arXiv:2501.11043

citations

#6679

Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment

Ying Ba, Tianyu Zhang, Yalong Bai et al.

ICCV 2025 • arXiv:2507.19002

citations

#6680

Signal and Noise: A Framework for Reducing Uncertainty in Language Model Evaluation

David Heineman, Valentin Hofmann, Ian Magnusson et al.

NEURIPS 2025 • spotlight • arXiv:2508.13144

citations

#6681

DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs

Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.

NEURIPS 2025

citations

#6682

DeRS: Towards Extremely Efficient Upcycled Mixture-of-Experts Models

Yongqi Huang, Peng Ye, Chenyu Huang et al.

CVPR 2025 • arXiv:2503.01359

citations

#6683

What Matters in Data for DPO?

Yu Pan, Zhongze Cai, Huaiyang Zhong et al.

NEURIPS 2025 • arXiv:2508.18312

citations

#6684

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025 • arXiv:2504.01961

citations

#6685

Time-Aware Auto White Balance in Mobile Photography

Mahmoud Afifi, Luxi Zhao, Abhijith Punnappurath et al.

ICCV 2025 • arXiv:2504.05623

citations

#6686

Thinking in Character: Advancing Role-Playing Agents with Role-Aware Reasoning

Yihong Tang, Kehai Chen, Muyun Yang et al.

NEURIPS 2025 • arXiv:2506.01748

citations

#6687

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

CVPR 2025 • highlight • arXiv:2411.15678

citations

#6688

Alternating Gradient Flows: A Theory of Feature Learning in Two-layer Neural Networks

Daniel Kunin, Giovanni Luca Marchetti, Feng Chen et al.

NEURIPS 2025 • arXiv:2506.06489

citations

#6689

OmniVCus: Feedforward Subject-driven Video Customization with Multimodal Control Conditions

Yuanhao Cai, He Zhang, Xi Chen et al.

NEURIPS 2025 • oral • arXiv:2506.23361

citations

#6690

B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens

Zhuqiang Lu, Zhenfei Yin, Mengwei He et al.

ICCV 2025 • arXiv:2412.09919

citations

#6691

RIGNO: A Graph-based Framework For Robust And Accurate Operator Learning For PDEs On Arbitrary Domains

Sepehr Mousavi, Shizheng Wen, Levi Lingsch et al.

NEURIPS 2025 • oral • arXiv:2501.19205

citations

#6692

From Easy to Hard: Progressive Active Learning Framework for Infrared Small Target Detection with Single Point Supervision

Chuang Yu, Jinmiao Zhao, Yunpeng Liu et al.

ICCV 2025 • arXiv:2412.11154

citations

#6693

Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting

ChengAo Shen, Wenchao Yu, Ziming Zhao et al.

NEURIPS 2025 • arXiv:2505.24003

citations

#6694

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025 • arXiv:2407.07174

citations

#6695

Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime

Amit Attia, Matan Schliserman, Uri Sherman et al.

NEURIPS 2025 • arXiv:2507.11274

citations

#6696

Zeroth-Order Fine-Tuning of LLMs in Random Subspaces

Ziming Yu, Pan Zhou, Sike Wang et al.

ICCV 2025 • arXiv:2410.08989

citations

#6697

MedSegFactory: Text-Guided Generation of Medical Image-Mask Pairs

Jiawei Mao, Yuhan Wang, Yucheng Tang et al.

ICCV 2025 • arXiv:2504.06897

citations

#6698

"Principal Components" Enable A New Language of Images

Xin Wen, Bingchen Zhao, Ismail Elezi et al.

ICCV 2025

citations

#6699

Distributionally Robust Learning for Multi-source Unsupervised Domain Adaptation

Zhenyu Wang, Peter Bühlmann, Zijian Guo

NEURIPS 2025 • arXiv:2309.02211

citations

#6700

PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning

Yan Zhang, Yao Feng, Alpár Cseke et al.

ICCV 2025 • arXiv:2503.17544

citations

#6701

Emulating Self-attention with Convolution for Efficient Image Super-Resolution

Dongheon Lee, Seokju Yun, Youngmin Ro

ICCV 2025 • highlight • arXiv:2503.06671

citations

#6702

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Xianhang Li, Yanqing Liu, Haoqin Tu et al.

ICCV 2025 • arXiv:2505.04601

citations

#6703

Patch Matters: Training-free Fine-grained Image Caption Enhancement via Local Perception

Ruotian Peng, Haiying He, Yake Wei et al.

CVPR 2025 • arXiv:2504.06666

citations

#6704

EVOS: Efficient Implicit Neural Training via EVOlutionary Selector

Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.

CVPR 2025 • arXiv:2412.10153

citations

#6705

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, Yixuan Liu, Takashi Isobe et al.

CVPR 2025 • highlight • arXiv:2412.19637

citations

#6706

SVIP: Semantically Contextualized Visual Patches for Zero-Shot Learning

Zhi Chen, Zecheng Zhao, Jingcai Guo et al.

ICCV 2025 • arXiv:2503.10252

citations

#6707

Unifying Re-Identification, Attribute Inference, and Data Reconstruction Risks in Differential Privacy

Bogdan Kulynych, Juan Gomez, Georgios Kaissis et al.

NEURIPS 2025 • arXiv:2507.06969

citations

#6708

Learning Interpretable Queries for Explainable Image Classification with Information Pursuit

Stefan Kolek, Aditya Chattopadhyay, Kwan Ho Ryan Chan et al.

ICCV 2025 • arXiv:2312.11548

citations

#6709

Motion Modes: What Could Happen Next?

Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.

CVPR 2025 • arXiv:2412.00148

citations

#6710

Large Language Models Think Too Fast To Explore Effectively

Lan Pan, Hanbo Xie, Robert Wilson

NEURIPS 2025 • arXiv:2501.18009

citations

#6711

Decoupling Fine Detail and Global Geometry for Compressed Depth Map Super-Resolution

Huan Zheng, Wencheng Han, Jianbing Shen

CVPR 2025 • arXiv:2411.03239

citations

#6712

TurboReg: TurboClique for Robust and Efficient Point Cloud Registration

Shaocheng Yan, Pengcheng Shi, Zhenjun Zhao et al.

ICCV 2025 • arXiv:2507.01439

citations

#6713

Robust Machine Unlearning for Quantized Neural Networks via Adaptive Gradient Reweighting with Similar Labels

Yujia Tong, Yuze Wang, Jingling Yuan et al.

ICCV 2025 • arXiv:2503.13917

citations

#6714

ReAL-AD: Towards Human-Like Reasoning in End-to-End Autonomous Driving

Yuhang Lu, Jiadong Tu, Yuexin Ma et al.

ICCV 2025 • arXiv:2507.12499

citations

#6715

Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration

Haipeng Fang, Sheng Tang, Juan Cao et al.

CVPR 2025 • arXiv:2505.11707

citations

#6716

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng, Jiajun Deng, Xinran Zhang et al.

ICCV 2025 • arXiv:2505.23044

citations

#6717

Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection

Herun Wan, Jiaying Wu, Minnan Luo et al.

NEURIPS 2025 • arXiv:2506.02350

citations

#6718

mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework

Bingyi Liu, Jian Teng, Hongfei Xue et al.

ICCV 2025 • arXiv:2501.12263

citations

#6719

Distilled Prompt Learning for Incomplete Multimodal Survival Prediction

Yingxue Xu, Fengtao Zhou, Chenyu Zhao et al.

CVPR 2025 • arXiv:2503.01653

citations

#6720

Rethinking Bimanual Robotic Manipulation: Learning with Decoupled Interaction Framework

Jian-Jian Jiang, Xiao-Ming Wu, Yi-Xiang He et al.

ICCV 2025 • arXiv:2503.09186

citations

#6721

DynamicFace: High-Quality and Consistent Face Swapping for Image and Video using Composable 3D Facial Priors

Runqi Wang, Yang Chen, Sijie Xu et al.

ICCV 2025 • arXiv:2501.08553

citations

#6722

TokenUnify: Scaling Up Autoregressive Pretraining for Neuron Segmentation

Yinda Chen, Haoyuan Shi, Xiaoyu Liu et al.

ICCV 2025 • arXiv:2405.16847

citations

#6723

Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis

Zixuan Wang, Duo Peng, Feng Chen et al.

CVPR 2025 • arXiv:2504.01515

citations

#6724

LeanVAE: An Ultra-Efficient Reconstruction VAE for Video Diffusion Models

Yu Cheng, Fajie Yuan

ICCV 2025 • arXiv:2503.14325

citations

#6725

EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World?

Yuqian Yuan, Ronghao Dang, Long Li et al.

NEURIPS 2025 • oral • arXiv:2506.05287

citations

#6726

Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers

Zhengliang Shi, Lingyong Yan, Dawei Yin et al.

NEURIPS 2025 • arXiv:2505.20128

citations

#6727

Skip-Vision: Efficient and Scalable Acceleration of Vision-Language Models via Adaptive Token Skipping

Weili Zeng, Ziyuan Huang, Kaixiang Ji et al.

ICCV 2025 • arXiv:2503.21817

citations

#6728

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Kim Sung-Bin, Jeongsoo Choi, Puyuan Peng et al.

ICCV 2025 • arXiv:2504.02386

citations

#6729

Language-Guided Audio-Visual Learning for Long-Term Sports Assessment

Huangbiao Xu, Xiao Ke, Huanqi Wu et al.

CVPR 2025

citations

#6730

RainyGS: Efficient Rain Synthesis with Physically-Based Gaussian Splatting

Qiyu Dai, Xingyu Ni, Qianfan Shen et al.

CVPR 2025 • arXiv:2503.21442

citations

#6731

Denoising Functional Maps: Diffusion Models for Shape Correspondence

Aleksei Zhuravlev, Zorah Lähner, Vladislav Golyanik

CVPR 2025 • arXiv:2503.01845

citations

#6732

Textured 3D Regenerative Morphing with 3D Diffusion Prior

Songlin Yang, Yushi Lan, Honghua Chen et al.

ICCV 2025 • arXiv:2502.14316

citations

#6733

Snakes and Ladders: Two Steps Up for VideoMamba

Hui Lu, Albert Ali Salah, Ronald Poppe

ICCV 2025 • arXiv:2406.19006

citations

#6734

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Wenhao Wang, Yi Yang

ICCV 2025 • arXiv:2411.04709

citations

#6735

ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models

Zifu Wan, Ce Zhang, Silong Yong et al.

ICCV 2025 • arXiv:2507.00898

citations

#6736

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

Yuzhuo Dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025 • arXiv:2505.11182

citations

#6737

Dissecting Generalized Category Discovery: Multiplex Consensus under Self-Deconstruction

Luyao Tang, Kunze Huang, Yuxuan Yuan et al.

ICCV 2025 • highlight • arXiv:2508.10731

citations

#6738

Dual-Process Image Generation

Grace Luo, Jonathan Granskog, Aleksander Holynski et al.

ICCV 2025 • arXiv:2506.01955

citations

#6739

CL-Splats: Continual Learning of Gaussian Splatting with Local Optimization

Jan Ackermann, Jonas Kulhanek, Shengqu Cai et al.

ICCV 2025 • arXiv:2506.21117

citations

#6740

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation

Abduljalil Radman, Jorma Laaksonen

CVPR 2025

citations

#6741

Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction

Yunheng Li, Yuxuan Li, Quan-Sheng Zeng et al.

ICCV 2025 • arXiv:2412.06244

citations

#6742

Towards Unified and Lossless Latent Space for 3D Molecular Latent Diffusion Modeling

Yanchen Luo, Zhiyuan Liu, Yi Zhao et al.

NEURIPS 2025 • arXiv:2503.15567

citations

#6743

Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis

Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.

CVPR 2025 • arXiv:2501.09333

citations

#6744

SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting

Mengjiao Ma, Qi Ma, Yue Li et al.

NEURIPS 2025 • arXiv:2506.08710

citations

#6745

DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Runze Zhang, Guoguang Du, Xiaochuan Li et al.

ICCV 2025 • highlight • arXiv:2503.06053

citations

#6746

AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement

J Rosser, Jakob Foerster

NEURIPS 2025 • spotlight • arXiv:2502.00757

citations

#6747

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene

Xiao Chen, Tai Wang, Quanyi Li et al.

ICCV 2025 • arXiv:2505.20294

citations

#6748

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025 • arXiv:2505.04320

citations

#6749

SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization

Zhentao Tan, Ben Xue, Jian Jia et al.

ICCV 2025 • arXiv:2412.10443

citations

#6750

Scaling Offline RL via Efficient and Expressive Shortcut Models

Nicolas Espinosa-Dice, Yiyi Zhang, Yiding Chen et al.

NEURIPS 2025 • arXiv:2505.22866

citations

#6751

One is Plenty: A Polymorphic Feature Interpreter for Immutable Heterogeneous Collaborative Perception

Yuchen Xia, Quan Yuan, Guiyang Luo et al.

CVPR 2025 • arXiv:2411.16799

citations

#6752

Parameter Efficient Fine-tuning via Explained Variance Adaptation

Fabian Paischer, Lukas Hauzenberger, Thomas Schmied et al.

NEURIPS 2025 • arXiv:2410.07170

citations

#6753

Dense-SfM: Structure from Motion with Dense Consistent Matching

JongMin Lee, Sungjoo Yoo

CVPR 2025 • arXiv:2501.14277

citations

#6754

PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask

Jeongho Kim, Hoiyeong Jin, Sunghyun Park et al.

ICCV 2025 • arXiv:2412.16978

citations

#6755

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Kaining Ying, Henghui Ding, Guangquan Jie et al.

ICCV 2025 • arXiv:2507.22886

citations

#6756

On Denoising Walking Videos for Gait Recognition

Dongyang Jin, Chao Fan, Jingzhe Ma et al.

CVPR 2025 • arXiv:2505.18582

citations

#6757

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation

Jiahao Wang, Ning Kang, Lewei Yao et al.

ICCV 2025 • arXiv:2501.12976

citations

#6758

Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models

Quan Zhang, Jinwei Fang, Rui Yuan et al.

CVPR 2025 • arXiv:2411.08466

citations

#6759

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Zhigang Wang, Yifei Su, Chenhui Li et al.

ICCV 2025 • arXiv:2411.16253

citations

#6760

Behavior Injection: Preparing Language Models for Reinforcement Learning

Zhepeng Cen, Yihang Yao, William Han et al.

NEURIPS 2025 • arXiv:2505.18917

citations

#6761

SimMotionEdit: Text-Based Human Motion Editing with Motion Similarity Prediction

Zhengyuan Li, Kai Cheng, Anindita Ghosh et al.

CVPR 2025 • arXiv:2503.18211

citations

#6762

FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts

Heming Zou, Yunliang Zang, Wutong Xu et al.

NEURIPS 2025 • arXiv:2510.08396

citations

#6763

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025 • highlight • arXiv:2505.02178

citations

#6764

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu et al.

ICCV 2025 • arXiv:2507.02691

citations

#6765

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

Huanpeng Chu, Wei Wu, Guanyu Feng et al.

ICCV 2025 • arXiv:2508.16212

citations

#6766

Ψ-Sampler: Initial Particle Sampling for SMC-Based Inference-Time Reward Alignment in Score Models

Taehoon Yoon, Yunhong Min, Kyeongmin Yeo et al.

NEURIPS 2025 • spotlight • arXiv:2506.01320

citations

#6767

Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

Ao Ma, Jiasong Feng, Ke Cao et al.

ICCV 2025 • arXiv:2508.08949

citations

#6768

Augmented Mass-Spring Model for Real-Time Dense Hair Simulation

Jorge Herrera, Yi Zhou, Xin Sun et al.

ICCV 2025 • arXiv:2412.17144

citations

#6769

SpectralAR: Spectral Autoregressive Visual Generation

Yuanhui Huang, Weiliang Chen, Wenzhao Zheng et al.

ICCV 2025 • arXiv:2506.10962

citations

#6770

VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis

Zhifeng Wang, Renjiao Yi, Xin Wen et al.

CVPR 2025 • arXiv:2503.12758

citations

#6771

Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models

Shuyang Hao, Bryan Hooi, Jun Liu et al.

CVPR 2025 • arXiv:2411.18000

citations

#6772

ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation

Qizhen Lan, Qing Tian

ICCV 2025 • arXiv:2503.06307

citations

#6773

Harnessing Uncertainty-aware Bounding Boxes for Unsupervised 3D Object Detection

Ruiyang Zhang, Hu Zhang, Zhedong Zheng

ICCV 2025 • arXiv:2408.00619

citations

#6774

Beyond Scalars: Concept-Based Alignment Analysis in Vision Transformers

Johanna Vielhaben, Dilyara Bareeva, Jim Berend et al.

NEURIPS 2025 • arXiv:2412.06639

citations

#6775

Feature Coding in the Era of Large Models: Dataset, Test Conditions, and Benchmark

Changsheng Gao, Yifan Ma, Qiaoxi Chen et al.

ICCV 2025 • arXiv:2412.04307

citations

#6776

Edit360: 2D Image Edits to 3D Assets from Any Angle

Junchao Huang, Xinting Hu, Shaoshuai Shi et al.

ICCV 2025 • highlight • arXiv:2506.10507

citations

#6777

Towards Realistic Example-based Modeling via 3D Gaussian Stitching

Xinyu Gao, Ziyi Yang, Bingchen Gong et al.

CVPR 2025 • arXiv:2408.15708

citations

#6778

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

Guanning Zeng, Xiang Zhang, Zirui Wang et al.

ICCV 2025 • arXiv:2508.00728

citations

#6779

Exploring Simple Open-Vocabulary Semantic Segmentation

Zihang Lai

CVPR 2025 • arXiv:2401.12217

citations

#6780

Thought Communication in Multiagent Collaboration

Yujia Zheng, Zhuokai Zhao, Zijian Li et al.

NEURIPS 2025 • spotlight • arXiv:2510.20733

citations

#6781

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Yudong Jin, Sida Peng, Xuan Wang et al.

ICCV 2025 • arXiv:2507.13344

citations

#6782

WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images

Yansong Guo, Jie Hu, Yansong Qu et al.

ICCV 2025 • arXiv:2503.08407

citations

#6783

AdsQA: Towards Advertisement Video Understanding

Xinwei Long, Kai Tian, Peng Xu et al.

ICCV 2025 • arXiv:2509.08621

citations

#6784

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025 • arXiv:2503.17261

citations

#6785

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Xuran Ma, Yexin Liu, Yaofu Liu et al.

ICCV 2025 • arXiv:2504.03140

citations

#6786

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.

NEURIPS 2025 • spotlight • arXiv:2511.05664

citations

#6787

Towards In-the-wild 3D Plane Reconstruction from a Single Image

Jiachen Liu, Rui Yu, Sili Chen et al.

CVPR 2025 • highlight • arXiv:2506.02493

citations

#6788

Lift3D Policy: Lifting 2D Foundation Models for Robust 3D Robotic Manipulation

Yueru Jia, Jiaming Liu, Sixiang Chen et al.

CVPR 2025

citations

#6789

GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting

Xiaobao Wei, Peng Chen, Guangyu Li et al.

ICCV 2025 • highlight • arXiv:2411.12981

citations

#6790

DreamFuse: Adaptive Image Fusion with Diffusion Transformer

Junjia Huang, Pengxiang Yan, Jiyang Liu et al.

ICCV 2025 • arXiv:2504.08291

citations

#6791

GIViC: Generative Implicit Video Compression

Ge Gao, Siyue Teng, Tianhao Peng et al.

ICCV 2025 • arXiv:2503.19604

citations

#6792

SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL

Yue Gong, Chuan Lei, Xiao Qin et al.

NEURIPS 2025 • arXiv:2506.04494

citations

#6793

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025 • highlight • arXiv:2411.19528

citations

#6794

EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds

Lu Chen, Yizhou Wang, Shixiang Tang et al.

ICCV 2025 • arXiv:2502.05857

citations

#6795

VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment

Darshana Saravanan, Varun Gupta, Darshan Singh S et al.

CVPR 2025 • arXiv:2406.10889

citations

#6796

Hierarchical Cross-modal Prompt Learning for Vision-Language Models

Hao Zheng, Shunzhi Yang, Zhuoxin He et al.

ICCV 2025 • arXiv:2507.14976

citations

#6797

IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation

Zijie Lin, Yang Zhang, Xiaoyan Zhao et al.

NEURIPS 2025 • arXiv:2506.13229

citations

#6798

Temporal Representation Alignment: Successor Features Enable Emergent Compositionality in Robot Instruction Following

Vivek Myers, Bill Zheng, Anca Dragan et al.

NEURIPS 2025 • oral • arXiv:2502.05454

citations

#6799

Self-Learning Hyperspectral and Multispectral Image Fusion via Adaptive Residual Guided Subspace Diffusion Model

Jian Zhu, He Wang, Yang Xu et al.

CVPR 2025 • arXiv:2505.11800

citations

#6800

DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving

Chen Shi, Shaoshuai Shi, Kehua Sheng et al.

ICCV 2025 • arXiv:2505.19239

citations
