Most Cited 2025 "linear rewards" Papers

22,274 papers found • Page 33 of 112

Filters:Most Cited 2025 linear rewards Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#6401

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.

CVPR 2025arXiv:2411.11909

citations

#6402

MI-DETR: An Object Detection Model with Multi-time Inquiries Mechanism

Zhixiong Nan, Xianghong Li, Tao Xiang et al.

CVPR 2025arXiv:2503.01463

citations

#6403

Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

Shuangyi Chen, Yuanxin Guo, Yue Ju et al.

NEURIPS 2025arXiv:2502.01755

citations

#6404

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

Zhuguanyu Wu, Jiayi Zhang, Jiaxin Chen et al.

CVPR 2025arXiv:2504.02508

citations

#6405

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Zeyi Huang, Yuyang Ji, Xiaofang Wang et al.

CVPR 2025arXiv:2501.04336

citations

#6406

Spatial Understanding from Videos: Structured Prompts Meet Simulation Data

Haoyu Zhang, Meng Liu, Zaijing Li et al.

NEURIPS 2025spotlightarXiv:2506.03642

citations

#6407

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Hyojin Bahng, Caroline Chan, Fredo Durand et al.

ICCV 2025arXiv:2506.02095

citations

#6408

Integral Imprecise Probability Metrics

Siu Lun (Alan) Chau, Michele Caprio, Krikamol Muandet

NEURIPS 2025arXiv:2505.16156

citations

#6409

Afterburner: Reinforcement Learning Facilitates Self-Improving Code Efficiency Optimization

Mingzhe Du, Anh Tuan Luu, Yue Liu et al.

NEURIPS 2025arXiv:2505.23387

citations

#6410

SubTrack++ : Gradient Subspace Tracking for Scalable LLM Training

Sahar Rajabi, Nayeema Nonta, Sirisha Rambhatla

NEURIPS 2025arXiv:2502.01586

citations

#6411

Co-op: Correspondence-based Novel Object Pose Estimation

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2025arXiv:2503.17731

citations

#6412

Language Driven Occupancy Prediction

Zhu Yu, Bowen Pang, Lizhe Liu et al.

ICCV 2025arXiv:2411.16072

citations

#6413

TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

Ruidong Chen, honglin guo, Lanjun Wang et al.

ICCV 2025arXiv:2503.07389

citations

#6414

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Feihong Yan, qingyan wei, Jiayi Tang et al.

ICCV 2025arXiv:2503.12450

citations

#6415

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang et al.

ICCV 2025arXiv:2502.03207

citations

#6416

OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images

Ziyue Huang, Yongchao Feng, Ziqi Liu et al.

ICCV 2025arXiv:2503.06146

citations

#6417

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.

CVPR 2025arXiv:2412.01822

citations

#6418

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Junli Liu, Qizhi Chen, Zhigang Wang et al.

ICCV 2025arXiv:2504.07836

citations

#6419

CWNet: Causal Wavelet Network for Low-Light Image Enhancement

Tongshun Zhang, Pingping Liu, Yubing Lu et al.

ICCV 2025arXiv:2507.10689

citations

#6420

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

Liuyi Wang, Xinyuan Xia, Hui Zhao et al.

ICCV 2025arXiv:2507.13019

citations

#6421

GARF: Learning Generalizable 3D Reassembly for Real-World Fractures

Sihang Li, Zeyu Jiang, Grace Chen et al.

ICCV 2025arXiv:2504.05400

citations

#6422

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches

Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif et al.

ICCV 2025arXiv:2311.12084

citations

#6423

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Yang LI, Jinglu Wang, Lei Chu et al.

ICCV 2025arXiv:2503.06235

citations

#6424

ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting

Ruijie Zhu, Mulin Yu, Linning Xu et al.

ICCV 2025arXiv:2507.15454

citations

#6425

TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Jinhao Duan, Fei Kong, Hao Cheng et al.

ICCV 2025

citations

#6426

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

Kaidong Zhang, Rongtao Xu, Ren Pengzhen et al.

ICCV 2025arXiv:2505.01709

citations

#6427

Can Generative Video Models Help Pose Estimation?

Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.

CVPR 2025highlightarXiv:2412.16155

citations

#6428

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025arXiv:2502.03629

citations

#6429

Enhancing 3D Reconstruction for Dynamic Scenes

Jisang Han, Honggyu An, Jaewoo Jung et al.

NEURIPS 2025oralarXiv:2504.06264

citations

#6430

Object-centric binding in Contrastive Language-Image Pretraining

Rim Assouel, Pietro Astolfi, Florian Bordes et al.

NEURIPS 2025arXiv:2502.14113

citations

#6431

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025arXiv:2507.21391

citations

#6432

AgroBench: Vision-Language Model Benchmark in Agriculture

Risa Shinoda, Nakamasa Inoue, Hirokatsu Kataoka et al.

ICCV 2025arXiv:2507.20519

citations

#6433

DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models

Radu Alexandru Rosu, Keyu Wu, Yao Feng et al.

CVPR 2025arXiv:2505.06166

citations

#6434

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

Zhengrui Guo, Conghao Xiong, Jiabo MA et al.

CVPR 2025arXiv:2411.14743

citations

#6435

Salvaging the Overlooked: Leveraging Class-Aware Contrastive Learning for Multi-Class Anomaly Detection

Lei Fan, Junjie Huang, Donglin Di et al.

ICCV 2025arXiv:2412.04769

citations

#6436

GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting

Yusen XIE, Zhenmin Huang, Jin Wu et al.

ICCV 2025arXiv:2410.17084

citations

#6437

What Makes for Text to 360-degree Panorama Generation with Stable Diffusion?

Jinhong Ni, Chang-Bin Zhang, Qiang Zhang et al.

ICCV 2025arXiv:2505.22129

citations

#6438

The emergence of sparse attention: impact of data distribution and benefits of repetition

Nicolas Zucchet, Francesco D'Angelo, Andrew Lampinen et al.

NEURIPS 2025oralarXiv:2505.17863

citations

#6439

MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices

HAILONG YAN, Ao Li, Xiangtao Zhang et al.

ICCV 2025arXiv:2507.01838

citations

#6440

Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization

Hao Ju, Shaofei Huang, Si Liu et al.

ICCV 2025arXiv:2411.13610

citations

#6441

ARIG: Autoregressive Interactive Head Generation for Real-time Conversations

Ying Guo, Xi Liu, Cheng Zhen et al.

ICCV 2025arXiv:2507.00472

citations

#6442

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

Yudong Liu, Jingwei Sun, Yueqian Lin et al.

ICCV 2025arXiv:2503.10742

citations

#6443

Vision-Language Models Can't See the Obvious

YASSER ABDELAZIZ DAHOU DJILALI, Ngoc Huynh, Phúc Lê Khắc et al.

ICCV 2025arXiv:2507.04741

citations

#6444

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Jungbin Cho, Junwan Kim, Jisoo Kim et al.

ICCV 2025highlightarXiv:2411.19527

citations

#6445

Semantic Watermarking Reinvented: Enhancing Robustness and Generation Quality with Fourier Integrity

Sung Ju Lee, Nam Ik Cho

ICCV 2025arXiv:2509.07647

citations

#6446

Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos

Rundong Luo, Matthew Wallingford, Ali Farhadi et al.

ICCV 2025arXiv:2504.07940

citations

#6447

REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents

Rui Tian, Qi Dai, Jianmin Bao et al.

ICCV 2025arXiv:2411.13552

citations

#6448

Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection

Romain Thoreau, Valerio Marsocci, Dawa Derksen

ICCV 2025arXiv:2503.09493

citations

#6449

Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

ICCV 2025highlightarXiv:2506.22802

citations

#6450

SITE: towards Spatial Intelligence Thorough Evaluation

Wenqi Wang, Reuben Tan, Pengyue Zhu et al.

ICCV 2025arXiv:2505.05456

citations

#6451

TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning

Siqi Luo, Haoran Yang, Yi Xin et al.

ICCV 2025arXiv:2507.22872

citations

#6452

SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing

Yingying Zhang, Lixiang Ru, Kang Wu et al.

ICCV 2025arXiv:2507.13812

citations

#6453

Federated Continual Instruction Tuning

Haiyang Guo, Fanhu Zeng, Fei Zhu et al.

ICCV 2025arXiv:2503.12897

citations

#6454

Event-based Tiny Object Detection: A Benchmark Dataset and Baselines

Nuo Chen, Chao Xiao, Yimian Dai et al.

ICCV 2025arXiv:2506.23575

citations

#6455

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025arXiv:2503.05156

citations

#6456

Object-centric Video Question Answering with Visual Grounding and Referring

Haochen Wang, Qirui Chen, Cilin Yan et al.

ICCV 2025arXiv:2507.19599

citations

#6457

StableCodec: Taming One-Step Diffusion for Extreme Image Compression

Tianyu Zhang, Xin Luo, Li Li et al.

ICCV 2025arXiv:2506.21977

citations

#6458

PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Runze He, bo cheng, Yuhang Ma et al.

ICCV 2025arXiv:2503.10127

citations

#6459

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

Mengchen Zhang, Tong Wu, Jing Tan et al.

ICCV 2025arXiv:2504.07083

citations

#6460

SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data

Xilin He, Cheng Luo, Xiaole Xian et al.

ICCV 2025arXiv:2410.09865

citations

#6461

Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Zichen Liu, Yihao Meng, Hao Ouyang et al.

ICCV 2025arXiv:2404.11614

citations

#6462

FlexGen: Flexible Multi-View Generation from Text and Image Inputs

Xinli Xu, Wenhang Ge, Jiantao Lin et al.

ICCV 2025arXiv:2410.10745

citations

#6463

Importance-Based Token Merging for Efficient Image and Video Generation

Haoyu Wu, Jingyi Xu, Hieu Le et al.

ICCV 2025arXiv:2411.16720

citations

#6464

Towards Real Unsupervised Anomaly Detection Via Confident Meta-Learning

Muhammad Aqeel, Shakiba Sharifi, Marco Cristani et al.

ICCV 2025arXiv:2508.02293

citations

#6465

CCMNet: Leveraging Calibrated Color Correction Matrices for Cross-Camera Color Constancy

Dongyoung Kim, Mahmoud Afifi, Dongyun Kim et al.

ICCV 2025arXiv:2504.07959

citations

#6466

FlowR: Flowing from Sparse to Dense 3D Reconstructions

Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.

ICCV 2025highlightarXiv:2504.01647

citations

#6467

Corvid: Improving Multimodal Large Language Models Towards Chain-of-Thought Reasoning

Jingjing Jiang, Chao Ma, Xurui Song et al.

ICCV 2025highlightarXiv:2507.07424

citations

#6468

RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

Yuxin Yao, Zhi Deng, Junhui Hou

CVPR 2025arXiv:2503.16822

citations

#6469

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

NEURIPS 2025arXiv:2504.09702

citations

#6470

Object-Shot Enhanced Grounding Network for Egocentric Video

Yisen Feng, Haoyu Zhang, Meng Liu et al.

CVPR 2025arXiv:2505.04270

citations

#6471

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025arXiv:2506.01317

citations

#6472

Robust Multimodal Survival Prediction with Conditional Latent Differentiation Variational AutoEncoder

Junjie Zhou, Jiao Tang, Yingli Zuo et al.

CVPR 2025

citations

#6473

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025arXiv:2312.04540

citations

#6474

COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training

Sanghwan Kim, Rui Xiao, Iuliana Georgescu et al.

CVPR 2025arXiv:2412.01814

citations

#6475

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

Tongda Xu, Jiahao Li, Bin Li et al.

CVPR 2025arXiv:2505.05853

citations

#6476

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NEURIPS 2025arXiv:2504.00010

citations

#6477

Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning

Haolin Pan, Hongyu Lin, Haoran Luo et al.

NEURIPS 2025arXiv:2506.15701

citations

#6478

Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation

Shuling Zhao, Fa-Ting Hong, Xiaoshui Huang et al.

CVPR 2025arXiv:2412.00719

citations

#6479

On the Relation between Rectified Flows and Optimal Transport

Johannes Hertrich, Antonin Chambolle, Julie Delon

NEURIPS 2025arXiv:2505.19712

citations

#6480

Straight-Line Diffusion Model for Efficient 3D Molecular Generation

Yuyan Ni, Shikun Feng, Haohan Chi et al.

NEURIPS 2025arXiv:2503.02918

citations

#6481

Temporal Alignment-Free Video Matching for Few-shot Action Recognition

SuBeen Lee, WonJun Moon, Hyun Seok Seong et al.

CVPR 2025arXiv:2504.05956

citations

#6482

Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models

Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.

NEURIPS 2025arXiv:2412.01784

citations

#6483

FreqDebias: Towards Generalizable Deepfake Detection via Consistency-Driven Frequency Debiasing

Hossein Kashiani, Niloufar Alipour Talemi, Fatemeh Afghah

CVPR 2025arXiv:2509.22412

citations

#6484

MMTL-UniAD: A Unified Framework for Multimodal and Multi-Task Learning in Assistive Driving Perception

Wenzhuo Liu, Wenshuo Wang, Yicheng Qiao et al.

CVPR 2025arXiv:2504.02264

citations

#6485

M3amba: Memory Mamba is All You Need for Whole Slide Image Classification

Tingting Zheng, Kui Jiang, Yi Xiao et al.

CVPR 2025

citations

#6486

Multitwine: Multi-Object Compositing with Text and Layout Control

Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.

CVPR 2025highlightarXiv:2502.05165

citations

#6487

PanDA: Towards Panoramic Depth Anything with Unlabeled Panoramas and Mobius Spatial Augmentation

Zidong Cao, Jinjing Zhu, Weiming Zhang et al.

CVPR 2025arXiv:2406.13378

citations

#6488

TCFG: Tangential Damping Classifier-free Guidance

Mingi Kwon, Shin seong Kim, Jaeseok Jeong et al.

CVPR 2025arXiv:2503.18137

citations

#6489

Navigating Image Restoration with VAR’s Distribution Alignment Prior

Siyang Wang, Naishan Zheng, Jie Huang et al.

CVPR 2025arXiv:2412.21063

citations

#6490

Real-IAD D³: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection

wenbing zhu, Lidong Wang, Ziqing Zhou et al.

CVPR 2025

citations

#6491

DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition

Caoshuo Li, Tanzhe Li, Xiaobin Hu et al.

CVPR 2025arXiv:2503.14867

citations

#6492

ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions

Tomas Soucek, Prajwal Gatti, Michael Wray et al.

CVPR 2025arXiv:2412.01987

citations

#6493

Shape it Up! Restoring LLM Safety during Finetuning

ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.

NEURIPS 2025arXiv:2505.17196

citations

#6494

FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution

Gene Chou, Wenqi Xian, Guandao Yang et al.

ICCV 2025highlightarXiv:2504.07093

citations

#6495

Scaffolding Dexterous Manipulation with Vision-Language Models

Vincent de Bakker, Joey Hejna, Tyler Lum et al.

NEURIPS 2025arXiv:2506.19212

citations

#6496

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025arXiv:2412.03752

citations

#6497

CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation

Reza Abbasi, Ali Nazari, Aminreza Sefid et al.

CVPR 2025arXiv:2502.19842

citations

#6498

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025arXiv:2503.15197

citations

#6499

Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions

Chan Hur, Jeong-hun Hong, Dong-hun Lee et al.

CVPR 2025arXiv:2503.05186

citations

#6500

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025arXiv:2412.13573

citations

#6501

Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning

Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.

NEURIPS 2025arXiv:2504.11409

citations

#6502

FlexOLMo: Open Language Models for Flexible Data Use

Weijia Shi, Akshita Bhagia, Kevin Farhat et al.

NEURIPS 2025spotlightarXiv:2507.07024

citations

#6503

Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild

Junhyeong Cho, Kim Youwang, Hunmin Yang et al.

CVPR 2025arXiv:2403.14539

citations

#6504

Generalizable, real-time neural decoding with hybrid state-space models

Avery Hee-Woon Ryoo, Nanda H Krishna, Ximeng Mao et al.

NEURIPS 2025arXiv:2506.05320

citations

#6505

SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction

ZaiPeng Duan, Xuzhong Hu, Pei An et al.

CVPR 2025arXiv:2507.17083

citations

#6506

Walking the Tightrope: Autonomous Disentangling Beneficial and Detrimental Drifts in Non-Stationary Custom-Tuning

Xiaoyu Yang, Jie Lu, En Yu

NEURIPS 2025oral

citations

#6507

Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation

Reza Qorbani, Gianluca Villani, Theodoros Panagiotakopoulos et al.

CVPR 2025arXiv:2503.21780

citations

#6508

Audio-Sync Video Generation with Multi-Stream Temporal Control

Shuchen Weng, Haojie Zheng, zheng chang et al.

NEURIPS 2025oralarXiv:2506.08003

citations

#6509

DLF: Extreme Image Compression with Dual-generative Latent Fusion

Naifu Xue, Zhaoyang Jia, Jiahao Li et al.

ICCV 2025highlightarXiv:2503.01428

citations

#6510

Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies

Yibo Wen, Chenwei Xu, Jerry Yao-Chieh Hu et al.

NEURIPS 2025arXiv:2412.20984

citations

#6511

A Stable Whitening Optimizer for Efficient Neural Network Training

Kevin Frans, Sergey Levine, Pieter Abbeel

NEURIPS 2025arXiv:2506.07254

citations

#6512

Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws

Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.

NEURIPS 2025spotlightarXiv:2504.09597

citations

#6513

GeoSVR: Taming Sparse Voxels for Geometrically Accurate Surface Reconstruction

Jiahe Li, Jiawei Zhang, Youmin Zhang et al.

NEURIPS 2025spotlightarXiv:2509.18090

citations

#6514

Estimating Model Performance Under Covariate Shift Without Labels

Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.

NEURIPS 2025arXiv:2401.08348

citations

#6515

Hearing Anywhere in Any Environment

Xiulong Liu, Anurag Kumar, Paul Calamia et al.

CVPR 2025arXiv:2504.10746

citations

#6516

One-Step Offline Distillation of Diffusion-based Models via Koopman Modeling

Nimrod Berman, Ilan Naiman, Moshe Eliasof et al.

NEURIPS 2025arXiv:2505.13358

citations

#6517

Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series

Ching Chang, Jeehyun Hwang, Yidan Shi et al.

NEURIPS 2025arXiv:2506.10412

citations

#6518

A Tale of Two Symmetries: Exploring the Loss Landscape of Equivariant Models

YuQing Xie, Tess Smidt

NEURIPS 2025arXiv:2506.02269

citations

#6519

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.

ICCV 2025arXiv:2506.07986

citations

#6520

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

Yudong Han, Qingpei Guo, Liyuan Pan et al.

CVPR 2025arXiv:2411.12355

citations

#6521

GuardSplat: Efficient and Robust Watermarking for 3D Gaussian Splatting

Zixuan Chen, Guangcong Wang, Jiahao Zhu et al.

CVPR 2025arXiv:2411.19895

citations

#6522

Normalized Attention Guidance: Universal Negative Guidance for Diffusion Models

Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou et al.

NEURIPS 2025arXiv:2505.21179

citations

#6523

Seeing the Arrow of Time in Large Multimodal Models

Zihui (Sherry) Xue, Romy Luo, Kristen Grauman

NEURIPS 2025oralarXiv:2506.03340

citations

#6524

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025arXiv:2503.01980

citations

#6525

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025highlightarXiv:2412.04464

citations

#6526

Keyframe-Guided Creative Video Inpainting

Yuwei Guo, Ceyuan Yang, Anyi Rao et al.

CVPR 2025

citations

#6527

Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

Yuchen Liang, Renxiang Huang, Lifeng LAI et al.

NEURIPS 2025arXiv:2506.02318

citations

#6528

CountLLM: Towards Generalizable Repetitive Action Counting via Large Language Model

Ziyu Yao, Xuxin Cheng, Zhiqi Huang et al.

CVPR 2025arXiv:2503.17690

citations

#6529

Parametric Point Cloud Completion for Polygonal Surface Reconstruction

Zhaiyu Chen, Yuqing Wang, Liangliang Nan et al.

CVPR 2025arXiv:2503.08363

citations

#6530

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

Yiyang Du, Xiaochen Wang, Chi Chen et al.

CVPR 2025arXiv:2503.23733

citations

#6531

POT: Prototypical Optimal Transport for Weakly Supervised Semantic Segmentation

Jian Wang, Tianhong Dai, Bingfeng Zhang et al.

CVPR 2025

citations

#6532

3D-MVP: 3D Multiview Pretraining for Manipulation

Shengyi Qian, Kaichun Mo, Valts Blukis et al.

CVPR 2025

citations

#6533

Diagnosing and Addressing Pitfalls in KG-RAG Datasets: Toward More Reliable Benchmarking

Liangliang Zhang, Zhuorui Jiang, Hongliang Chi et al.

NEURIPS 2025arXiv:2505.23495

citations

#6534

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025arXiv:2503.15185

citations

#6535

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities

Haoyu Zhao, Yihan Geng, Shange Tang et al.

NEURIPS 2025arXiv:2505.12680

citations

#6536

LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Jian Jin, Zhenbo Yu, Yang Shen et al.

CVPR 2025highlightarXiv:2503.06956

citations

#6537

AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs

Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.

ICCV 2025arXiv:2503.23219

citations

#6538

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Yusong Hu, Zichen Liang, Fei Yang et al.

CVPR 2025highlightarXiv:2503.21076

citations

#6539

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.

CVPR 2025highlightarXiv:2503.12127

citations

#6540

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

Yan Wang, Baoxiong Jia, Ziyu Zhu et al.

CVPR 2025arXiv:2504.19500

citations

#6541

Point Clouds Meets Physics: Dynamic Acoustic Field Fitting Network for Point Cloud Understanding

Changshuo Wang, Shuting He, Xiang Fang et al.

CVPR 2025

citations

#6542

High-Dimensional Calibration from Swap Regret

Maxwell Fishelson, Noah Golowich, Mehryar Mohri et al.

NEURIPS 2025oralarXiv:2505.21460

citations

#6543

Learning with Calibration: Exploring Test-Time Computing of Spatio-Temporal Forecasting

Wei Chen, Yuxuan Liang

NEURIPS 2025oralarXiv:2506.00635

citations

#6544

Visual Persona: Foundation Model for Full-Body Human Customization

Jisu Nam, Soowon Son, Zhan Xu et al.

CVPR 2025arXiv:2503.15406

citations

#6545

COME: Adding Scene-Centric Forecasting Control to Occupancy World Model

Yining Shi, Kun Jiang, Qiang Meng et al.

NEURIPS 2025oralarXiv:2506.13260

citations

#6546

Improved Representation Steering for Language Models

Zhengxuan Wu, Qinan Yu, Aryaman Arora et al.

NEURIPS 2025spotlightarXiv:2505.20809

citations

#6547

Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Fengfan Zhou, Bangjie Yin, Hefei Ling et al.

CVPR 2025arXiv:2411.15555

citations

#6548

A Comprehensive Study of Decoder-Only LLMs for Text-to-Image Generation

Andrew Z Wang, Songwei Ge, Tero Karras et al.

CVPR 2025arXiv:2506.08210

citations

#6549

HQ-CLIP: Leveraging Large Vision-Language Models to Create High-Quality Image-Text Datasets and CLIP Models

ZHIXIANG WEI, Guangting Wang, Xiaoxiao Ma et al.

ICCV 2025arXiv:2507.22431

citations

#6550

Fast Inference for Augmented Large Language Models

Rana Shahout, Cong Liang, Shiji Xin et al.

NEURIPS 2025arXiv:2410.18248

citations

#6551

Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization

Guanchen Li, Yixing Xu, Zeping Li et al.

NEURIPS 2025arXiv:2503.09657

citations

#6552

R-LiViT: A LiDAR-Visual-Thermal Dataset Enabling Vulnerable Road User Focused Roadside Perception

Jonas Mirlach, Lei Wan, Andreas Wiedholz et al.

ICCV 2025arXiv:2503.17122

citations

#6553

LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers

Yusuf Dalva, Hidir Yesiltepe, Pinar Yanardag

NEURIPS 2025spotlightarXiv:2505.23758

citations

#6554

MIRE: Matched Implicit Neural Representations

Dhananjaya Jayasundara, Heng Zhao, Demetrio Labate et al.

CVPR 2025

citations

#6555

U-Know-DiffPAN: An Uncertainty-aware Knowledge Distillation Diffusion Framework with Details Enhancement for PAN-Sharpening

Sungpyo Kim, Jeonghyeok Do, Jaehyup Lee et al.

CVPR 2025arXiv:2412.06243

citations

#6556

Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers

Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.

NEURIPS 2025

citations

#6557

Realistic Test-Time Adaptation of Vision-Language Models

Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer et al.

CVPR 2025highlightarXiv:2501.03729

citations

#6558

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Vishnu Sarukkai, Zhiqiang Xie, Kayvon Fatahalian

NEURIPS 2025arXiv:2505.00234

citations

#6559

MATCHA: Towards Matching Anything

Fei Xue, Sven Elflein, Laura Leal-Taixe et al.

CVPR 2025highlight

citations

#6560

Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning

Julian Minder, Clément Dumas, Caden Juang et al.

NEURIPS 2025arXiv:2504.02922

citations

#6561

Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training

Will Merrill, Shane Arora, Dirk Groeneveld et al.

NEURIPS 2025spotlightarXiv:2505.23971

citations

#6562

On scalable and efficient training of diffusion samplers

Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.

NEURIPS 2025arXiv:2505.19552

citations

#6563

T2V-OptJail: Discrete Prompt Optimization for Text-to-Video Jailbreak Attacks

Jiayang Liu, Siyuan Liang, Shiqian Zhao et al.

NEURIPS 2025arXiv:2505.06679

citations

#6564

Zero-Shot Image Restoration Using Few-Step Guidance of Consistency Models (and Beyond)

Tomer Garber, Tom Tirer

CVPR 2025arXiv:2412.20596

citations

#6565

CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

Wonseok Roh, Hwanhee Jung, JongWook Kim et al.

ICCV 2025arXiv:2412.12906

citations

#6566

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning

Zhi Jing, Siyuan Yang, Jicong Ao et al.

NEURIPS 2025arXiv:2507.00833

citations

#6567

GaussianVideo: Efficient Video Representation via Hierarchical Gaussian Splatting

Andrew Bond, Jui-Hsien Wang, Long Mai et al.

ICCV 2025arXiv:2501.04782

citations

#6568

MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems

Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.

NEURIPS 2025oralarXiv:2505.18943

citations

#6569

PEER Pressure: Model-to-Model Regularization for Single Source Domain Generalization

Dongkyu Cho, Inwoo Hwang, Sanghack Lee

CVPR 2025arXiv:2505.12745

citations

#6570

ZeroStereo: Zero-shot Stereo Matching from Single Images

Xianqi Wang, Hao Yang, Gangwei Xu et al.

ICCV 2025arXiv:2501.08654

citations

#6571

Unified Dense Prediction of Video Diffusion

Lehan Yang, Lu Qi, Xiangtai Li et al.

CVPR 2025arXiv:2503.09344

citations

#6572

ARKit LabelMaker: A New Scale for Indoor 3D Scene Understanding

Guangda Ji, Silvan Weder, Francis Engelmann et al.

CVPR 2025arXiv:2410.13924

citations

#6573

PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model

Mingju Gao, Yike Pan, Huan-ang Gao et al.

CVPR 2025arXiv:2503.19913

citations

#6574

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

Siwei Tu, Ben Fei, Weidong Yang et al.

CVPR 2025highlightarXiv:2502.07814

citations

#6575

Optimal Spectral Transitions in High-Dimensional Multi-Index Models

Leonardo Defilippis, Yatin Dandi, Pierre Mergny et al.

NEURIPS 2025arXiv:2502.02545

citations

#6576

Driving View Synthesis on Free-form Trajectories with Generative Prior

Zeyu Yang, Zijie Pan, Yuankun Yang et al.

ICCV 2025arXiv:2412.01717

citations

#6577

Uniform Generalization Bounds on Data-Dependent Hypothesis Sets via PAC-Bayesian Theory on Random Sets

Benjamin Dupuis, Paul Viallard, George Deligiannidis et al.

NEURIPS 2025arXiv:2404.17442

citations

#6578

Nonisotropic Gaussian Diffusion for Realistic 3D Human Motion Prediction

Cecilia Curreli, Dominik Muhle, Abhishek Saroha et al.

CVPR 2025arXiv:2501.06035

citations

#6579

Chat-based Person Retrieval via Dialogue-Refined Cross-Modal Alignment

Yang Bai, Yucheng Ji, Min Cao et al.

CVPR 2025

citations

#6580

AlphaPre: Amplitude-Phase Disentanglement Model for Precipitation Nowcasting

Kenghong Lin, Baoquan Zhang, Demin Yu et al.

CVPR 2025

citations

#6581

Distilling Parallel Gradients for Fast ODE Solvers of Diffusion Models

Beier Zhu, Ruoyu Wang, Tong Zhao et al.

ICCV 2025arXiv:2507.14797

citations

#6582

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

Weinan Jia, Mengqi Huang, Nan Chen et al.

CVPR 2025

citations

#6583

Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning

Marwa Abdulhai, Ryan Cheng, Donovan Clay et al.

NEURIPS 2025arXiv:2511.00222

citations

#6584

Learning to Think: Information-Theoretic Reinforcement Fine-Tuning for LLMs

Jingyao Wang, Wenwen Qiang, Zeen Song et al.

NEURIPS 2025arXiv:2505.10425

citations

#6585

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Hao Kang, Qingru Zhang, Han Cai et al.

NEURIPS 2025spotlightarXiv:2505.19481

citations

#6586

MIEB: Massive Image Embedding Benchmark

Chenghao Xiao, Isaac Chung, Imene Kerboua et al.

ICCV 2025arXiv:2504.10471

citations

#6587

AnyCalib: On-Manifold Learning for Model-Agnostic Single-View Camera Calibration

Javier Tirado-Garín, Javier Civera

ICCV 2025arXiv:2503.12701

citations

#6588

Time Series Generation Under Data Scarcity: A Unified Generative Modeling Approach

Tal Gonen, Itai Pemper, Ilan Naiman et al.

NEURIPS 2025oralarXiv:2505.20446

citations

#6589

Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking

Changlun Li, Yao SHI, Chen Wang et al.

NEURIPS 2025arXiv:2505.11065

citations

#6590

Token Perturbation Guidance for Diffusion Models

Javad Rajabi, Soroush Mehraban, Seyedmorteza Sadat et al.

NEURIPS 2025arXiv:2506.10036

citations

#6591

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation

Liliang Ren, Congcong Chen, Haoran Xu et al.

NEURIPS 2025arXiv:2507.06607

citations

#6592

MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking

Xinqi Liu, Li Zhou, Zikun Zhou et al.

CVPR 2025highlightarXiv:2411.15459

citations

#6593

The Hawthorne Effect in Reasoning Models: Evaluating and Steering Test Awareness

Sahar Abdelnabi, Ahmed Salem

NEURIPS 2025spotlightarXiv:2505.14617

citations

#6594

Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation

Shivam Duggal, Yushi Hu, Oscar Michel et al.

CVPR 2025arXiv:2504.18509

citations

#6595

Manipulating Feature Visualizations with Gradient Slingshots

Dilyara Bareeva, Marina Höhne, Alexander Warnecke et al.

NEURIPS 2025arXiv:2401.06122

citations

#6596

Effective Cloud Removal for Remote Sensing Images by an Improved Mean-Reverting Denoising Model with Elucidated Design Space

Yi Liu, Wengen Li, Jihong Guan et al.

CVPR 2025arXiv:2503.23717

citations

#6597

Improving Gaussian Splatting with Localized Points Management

Haosen Yang, Chenhao Zhang, Wenqing Wang et al.

CVPR 2025highlightarXiv:2406.04251

citations

#6598

DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection

Jaewoo Song, Daemin Park, Kanghyun Baek et al.

CVPR 2025highlightarXiv:2503.13985

citations

#6599

InteractionMap: Improving Online Vectorized HDMap Construction with Interaction

Kuang Wu, Chuan Yang, Zhanbin Li

CVPR 2025arXiv:2503.21659

citations

#6600

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

Haisheng Su, Feixiang Song, CONG MA et al.

CVPR 2025arXiv:2408.15503

citations

← Previous

1...31 32 33 34 35...112