Most Cited 2025 "reinforcement learning exploration" Papers

22,274 papers found • Page 21 of 112

#4001

UniCoTT: A Unified Framework for Structural Chain-of-Thought Distillation

Xianwei Zhuang, Zhihong Zhu, Zhichang Wang et al.

ICLR 2025 • poster
7 citations
#4002

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025 • poster • arXiv:2503.16707
7 citations
#4003

Predicting the Original Appearance of Damaged Historical Documents

Zhenhua Yang, Dezhi Peng, Yongxin Shi et al.

AAAI 2025 • paper • arXiv:2412.11634
7 citations
#4004

KinMo: Kinematic-aware Human Motion Understanding and Generation

Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.

ICCV 2025 • poster • arXiv:2411.15472
7 citations
#4005

Instruction-guided Multi-Granularity Segmentation and Captioning with Large Multimodal Model

Xu Yuan, Li Zhou, Zenghui Sun et al.

AAAI 2025 • paper • arXiv:2409.13407
7 citations
#4006

Selective Prompt Anchoring for Code Generation

Yuan Tian, Tianyi Zhang

ICML 2025 • poster • arXiv:2408.09121
7 citations
#4007

SMITE: Segment Me In TimE

Amirhossein Alimohammadi, Sauradip Nag, Saeid Asgari et al.

ICLR 2025 • poster • arXiv:2410.18538
7 citations
#4008

DanceFix: An Exploration in Group Dance Neatness Assessment Through Fixing Abnormal Challenges of Human Pose

Huangbiao Xu, Xiao Ke, Huanqi Wu et al.

AAAI 2025 • paper
7 citations
#4009

Effective and Efficient Time-Varying Counterfactual Prediction with State-Space Models

Haotian Wang, Haoxuan Li, Hao Zou et al.

ICLR 2025 • poster
7 citations
#4010

Glauber Generative Model: Discrete Diffusion Models via Binary Classification

Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam

ICLR 2025 • poster • arXiv:2405.17035
7 citations
#4011

Details Enhancement in Unsigned Distance Field Learning for High-fidelity 3D Surface Reconstruction

Cheng Xu, Fei Hou, Wencheng Wang et al.

AAAI 2025 • paper • arXiv:2406.00346
7 citations
#4012

Scene Map-based Prompt Tuning for Navigation Instruction Generation

Sheng Fan, Rui Liu, Wenguan Wang et al.

CVPR 2025 • poster
7 citations
#4013

Understanding Fairness Surrogate Functions in Algorithmic Fairness

Yong Liu, (Andrew) Zhanke Zhou, Zhicong Li et al.

ICLR 2025 • poster • arXiv:2310.11211
7 citations
#4014

Privacy amplification by random allocation

Moshe Shenfeld, Vitaly Feldman

NeurIPS 2025 • spotlight • arXiv:2502.08202
7 citations
#4015

What Do Latent Action Models Actually Learn?

Chuheng Zhang, Tim Pearce, Pushi Zhang et al.

NeurIPS 2025 • poster • arXiv:2506.15691
7 citations
#4016

Towards Robustness and Explainability of Automatic Algorithm Selection

Xingyu Wu, Jibin Wu, Yu Zhou et al.

ICML 2025 • spotlight
7 citations
#4017

BRAID: Input-driven Nonlinear Dynamical Modeling of Neural-Behavioral Data

Parsa Vahidi, Omid G. Sani, Maryam Shanechi

ICLR 2025 • oral • arXiv:2509.18627
7 citations
#4018

CVLUE: A New Benchmark Dataset for Chinese Vision-Language Understanding Evaluation

Yuxuan Wang, Yijun Liu, Fei Yu et al.

AAAI 2025 • paper • arXiv:2407.01081
7 citations
#4019

Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation

Adil Kaan Akan, Yucel Yemez

ICLR 2025 • poster • arXiv:2501.15878
7 citations
#4020

Alignment-Free RGB-T Salient Object Detection: A Large-Scale Dataset and Progressive Correlation Network

Kunpeng Wang, Keke Chen, Chenglong Li et al.

AAAI 2025 • paper • arXiv:2412.14576
7 citations
#4021

GS-LIVM: Real-Time Photo-Realistic LiDAR-Inertial-Visual Mapping with Gaussian Splatting

Yusen Xie, Zhenmin Huang, Jin Wu et al.

ICCV 2025 • poster • arXiv:2410.17084
7 citations
#4022

Privacy Attacks on Image AutoRegressive Models

Antoni Kowalczuk, Jan Dubiński, Franziska Boenisch et al.

ICML 2025 • poster • arXiv:2502.02514
7 citations
#4023

PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

Yilong Li, Jingyu Liu, Hao Zhang et al.

ICLR 2025 • poster • arXiv:2410.05315
7 citations
#4024

Doubly Robust Conformalized Survival Analysis with Right-Censored Data

Matteo Sesia, Vladimir Svetnik

ICML 2025 • spotlight • arXiv:2412.09729
7 citations
#4025

CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning

Jiangpeng He, Zhihao Duan, Fengqing Zhu

CVPR 2025 • poster • arXiv:2505.24816
7 citations
#4026

Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation

Xie Tianyidan, Rui Ma, Qian Wang et al.

AAAI 2025 • paper • arXiv:2404.18598
7 citations
#4027

LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields

Zhengqin Li, Dilin Wang, Ka Chen et al.

CVPR 2025 • poster • arXiv:2504.20026
7 citations
#4028

EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark

Ming Li, Jike Zhong, Tianle Chen et al.

CVPR 2025 • poster • arXiv:2411.01492
7 citations
#4029

Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion

Zexin He, Tengfei Wang, Xin Huang et al.

CVPR 2025 • poster • arXiv:2412.09593
7 citations
#4030

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.

CVPR 2025 • poster • arXiv:2507.06928
7 citations
#4031

Learning Safety Constraints for Large Language Models

Xin Chen, Yarden As, Andreas Krause

ICML 2025 • spotlight • arXiv:2505.24445
7 citations
#4032

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis

Ulyana Piterbarg, Lerrel Pinto, Rob Fergus

ICLR 2025 • poster • arXiv:2410.02749
7 citations
#4033

ARIG: Autoregressive Interactive Head Generation for Real-time Conversations

Ying Guo, Xi Liu, Cheng Zhen et al.

ICCV 2025 • poster • arXiv:2507.00472
7 citations
#4034

Contextual AD Narration with Interleaved Multimodal Sequence

Hanlin Wang, Zhan Tong, Kecheng Zheng et al.

CVPR 2025 • poster • arXiv:2403.12922
7 citations
#4035

Beyond Human Data: Aligning Multimodal Large Language Models by Iterative Self-Evolution

Wentao Tan, Qiong Cao, Yibing Zhan et al.

AAAI 2025 • paper • arXiv:2412.15650
7 citations
#4036

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

Yudong Liu, Jingwei Sun, Yueqian Lin et al.

ICCV 2025 • poster • arXiv:2503.10742
7 citations
#4037

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Yuze He, Yanning Zhou, Wang Zhao et al.

CVPR 2025 • poster • arXiv:2411.05738
7 citations
#4038

Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment

Yaling Shen, Zhixiong Zhuang, Kun Yuan et al.

AAAI 2025 • paper • arXiv:2502.02438
7 citations
#4039

Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data

David Heurtel-Depeiges, Anian Ruoss, Joel Veness et al.

ICML 2025 • poster • arXiv:2410.05078
7 citations
#4040

CrossOver: 3D Scene Cross-Modal Alignment

Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys et al.

CVPR 2025 • highlight • arXiv:2502.15011
7 citations
#4041

A Simple yet Effective Layout Token in Large Language Models for Document Understanding

Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.

CVPR 2025 • poster • arXiv:2503.18434
7 citations
#4042

Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Zihan Yu, Jingtao Ding, Yong Li et al.

ICLR 2025 • poster • arXiv:2411.03753
7 citations
#4043

GFlowVLM: Enhancing Multi-step Reasoning in Vision-Language Models with Generative Flow Networks

Haoqiang Kang, Enna Sachdeva, Piyush Gupta et al.

CVPR 2025 • poster • arXiv:2503.06514
7 citations
#4044

Impossible Videos

Zechen Bai, Hai Ci, Mike Zheng Shou

ICML 2025 • oral • arXiv:2503.14378
7 citations
#4045

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Yanghao Wang, Long Chen

CVPR 2025 • poster • arXiv:2408.16266
7 citations
#4046

Second Order Bounds for Contextual Bandits with Function Approximation

Aldo Pacchiano

ICLR 2025 • poster • arXiv:2409.16197
7 citations
#4047

AnyDressing: Customizable Multi-Garment Virtual Dressing via Latent Diffusion Models

Xinghui Li, Qichao Sun, Pengze Zhang et al.

CVPR 2025 • poster • arXiv:2412.04146
7 citations
#4048

PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

Namgyu Kang, Jaemin Oh, Youngjoon Hong et al.

ICLR 2025 • poster • arXiv:2412.05994
7 citations
#4049

Segment Any 3D Object with Language

Seungjun Lee, Yuyang Zhao, Gim H Lee

ICLR 2025 • poster • arXiv:2404.02157
7 citations
#4050

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

Liang Chen, Sinan Tan, Zefan Cai et al.

ICLR 2025 • poster • arXiv:2410.01912
7 citations
#4051

Beyond the Frame: Generating 360° Panoramic Videos from Perspective Videos

Rundong Luo, Matthew Wallingford, Ali Farhadi et al.

ICCV 2025 • poster • arXiv:2504.07940
7 citations
#4052

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NeurIPS 2025 • oral • arXiv:2509.25040
7 citations
#4053

SwiftTry: Fast and Consistent Video Virtual Try-On with Diffusion Models

Hung Nguyen, Quang Qui-Vinh Nguyen, Khoi Nguyen et al.

AAAI 2025 • paper • arXiv:2412.10178
7 citations
#4054

Loss Functions and Operators Generated by f-Divergences

Vincent Roulet, Tianlin Liu, Nino Vieillard et al.

ICML 2025 • poster • arXiv:2501.18537
7 citations
#4055

ESE: Espresso Sentence Embeddings

Xianming Li, Zongxi Li, Jing Li et al.

ICLR 2025 • poster
7 citations
#4056

Motion-aware Contrastive Learning for Temporal Panoptic Scene Graph Generation

Thong Thanh Nguyen, Xiaobao Wu, Yi Bin et al.

AAAI 2025 • paper • arXiv:2412.07160
7 citations
#4057

Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model

Shengjun Zhang, Jinzhao Li, Xin Fei et al.

CVPR 2025 • poster • arXiv:2504.02764
7 citations
#4058

Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics

Lee Chae-Yeon, Oh Hyun-Bin, Han EunGi et al.

CVPR 2025 • highlight • arXiv:2503.20308
7 citations
#4059

Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations

Richard Bergna, Sergio Calvo Ordoñez, Felix Opolka et al.

ICLR 2025 • poster • arXiv:2408.16115
7 citations
#4060

Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training

Qiaosi Yi, Shuai Li, Rongyuan Wu et al.

ICCV 2025 • highlight • arXiv:2507.20291
7 citations
#4061

GenesisTex2: Stable, Consistent and High-Quality Text-to-Texture Generation

Jiawei Lu, YingPeng Zhang, Zengjun Zhao et al.

AAAI 2025 • paper • arXiv:2409.18401
7 citations
#4062

Emergence and Evolution of Interpretable Concepts in Diffusion Models

Berk Tinaz, Zalan Fabian, Mahdi Soltanolkotabi

NeurIPS 2025 • spotlight • arXiv:2504.15473
7 citations
#4063

ChatHuman: Chatting about 3D Humans with Tools

Jing Lin, Yao Feng, Weiyang Liu et al.

CVPR 2025 • poster • arXiv:2405.04533
7 citations
#4064

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Chengyou Jia, Changliang Xia, Zhuohang Dang et al.

CVPR 2025 • poster • arXiv:2411.17176
7 citations
#4065

FreSh: Frequency Shifting for Accelerated Neural Representation Learning

Adam Kania, Marko Mihajlovic, Sergey Prokudin et al.

ICLR 2025 • poster • arXiv:2410.05050
7 citations
#4066

Scaling Laws for Task-Optimized Models of the Primate Visual Ventral Stream

Abdulkadir Gokce, Martin Schrimpf

ICML 2025 • oral • arXiv:2411.05712
7 citations
#4067

Overcoming the Curse of Dimensionality in Reinforcement Learning Through Approximate Factorization

Chenbei Lu, Laixi Shi, Zaiwei Chen et al.

ICML 2025 • poster • arXiv:2411.07591
7 citations
#4068

HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes

Xin Lin, Shi Luo, Xiaojun Shan et al.

ICLR 2025 • poster
7 citations
#4069

Perception in Reflection

Yana Wei, Liang Zhao, Kangheng Lin et al.

ICML 2025 • poster • arXiv:2504.07165
7 citations
#4070

Arbitrary Reading Order Scene Text Spotter with Local Semantics Guidance

Jiahao Lyu, Wei Wang, Dongbao Yang et al.

AAAI 2025 • paper • arXiv:2412.10159
7 citations
#4071

Robust and Conjugate Spatio-Temporal Gaussian Processes

William Laplante, Matias Altamirano, Andrew Duncan et al.

ICML 2025 • oral • arXiv:2502.02450
7 citations
#4072

Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map

Xinyuan Chang, Maixuan Xue, Xinran Liu et al.

CVPR 2025 • highlight • arXiv:2410.23780
7 citations
#4073

SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes

Yuji Wang, Haoran Xu, Yong Liu et al.

CVPR 2025 • poster • arXiv:2506.01558
7 citations
#4074

Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy

Jie Ren, Zhenwei Dai, Xianfeng Tang et al.

NeurIPS 2025 • poster • arXiv:2506.00359
7 citations
#4075

Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion

Vitor Guizilini, Muhammad Zubair Irshad, Dian Chen et al.

CVPR 2025 • poster • arXiv:2501.18804
7 citations
#4076

LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living

Dominick Reilly, Rajatsubhra Chakraborty, Arkaprava Sinha et al.

CVPR 2025 • poster • arXiv:2406.09390
7 citations
#4077

RobustMerge: Parameter-Efficient Model Merging for MLLMs with Direction Robustness

Fanhu Zeng, Haiyang Guo, Fei Zhu et al.

NeurIPS 2025 • spotlight • arXiv:2502.17159
7 citations
#4078

Pursuing Feature Separation based on Neural Collapse for Out-of-Distribution Detection

Yingwen Wu, Ruiji Yu, Xinwen Cheng et al.

ICLR 2025 • poster • arXiv:2405.17816
7 citations
#4079

Ultra-Resolution Adaptation with Ease

Ruonan Yu, Songhua Liu, Zhenxiong Tan et al.

ICML 2025 • poster • arXiv:2503.16322
7 citations
#4080

TR-PTS: Task-Relevant Parameter and Token Selection for Efficient Tuning

Siqi Luo, Haoran Yang, Yi Xin et al.

ICCV 2025 • poster • arXiv:2507.22872
7 citations
#4081

SkySense V2: A Unified Foundation Model for Multi-modal Remote Sensing

Yingying Zhang, Lixiang Ru, Kang Wu et al.

ICCV 2025 • poster • arXiv:2507.13812
7 citations
#4082

Noise Modeling in One Hour: Minimizing Preparation Efforts for Self-supervised Low-Light RAW Image Denoising

Feiran Li, Haiyang Jiang, Daisuke Iso

CVPR 2025 • poster • arXiv:2505.00045
7 citations
#4083

SEAL: Semantic Attention Learning for Long Video Representation

Lan Wang, Yujia Chen, Wen-Sheng Chu et al.

CVPR 2025 • poster • arXiv:2412.01798
7 citations
#4084

POp-GS: Next Best View in 3D-Gaussian Splatting with P-Optimality

Joey Wilson, Marcelino M. de Almeida, Sachit Mahajan et al.

CVPR 2025 • poster • arXiv:2503.07819
7 citations
#4085

COLUMBUS: Evaluating COgnitive Lateral Understanding Through Multiple-Choice reBUSes

Koen Kraaijveld, Yifan Jiang, Kaixin Ma et al.

AAAI 2025 • paper • arXiv:2409.04053
7 citations
#4086

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

Shuoyan Wei, Feng Li, Shengeng Tang et al.

CVPR 2025 • highlight • arXiv:2505.04657
7 citations
#4087

PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

Pablo Lemos, Sammy Sharief, Nikolay Malkin et al.

ICLR 2025 • poster • arXiv:2402.04355
7 citations
#4088

SEMU: Singular Value Decomposition for Efficient Machine Unlearning

Marcin Sendera, Łukasz Struski, Kamil Książek et al.

ICML 2025 • poster • arXiv:2502.07587
7 citations
#4089

Value-Guided Search for Efficient Chain-of-Thought Reasoning

Kaiwen Wang, Jin Zhou, Jonathan Chang et al.

NeurIPS 2025 • poster • arXiv:2505.17373
7 citations
#4090

CausalRivers - Scaling up benchmarking of causal discovery for real-world time-series

Gideon Stein, Maha Shadaydeh, Jan Blunk et al.

ICLR 2025 • oral • arXiv:2503.17452
7 citations
#4091

Generating Freeform Endoskeletal Robots

Muhan Li, Lingji Kong, Sam Kriegman

ICLR 2025 • poster • arXiv:2412.01036
7 citations
#4092

NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI

Cosmin Bercea, Jun Li, Philipp Raffler et al.

NeurIPS 2025 • oral
7 citations
#4093

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

Junjie Wang, Bin Chen, Yulin Li et al.

CVPR 2025 • poster • arXiv:2505.04410
7 citations
#4094

CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension

Rui Li, Zeyu Zhang, Xiaohe Bo et al.

NeurIPS 2025 • poster • arXiv:2510.05520
7 citations
#4095

Doubly Contrastive Learning for Source-Free Domain Adaptive Person Search

Yizhen Jia, Rong Quan, Yue Feng et al.

AAAI 2025 • paper
7 citations
#4096

DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers

Li Ren, Chen Chen, Liqiang Wang et al.

CVPR 2025 • poster • arXiv:2505.23694
7 citations
#4097

Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Zichen Liu, Yihao Meng, Hao Ouyang et al.

ICCV 2025 • poster • arXiv:2404.11614
7 citations
#4098

3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations

Yating Wang, Xuan Wang, Ran Yi et al.

CVPR 2025 • poster • arXiv:2504.14967
7 citations
#4099

VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models

Chi-Pin Huang, Yen-Siang Wu, Hung-Kai Chung et al.

CVPR 2025 • poster • arXiv:2503.21781
7 citations
#4100

PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer

Pierre-David Letourneau, Manish Singh, Hsin-Pai Cheng et al.

ICLR 2025 • poster • arXiv:2407.11306
7 citations
#4101

HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene

Jianing Chen, Zehao Li, Yujun Cai et al.

NeurIPS 2025 • oral • arXiv:2506.09518
7 citations
#4102

DuMo: Dual Encoder Modulation Network for Precise Concept Erasure

Feng Han, Kai Chen, Chao Gong et al.

AAAI 2025 • paper • arXiv:2501.01125
7 citations
#4103

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

Michal Nauman, Marek Cygan, Carmelo Sferrazza et al.

NeurIPS 2025 • oral • arXiv:2505.23150
7 citations
#4104

NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains

Wonje Choi, Jinwoo Park, Sanghyun Ahn et al.

ICLR 2025 • poster • arXiv:2503.00870
7 citations
#4105

Extrapolated Urban View Synthesis Benchmark

Xiangyu Han, Zhen Jia, Boyi Li et al.

ICCV 2025 • poster • arXiv:2412.05256
7 citations
#4106

GauSTAR: Gaussian Surface Tracking and Reconstruction

Chengwei Zheng, Lixin Xue, Juan Jose Zarate et al.

CVPR 2025 • poster • arXiv:2501.10283
7 citations
#4107

Multi-Perspective Data Augmentation for Few-shot Object Detection

Anh-Khoa Nguyen Vu, Quoc Truong Truong, Vinh-Tiep Nguyen et al.

ICLR 2025 • poster • arXiv:2502.18195
7 citations
#4108

Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

Chengyue Huang, Junjiao Tian, Brisa Maneechotesuwan et al.

ICLR 2025 • poster • arXiv:2502.15895
7 citations
#4109

Robustness Auditing for Linear Regression: To Singularity and Beyond

Ittai Rubinstein, Samuel Hopkins

ICLR 2025 • poster • arXiv:2410.07916
7 citations
#4110

CTSyn: A Foundation Model for Cross Tabular Data Generation

Xiaofeng Lin, Chenheng Xu, Matthew Yang et al.

ICLR 2025 • poster • arXiv:2406.04619
7 citations
#4111

MAR-3D: Progressive Masked Auto-regressor for High-Resolution 3D Generation

Jinnan Chen, Lingting Zhu, Zeyu Hu et al.

CVPR 2025 • highlight • arXiv:2503.20519
7 citations
#4112

Exploring Historical Information for RGBE Visual Tracking with Mamba

Chuanyu Sun, Jiqing Zhang, Yang Wang et al.

CVPR 2025 • poster
7 citations
#4113

Triples as the Key: Structuring Makes Decomposition and Verification Easier in LLM-based TableQA

Zhen Yang, Ziwei Du, Minghan Zhang et al.

ICLR 2025 • poster
7 citations
#4114

FlowR: Flowing from Sparse to Dense 3D Reconstructions

Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.

ICCV 2025 • highlight • arXiv:2504.01647
7 citations
#4115

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

Zhengrui Guo, Conghao Xiong, Jiabo Ma et al.

CVPR 2025 • poster • arXiv:2411.14743
7 citations
#4116

GaussRender: Learning 3D Occupancy with Gaussian Rendering

Loick Chambon, Eloi Zablocki, Alexandre Boulch et al.

ICCV 2025 • poster • arXiv:2502.05040
7 citations
#4117

AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi

ICML 2025 • poster • arXiv:2411.17284
7 citations
#4118

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.

NeurIPS 2025 • poster • arXiv:2505.24878
7 citations
#4119

VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

Hyojun Go, Byeongjun Park, Hyelin Nam et al.

ICCV 2025 • poster • arXiv:2503.15855
7 citations
#4120

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Hyojin Bahng, Caroline Chan, Fredo Durand et al.

ICCV 2025 • poster • arXiv:2506.02095
7 citations
#4121

ActionReasoningBench: Reasoning about Actions with and without Ramification Constraints

Divij Handa, Pavel Dolin, Shrinidhi Kumbhar et al.

ICLR 2025 • poster • arXiv:2406.04046
7 citations
#4122

Position: We Need An Algorithmic Understanding of Generative AI

Oliver Eberle, Thomas McGee, Hamza Giaffar et al.

ICML 2025 • spotlight • arXiv:2507.07544
7 citations
#4123

Lifelong Safety Alignment for Language Models

Haoyu Wang, Yifei Zhao, Zeyu Qin et al.

NeurIPS 2025 • poster • arXiv:2505.20259
7 citations
#4124

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Zhanyi Sun, Shuran Song

NeurIPS 2025 • spotlight • arXiv:2508.05941
7 citations
#4125

Out of Length Text Recognition with Sub-String Matching

Yongkun Du, Zhineng Chen, Caiyan Jia et al.

AAAI 2025 • paper • arXiv:2407.12317
7 citations
#4126

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Xiang Xu, Lingdong Kong, Song Wang et al.

ICCV 2025 • poster • arXiv:2507.05260
7 citations
#4127

Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data

Yunhao Tang, Sid Wang, Lovish Madaan et al.

NeurIPS 2025 • poster • arXiv:2503.19618
7 citations
#4128

Data-Juicer Sandbox: A Feedback-Driven Suite for Multimodal Data-Model Co-development

Daoyuan Chen, Haibin Wang, Yilun Huang et al.

ICML 2025 • spotlight • arXiv:2407.11784
7 citations
#4129

RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors

Avinash Paliwal, Xilong Zhou, Wei Ye et al.

ICCV 2025 • poster • arXiv:2503.10860
7 citations
#4130

Generative RLHF-V: Learning Principles from Multi-modal Human Preference

Jiayi Zhou, Jiaming Ji, Boyuan Chen et al.

NeurIPS 2025 • poster • arXiv:2505.18531
7 citations
#4131

Straight-Line Diffusion Model for Efficient 3D Molecular Generation

Yuyan Ni, Shikun Feng, Haohan Chi et al.

NeurIPS 2025 • poster • arXiv:2503.02918
7 citations
#4132

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025 • poster • arXiv:2412.18928
7 citations
#4133

Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

Tuomas Oikarinen, Ge Yan, Lily Weng

ICML 2025 • poster • arXiv:2506.05774
7 citations
#4134

Vision-Language Models Can't See the Obvious

Yasser Abdelaziz Dahou Djilali, Ngoc Huynh, Phúc Lê Khắc et al.

ICCV 2025 • poster • arXiv:2507.04741
7 citations
#4135

MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines

Dongzhi Jiang, Renrui Zhang, Ziyu Guo et al.

ICLR 2025 • poster
7 citations
#4136

Training-Free Constrained Generation With Stable Diffusion Models

Stefano Zampini, Jacob K Christopher, Luca Oneto et al.

NeurIPS 2025 • spotlight • arXiv:2502.05625
7 citations
#4137

DISCO: learning to DISCover an evolution Operator for multi-physics-agnostic prediction

Rudy Morel, Jiequn Han, Edouard Oyallon

ICML 2025 • oral • arXiv:2504.19496
7 citations
#4138

Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration

Max Wilcoxson, Qiyang Li, Kevin Frans et al.

ICML 2025 • poster • arXiv:2410.18076
7 citations
#4139

H-MBA: Hierarchical MamBa Adaptation for Multi-Modal Video Understanding in Autonomous Driving

Siran Chen, Yuxiao Luo, Yue Ma et al.

AAAI 2025 • paper • arXiv:2501.04302
6 citations
#4140

Flowing Datasets with Wasserstein over Wasserstein Gradient Flows

Clément Bonet, Christophe Vauthier, Anna Korba

ICML 2025 • oral • arXiv:2506.07534
6 citations
#4141

BrainACTIV: Identifying visuo-semantic properties driving cortical selectivity using diffusion-based image manipulation

Diego García Cerdas, Christina Sartzetaki, Magnus Petersen et al.

ICLR 2025 • poster
6 citations
#4142

Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?

Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen et al.

ICML 2025 • poster • arXiv:2502.08991
6 citations
#4143

MATCHA: Towards Matching Anything

Fei Xue, Sven Elflein, Laura Leal-Taixe et al.

CVPR 2025 • highlight
6 citations
#4144

On the Relation between Rectified Flows and Optimal Transport

Johannes Hertrich, Antonin Chambolle, Julie Delon

NeurIPS 2025 • poster • arXiv:2505.19712
6 citations
#4145

Text2Relight: Creative Portrait Relighting with Text Guidance

Junuk Cha, Mengwei Ren, Krishna Kumar Singh et al.

AAAI 2025 • paper • arXiv:2412.13734
6 citations
#4146

MambaVLT: Time-Evolving Multimodal State Space Model for Vision-Language Tracking

Xinqi Liu, Li Zhou, Zikun Zhou et al.

CVPR 2025 • highlight • arXiv:2411.15459
6 citations
#4147

WaterDiffusion: Learning a Prior-involved Unrolling Diffusion for Joint Underwater Saliency Detection and Visual Restoration

Laibin Chang, Yunke Wang, Longxiang Deng et al.

AAAI 2025 • paper
6 citations
#4148

DMWM: Dual-Mind World Model with Long-Term Imagination

Lingyi Wang, Rashed Shelim, Walid Saad et al.

NeurIPS 2025 • spotlight • arXiv:2502.07591
6 citations
#4149

ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs

Hongyi Liu, Rajarshi Saha, Zhen Jia et al.

ICML 2025 • poster • arXiv:2502.00258
6 citations
#4150

ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary

Zeqi Gu, Yin Cui, Max Li et al.

CVPR 2025 • poster • arXiv:2506.00742
6 citations
#4151

Intrinsic Dimension Correlation: uncovering nonlinear connections in multimodal representations

Lorenzo Basile, Santiago Acevedo, Luca Bortolussi et al.

ICLR 2025 • poster • arXiv:2406.15812
6 citations
#4152

Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models

Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.

NeurIPS 2025 • poster • arXiv:2412.01784
6 citations
#4153

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NeurIPS 2025 • poster • arXiv:2506.01317
6 citations
#4154

ProtoArgNet: Interpretable Image Classification with Super-Prototypes and Argumentation

Hamed Ayoobi, Nico Potyka, Francesca Toni

AAAI 2025 • paper • arXiv:2311.15438
6 citations
#4155

Understanding and Mitigating Memorization in Diffusion Models for Tabular Data

Zhengyu Fang, Zhimeng Jiang, Huiyuan Chen et al.

ICML 2025 • poster • arXiv:2412.11044
6 citations
#4156

Dynamical Diffusion: Learning Temporal Dynamics with Diffusion Models

Xingzhuo Guo, Yu Zhang, Baixu Chen et al.

ICLR 2025 • oral • arXiv:2503.00951
6 citations
#4157

DEALing with Image Reconstruction: Deep Attentive Least Squares

Mehrsa Pourya, Erich Kobler, Michael Unser et al.

ICML 2025 • poster • arXiv:2502.04079
6 citations
#4158

DexGarmentLab: Dexterous Garment Manipulation Environment with Generalizable Policy

Yuran Wang, Ruihai Wu, Yue Chen et al.

NeurIPS 2025 • spotlight • arXiv:2505.11032
6 citations
#4159

Exploit Your Latents: Coarse-Grained Protein Backmapping with Latent Diffusion Models

Rongchao Zhang, Yu Huang, Yiwei Lou et al.

AAAI 2025 • paper
6 citations
#4160

ELICIT: LLM Augmentation Via External In-context Capability

Futing Wang, Jianhao (Elliott) Yan, Yue Zhang et al.

ICLR 2025 • poster • arXiv:2410.09343
6 citations
#4161

UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting

Ziyi Wang, Yanran Zhang, Jie Zhou et al.

CVPR 2025 • poster • arXiv:2506.09952
6 citations
#4162

Understanding the Limits of Deep Tabular Methods with Temporal Shift

Haorun Cai, Han-Jia Ye

ICML 2025 • oral • arXiv:2502.20260
6 citations
#4163

Always Skip Attention

Yiping Ji, Hemanth Saratchandran, Peyman Moghadam et al.

ICCV 2025 • poster • arXiv:2505.01996
6 citations
#4164

Tracing the Representation Geometry of Language Models from Pretraining to Post-training

Melody Li, Kumar Krishna Agrawal, Arna Ghosh et al.

NeurIPS 2025 • poster • arXiv:2509.23024
6 citations
#4165

The Persistence of Neural Collapse Despite Low-Rank Bias

Connall Garrod, Jonathan Keating

NeurIPS 2025 • poster • arXiv:2410.23169
6 citations
#4166

Mask in the Mirror: Implicit Sparsification

Tom Jacobs, Rebekka Burkholz

ICLR 2025 • poster • arXiv:2408.09966
6 citations
#4167

PhysAug: A Physical-guided and Frequency-based Data Augmentation for Single-Domain Generalized Object Detection

Xiaoran Xu, Jiangang Yang, Wenhui Shi et al.

AAAI 2025 • paper • arXiv:2412.11807
6 citations
#4168

Linear combinations of latents in generative models: subspaces and beyond

Erik Bodin, Alexandru Stere, Dragos Margineantu et al.

ICLR 2025 • poster • arXiv:2408.08558
6 citations
#4169

DualCP: Rehearsal-Free Domain-Incremental Learning via Dual-Level Concept Prototype

Qiang Wang, Yuhang He, Songlin Dong et al.

AAAI 2025 • paper • arXiv:2503.18042
6 citations
#4170

Point Clouds Meets Physics: Dynamic Acoustic Field Fitting Network for Point Cloud Understanding

Changshuo Wang, Shuting He, Xiang Fang et al.

CVPR 2025 • poster
6 citations
#4171

Are Expressive Models Truly Necessary for Offline RL?

Guan Wang, Haoyi Niu, Jianxiong Li et al.

AAAI 2025 • paper • arXiv:2412.11253
6 citations
#4172

SUMI-IFL: An Information-Theoretic Framework for Image Forgery Localization with Sufficiency and Minimality Constraints

Ziqi Sheng, Wei Lu, Xiangyang Luo et al.

AAAI 2025 • paper • arXiv:2412.09981
6 citations
#4173

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning

Zhi Jing, Siyuan Yang, Jicong Ao et al.

NeurIPS 2025 • poster • arXiv:2507.00833
6 citations
#4174

MetaMind: Modeling Human Social Thoughts with Metacognitive Multi-Agent Systems

Xuanming Zhang, Yuxuan Chen, Samuel (Min-Hsuan) Yeh et al.

NeurIPS 2025 • oral • arXiv:2505.18943
6 citations
#4175

CSformer: Combining Channel Independence and Mixing for Robust Multivariate Time Series Forecasting

Haoxin Wang, Yipeng Mo, Kunlan Xiang et al.

AAAI 2025 • paper • arXiv:2312.06220
6 citations
#4176

Runtime Analysis for Multi-Objective Evolutionary Algorithms in Unbounded Integer Spaces

Benjamin Doerr, Martin S. Krejca, Günter Rudolph

AAAI 2025 • paper • arXiv:2412.11684
6 citations
#4177

Bi-level Contrastive Learning for Knowledge-Enhanced Molecule Representations

Pengcheng Jiang, Cao Xiao, Tianfan Fu et al.

AAAI 2025 • paper • arXiv:2306.01631
6 citations
#4178

Týr-the-Pruner: Structural Pruning LLMs via Global Sparsity Distribution Optimization

Guanchen Li, Yixing Xu, Zeping Li et al.

NeurIPS 2025 • poster • arXiv:2503.09657
6 citations
#4179

Locally Convex Global Loss Network for Decision-Focused Learning

Haeun Jeon, Hyunglip Bae, Minsu Park et al.

AAAI 2025 • paper • arXiv:2403.01875
6 citations
#4180

LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Jian Jin, Zhenbo Yu, Yang Shen et al.

CVPR 2025 • highlight • arXiv:2503.06956
6 citations
#4181

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Jingjing Hu, Dan Guo, Zhan Si et al.

AAAI 2025 • paper • arXiv:2412.16483
6 citations
#4182

A Unified Model for Compressed Sensing MRI Across Undersampling Patterns

Armeet Singh Jatyani, Jiayun Wang, Aditi Chandrashekar et al.

CVPR 2025 • poster • arXiv:2410.16290
6 citations
#4183

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Yunheng Li, Jing Cheng, Shaoyong Jia et al.

NeurIPS 2025 • oral • arXiv:2509.18056
6 citations
#4184

GIViC: Generative Implicit Video Compression

Ge Gao, Siyue Teng, Tianhao Peng et al.

ICCV 2025 • poster • arXiv:2503.19604
6 citations
#4185

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Han Lin, Jaemin Cho, Amir Zadeh et al.

NeurIPS 2025 • poster • arXiv:2508.05954
6 citations
#4186

Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning

Yang You, Yixin Li, Congyue Deng et al.

ICLR 2025 • poster • arXiv:2411.19458
6 citations
#4187

Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

Runchuan Zhu, Zhipeng Ma, Jiang Wu et al.

AAAI 2025 • paper • arXiv:2410.06913
6 citations
#4188

Dense SAE Latents Are Features, Not Bugs

Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.

NeurIPS 2025 • poster • arXiv:2506.15679
6 citations
#4189

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Xuran Ma, Yexin Liu, Yaofu Liu et al.

ICCV 2025 • poster • arXiv:2504.03140
6 citations
#4190

TaskGalaxy: Scaling Multi-modal Instruction Fine-tuning with Tens of Thousands Vision Task Types

Jiankang Chen, Tianke Zhang, Changyi Liu et al.

ICLR 2025 • poster • arXiv:2502.09925
6 citations
#4191

Reconstructing Humans with a Biomechanically Accurate Skeleton

Yan Xia, Xiaowei Zhou, Etienne Vouga et al.

CVPR 2025 • poster • arXiv:2503.21751
6 citations
#4192

Hypergraph Attacks via Injecting Homogeneous Nodes into Elite Hyperedges

Meixia He, Peican Zhu, Keke Tang et al.

AAAI 2025 • paper • arXiv:2412.18365
6 citations
#4193

YOLO-Count: Differentiable Object Counting for Text-to-Image Generation

Guanning Zeng, Xiang Zhang, Zirui Wang et al.

ICCV 2025 • poster • arXiv:2508.00728
6 citations
#4194

Aligning Language Models Using Follow-up Likelihood as Reward Signal

Chen Zhang, Dading Chong, Feng Jiang et al.

AAAI 2025 • paper • arXiv:2409.13948
6 citations
#4195

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

Huanpeng Chu, Wei Wu, Guanyu Feng et al.

ICCV 2025 • poster • arXiv:2508.16212
6 citations
#4196

Parameter Efficient Fine-tuning via Explained Variance Adaptation

Fabian Paischer, Lukas Hauzenberger, Thomas Schmied et al.

NeurIPS 2025 • poster • arXiv:2410.07170
6 citations
#4197

Jailbreak-AudioBench: In-Depth Evaluation and Analysis of Jailbreak Threats for Large Audio Language Models

Hao Cheng, Erjia Xiao, Jing Shao et al.

NeurIPS 2025 • poster • arXiv:2501.13772
6 citations
#4198

PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask

Jeongho Kim, Hoiyeong Jin, Sunghyun Park et al.

ICCV 2025 • poster • arXiv:2412.16978
6 citations
#4199

Autocorrelation Matters: Understanding the Role of Initialization Schemes for State Space Models

Fusheng Liu, Qianxiao Li

ICLR 2025 • oral • arXiv:2411.19455
6 citations
#4200

IMDPrompter: Adapting SAM to Image Manipulation Detection by Cross-View Automated Prompt Learning

Quan Zhang, Yuxin Qi, Xi Tang et al.

ICLR 2025 • poster • arXiv:2502.02454
6 citations