NeurIPS Poster "reinforcement learning" Papers
50 papers found
$\texttt{G1}$: Teaching LLMs to Reason on Graphs with Reinforcement Learning
Xiaojun Guo, Ang Li, Yifei Wang et al.
A Differential and Pointwise Control Approach to Reinforcement Learning
Minh Nguyen, Chandrajit Bajaj
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications
Zhenyu Tao, Wei Xu, Xiaohu You
Continual Knowledge Adaptation for Reinforcement Learning
Jinwu Hu, ZiHao Lian, Zhiquan Wen et al.
Cypher-RI: Reinforcement Learning for Integrating Schema Selection into Cypher Generation
Hanchen Su, Xuyuan Li, Yan Zhou et al.
Delving into RL for Image Generation with CoT: A Study on DPO vs. GRPO
Chengzhuo Tong, Ziyu Guo, Renrui Zhang et al.
Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph
Mingxuan Ye, Jie Wang, Fangzhou et al.
Equilibrium Policy Generalization: A Reinforcement Learning Framework for Cross-Graph Zero-Shot Generalization in Pursuit-Evasion Games
Runyu Lu, Peng Zhang, Ruochuan Shi et al.
EvolvedGRPO: Unlocking Reasoning in LVLMs via Progressive Instruction Evolution
Zhebei Shen, Qifan Yu, Juncheng Li et al.
From Kolmogorov to Cauchy: Shallow XNet Surpasses KANs
Xin Li, Xiaotao Zheng, Zhihong Xia
Globally Optimal Policy Gradient Algorithms for Reinforcement Learning with PID Control Policies
Vipul Sharma, Wesley Suttle, S Sivaranjani
GoalLadder: Incremental Goal Discovery with Vision-Language Models
Alexey Zakharov, Shimon Whiteson
GraphChain: Large Language Models for Large-scale Graph Analysis via Tool Chaining
Chunyu Wei, Wenji Hu, Xingjia Hao et al.
HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving
Zhiwen Chen, Hanming Deng, Zhuoren Li et al.
Improving Monte Carlo Tree Search for Symbolic Regression
Zhengyao Huang, Daniel Huang, Tiannan Xiao et al.
Iterative Foundation Model Fine-Tuning on Multiple Rewards
Pouya M. Ghari, simone sciabola, Ye Wang
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
Kaihang Pan, Yang Wu, Wendong Bu et al.
MiCo: Multi-image Contrast for Reinforcement Visual Reasoning
Xi Chen, Mingkang Zhu, Shaoteng Liu et al.
Modelling the control of offline processing with reinforcement learning
Eleanor Spens, Neil Burgess, Tim Behrens
MRO: Enhancing Reasoning in Diffusion Language Models via Multi-Reward Optimization
Chenglong Wang, Yang Gan, Hang Zhou et al.
Multi-Agent Collaboration via Evolving Orchestration
Yufan Dang, Chen Qian, Xueheng Luo et al.
NoisyGRPO: Incentivizing Multimodal CoT Reasoning via Noise Injection and Bayesian Estimation
Longtian Qiu, Shan Ning, Jiaxuan Sun et al.
Online Estimation and Inference for Robust Policy Evaluation in Reinforcement Learning
Weidong Liu, Jiyuan Tu, Xi Chen et al.
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Jiacai Liu, Wenye Li, Dachao Lin et al.
Optimal Dynamic Regret by Transformers for Non-Stationary Reinforcement Learning
Baiyuan Chen, Shinji Ito, Masaaki Imaizumi
OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning
Alexandre Oliveira, Katarina Dyreby, Francisco Caldas et al.
Parameter Efficient Fine-tuning via Explained Variance Adaptation
Fabian Paischer, Lukas Hauzenberger, Thomas Schmied et al.
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing
Yilmazcan Ozyurt, Tunaberk Almaci, Stefan Feuerriegel et al.
Progress Reward Model for Reinforcement Learning via Large Language Models
Xiuhui Zhang, Ning Gao, Xingyu Jiang et al.
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Mingjie Liu, Shizhe Diao, Ximing Lu et al.
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu, Tong Bu, Zecheng Hao et al.
Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
Zhenjie Yang, Xiaosong Jia, Qifeng Li et al.
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
Yiyang Zhou, Yangfan He, Yaofeng Su et al.
Real-World Reinforcement Learning of Active Perception Behaviors
Edward Hu, Jie Wang, Xingfang Yuan et al.
Reasoning as an Adaptive Defense for Safety
Taeyoun Kim, Fahim Tajwar, Aditi Raghunathan et al.
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Zemin Huang, Zhiyang Chen, Zijun Wang et al.
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
Mingyang Chen, Linzhuang Sun, Tianpeng Li et al.
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Jorge (Zhoujun) Cheng, Shibo Hao, Tianyang Liu et al.
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan, Wencheng Han, xia zhou et al.
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim, Huiwon Jang, Sumin Park et al.
Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
Haozhen Zhang, Tao Feng, Jiaxuan You
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Yitian Chen, Jingfan Xia, Siyu Shao et al.
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Junteng Liu, Yuanxiang Fan, Jiang Zhuo et al.
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Dongzhi JIANG, Ziyu Guo, Renrui Zhang et al.
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.
The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning
Xinyu Zhu, Mengzhou Xia, Zhepei Wei et al.
Uni-RL: Unifying Online and Offline RL via Implicit Value Regularization
Haoran Xu, Liyuan Mao, Hui Jin et al.
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
Ruilin Luo, Zhuofan Zheng, Lei Wang et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.