2025 "reinforcement learning" Papers
177 papers found
OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning
Alexandre Oliveira, Katarina Dyreby, Francisco Caldas et al.
Parameter Efficient Fine-tuning via Explained Variance Adaptation
Fabian Paischer, Lukas Hauzenberger, Thomas Schmied et al.
Pareto Prompt Optimization
Guang Zhao, Byung-Jun Yoon, Gilchan Park et al.
Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems
Christian Walder, Deep Tejas Karkhanis
Periodic Skill Discovery
Jonghae Park, Daesol Cho, Jusuk Lee et al.
Personalized Exercise Recommendation with Semantically-Grounded Knowledge Tracing
Yilmazcan Ozyurt, Tunaberk Almaci, Stefan Feuerriegel et al.
Policy Gradient with Kernel Quadrature
Tetsuro Morimura, Satoshi Hayakawa
Preference Distillation via Value based Reinforcement Learning
Minchan Kwon, Junwon Ko, Kangil Kim et al.
Progress Reward Model for Reinforcement Learning via Large Language Models
Xiuhui Zhang, Ning Gao, Xingyu Jiang et al.
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models
Mingjie Liu, Shizhe Diao, Ximing Lu et al.
Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control
Zijie Xu, Tong Bu, Zecheng Hao et al.
RAST: Reasoning Activation in LLMs via Small-model Transfer
Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.
Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)
Zhenjie Yang, Xiaosong Jia, Qifeng Li et al.
ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding
Yiyang Zhou, Yangfan He, Yaofeng Su et al.
Real-World Reinforcement Learning of Active Perception Behaviors
Edward Hu, Jie Wang, Xingfang Yuan et al.
Reasoning as an Adaptive Defense for Safety
Taeyoun Kim, Fahim Tajwar, Aditi Raghunathan et al.
Reducing the Probability of Undesirable Outputs in Language Models Using Probabilistic Inference
Stephen Zhao, Aidan Li, Rob Brekelmans et al.
Reinforced Active Learning for Large-Scale Virtual Screening with Learnable Policy Model
Yicong Chen, Jiahua Rao, Jiancong Xie et al.
Reinforcement Learning for Out-of-Distribution Reasoning in LLMs: An Empirical Study on Diagnosis-Related Group Coding
Hanyin Wang, Zhenbang Wu, Gururaj Kolar et al.
Reinforcement Learning from Imperfect Corrective Actions and Proxy Rewards
Zhaohui Jiang, Xuening Feng, Paul Weng et al.
Reinforcement Learning-Guided Data Selection via Redundancy Assessment
Suorong Yang, Peijia Li, Furao Shen et al.
Reinforcement learning with combinatorial actions for coupled restless bandits
Lily Xu, Bryan Wilder, Elias Khalil et al.
Reinforcement Learning with Imperfect Transition Predictions: A Bellman-Jensen Approach
Chenbei Lu, Zaiwei Chen, Tongxin Li et al.
Reinforcing the Diffusion Chain of Lateral Thought with Diffusion Language Models
Zemin Huang, Zhiyang Chen, Zijun Wang et al.
REM: A Scalable Reinforced Multi-Expert Framework for Multiplex Influence Maximization
Huyen Nguyen, Hieu Dam, Nguyen Hoang Khoi Do et al.
Rendering-Aware Reinforcement Learning for Vector Graphics Generation
Juan Rodriguez, Haotian Zhang, Abhay Puri et al.
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
Mingyang Chen, Linzhuang Sun, Tianpeng Li et al.
Retro-R1: LLM-based Agentic Retrosynthesis
Wei Liu, Jiangtao Feng, Hongli Yu et al.
Reverse Engineering Human Preferences with Reinforcement Learning
Lisa Alazraki, Yi-Chern Tan, Jon Ander Campos et al.
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Jorge (Zhoujun) Cheng, Shibo Hao, Tianyang Liu et al.
REvolve: Reward Evolution with Large Language Models using Human Feedback
Rishi Hazra, Alkis Sygkounas, Andreas Persson et al.
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan, Wencheng Han, Xia Zhou et al.
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim, Huiwon Jang, Sumin Park et al.
Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning
Haozhen Zhang, Tao Feng, Jiaxuan You
Safety Representations for Safer Policy Learning
Kaustubh Mani, Vincent Mai, Charlie Gauthier et al.
SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
Jiaqi Huang, Zunnan Xu, Jun Zhou et al.
Scaling RL to Long Videos
Yukang Chen, Wei Huang, Baifeng Shi et al.
Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation
Zilyu Ye, Zhiyang Chen, Tiancheng Li et al.
Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models
Yiran Guo, Lijie Xu, Jie Liu et al.
Self-Challenging Language Model Agents
Yifei Zhou, Sergey Levine, Jason Weston et al.
Selftok-Zero: Reinforcement Learning for Visual Generation via Discrete and Autoregressive Visual Tokens
Bohan Wang, Mingze Zhou, Zhongqi Yue et al.
Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning
Tian-Shuo Liu, Xu-Hui Liu, Ruifeng Chen et al.
Sequential Attention-based Sampling for Histopathological Analysis
Tarun Gogisetty, Naman Malpani, Gugan Chandrashekhar Mallika Thoppe et al.
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
Shift Before You Learn: Enabling Low-Rank Representations in Reinforcement Learning
Bastien Dubail, Stefan Stojanovic, Alexandre Proutiere
Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling
Yitian Chen, Jingfan Xia, Siyu Shao et al.
SORREL: Suboptimal-Demonstration-Guided Reinforcement Learning for Learning to Branch
Shengyu Feng, Yiming Yang
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
Jiaqi Chen, Bang Zhang, Ruotian Ma et al.
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Peixian Ma, Xialie Zhuang, Chengjin Xu et al.
Stabilizing Reinforcement Learning in Differentiable Multiphysics Simulation
Eliot Xing, Vernon Luk, Jean Oh