"reinforcement learning" Papers
206 papers found • Page 4 of 5
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers, Chongyi Zheng, Anca Dragan et al.
Learning the Target Network in Function Space
Kavosh Asadi, Yao Liu, Shoham Sabach et al.
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Brahma Pavse, Matthew Zurek, Yudong Chen et al.
Learning Uncertainty-Aware Temporally-Extended Actions
Joongkyu Lee, Seung Joon Park, Yunhao Tang et al.
Linguistic Calibration of Long-Form Generations
Neil Band, Xuechen Li, Tengyu Ma et al.
LLM-Empowered State Representation for Reinforcement Learning
Boyuan Wang, Yun Qu, Yuhang Jiang et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando Ceron, Ghada Sokar, Timon Willi et al.
Multimodal Label Relevance Ranking via Reinforcement Learning
Taian Guo, Taolin Zhang, Haoqian Wu et al.
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf Cassel, Haipeng Luo, Aviv Rosenberg et al.
No-Regret Reinforcement Learning in Smooth MDPs
Davide Maran, Alberto Maria Metelli, Matteo Papini et al.
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu Luo, Tianying Ji, Fuchun Sun et al.
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
Álvaro Labarca Silva, Denis Parra, Rodrigo A Toro Icarte
OVD-Explorer: Optimism Should Not Be the Sole Pursuit of Exploration in Noisy Environments
Jinyi Liu, Zhi Wang, Yan Zheng et al.
Parameterized Projected Bellman Operator
Théo Vincent, Alberto Maria Metelli, Boris Belousov et al.
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Chengjie Wu, Hao Hu, Yiqin Yang et al.
Policy-conditioned Environment Models are More Generalizable
Ruifeng Chen, Xiong-Hui Chen, Yihao Sun et al.
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
Edwin Zhang, Sadie Zhao, Tonghan Wang et al.
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Yanran Wang, Qiuchen Qian, David Boyle
Prompt to Transfer: Sim-to-Real Transfer for Traffic Signal Control with Prompt Learning
Longchao Da, Minquan Gao, Hua Wei et al.
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong, Jian Li
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li, Jiawei Xu, Lei Han et al.
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Quan Nguyen, Adji Bousso Dieng
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali, Zhang-Wei Hong, Ayush Sekhari et al.
Rating-Based Reinforcement Learning
Devin White, Mingkang Wu, Ellen Novoseller et al.
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber, Ana Busic, Jiamin Zhu
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
Jakub Svoboda, Suguman Bansal, Krishnendu Chatterjee
Reinforcement Learning within Tree Search for Fast Macro Placement
Zijie Geng, Jie Wang, Ziyan Liu et al.
Remembering to Be Fair: Non-Markovian Fairness in Sequential Decision Making
Parand A. Alamdari, Toryn Q. Klassen, Elliot Creager et al.
Rethinking Transformers in Solving POMDPs
Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.
Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning
Mohamed Elsayed, Homayoon Farrahi, Felix Dangel et al.
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
Haozhe Ma, Kuankuan Sima, Thanh Vinh Vo et al.
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.
RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation
Zelei Cheng, Xian Wu, Jiahao Yu et al.
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Yuda Song, Lili Wu, Dylan Foster et al.
Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient
Ju-Hyun Kim, Seungki Min
RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning
Boning Li, Zhixuan Fang, Longbo Huang
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Yufei Wang, Zhou Xian, Feng Chen et al.
Robust Optimization in Protein Fitness Landscapes Using Reinforcement Learning in Latent Space
Minji Lee, Luiz Felipe Vecchietti, Hyunkyu Jung et al.
Run-Time Task Composition with Safety Semantics
Kevin Leahy, Makai Mann, Zachary Serlin
Sample Average Approximation for Conditional Stochastic Optimization with Dependent Data
Yafei Wang, Bo Pan, Mei Li et al.
Sample Efficient Reinforcement Learning with Partial Dynamics Knowledge
Meshal Alharbi, Mardavij Roozbehani, Munther Dahleh
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
Matthias Weissenbacher, Rishabh Agarwal, Yoshinobu Kawahara
Stochastic Q-learning for Large Discrete Action Spaces
Fares Fourati, Vaneet Aggarwal, Mohamed-Slim Alouini
Successor Features for Efficient Multi-Subject Controlled Text Generation
Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung et al.
Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)
Qifeng Li, Xiaosong Jia, Shaobo Wang et al.
To the Max: Reinventing Reward in Reinforcement Learning
Grigorii Veviurko, Wendelin Boehmer, Mathijs de Weerdt
Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Haoran Li, Zicheng Zhang, Wang Luo et al.
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Zhiheng Xi, Wenxiang Chen, Boyang Hong et al.
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Ganqu Cui, Lifan Yuan, Ning Ding et al.