ICML 2024 "reinforcement learning" Papers
74 papers found • Page 1 of 2
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Yen-Ju Chen, Nai-Chieh Huang, Ching-pei Lee et al.
Activation-Descent Regularization for Input Optimization of ReLU Networks
Hongzhan Yu, Sicun Gao
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
Zhihai Wang, Jie Wang, Dongsheng Zuo et al.
A Neural-Guided Dynamic Symbolic Network for Exploring Mathematical Expressions from Data
Wenqiang Li, Weijun Li, Lina Yu et al.
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke, Zaiwen Wen, Junyu Zhang
An Information Theoretic Approach to Interaction-Grounded Learning
Xiaoyan Hu, Farzan Farnia, Ho-fung Leung
Augmenting Decision with Hypothesis in Reinforcement Learning
Nguyen Minh Quang, Hady Lauw
Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Qingyuan Wu, Simon Zhan, Yixuan Wang et al.
Code as Reward: Empowering Reinforcement Learning with VLMs
David Venuto, Mohammad Sami Nur Islam, Martin Klissarov et al.
Cross-Domain Policy Adaptation by Capturing Representation Mismatch
Jiafei Lyu, Chenjia Bai, Jing-Wen Yang et al.
Dealing With Unbounded Gradients in Stochastic Saddle-point Optimization
Gergely Neu, Nneka Okolo
DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation
Yinjun Wu, Mayank Keoliya, Kan Chen et al.
Efficient Policy Evaluation with Offline Data Informed Behavior Policy Design
Shuze Liu, Shangtong Zhang
Efficient World Models with Context-Aware Tokenization
Vincent Micheli, Eloi Alonso, François Fleuret
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang, Shaohuai Liu, Weirui Ye et al.
EvoRainbow: Combining Improvements in Evolutionary Reinforcement Learning for Policy Search
Pengyi Li, Yan Zheng, Hongyao Tang et al.
Fair Resource Allocation in Multi-Task Learning
Hao Ban, Kaiyi Ji
Feedback Efficient Online Fine-Tuning of Diffusion Models
Masatoshi Uehara, Yulai Zhao, Kevin Black et al.
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Maciej Wołczyk, Bartłomiej Cupiał, Mateusz Ostaszewski et al.
Finite Time Logarithmic Regret Bounds for Self-Tuning Regulation
Rahul Singh, Akshay Mete, Avik Kar et al.
Fourier Controller Networks for Real-Time Decision-Making in Embodied Learning
Hengkai Tan, Songming Liu, Kai Ma et al.
FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu, Haichao Zhang, Di Wu et al.
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Paul Mattes, Rainer Schlosser, Ralf Herbrich
Implicit Bias of Policy Gradient in Linear Quadratic Control: Extrapolation to Unseen Initial States
Noam Razin, Yotam Alexander, Edo Cohen-Karlik et al.
Improving Token-Based World Models with Parallel Observation Prediction
Lior Cohen, Kaixin Wang, Bingyi Kang et al.
Iterative Regularized Policy Optimization with Imperfect Demonstrations
Xudong Gong, Dawei Feng, Kele Xu et al.
Knowledge-aware Reinforced Language Models for Protein Directed Evolution
Yuhao Wang, Qiang Zhang, Ming Qin et al.
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
Zelai Xu, Chao Yu, Fei Fang et al.
Learning Causal Dynamics Models in Object-Oriented Environments
Zhongwei Yu, Jingqing Ruan, Dengpeng Xing
Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making
Vivek Myers, Chongyi Zheng, Anca Dragan et al.
Learning the Target Network in Function Space
Kavosh Asadi, Yao Liu, Shoham Sabach et al.
Learning to Stabilize Online Reinforcement Learning in Unbounded State Spaces
Brahma Pavse, Matthew Zurek, Yudong Chen et al.
Linguistic Calibration of Long-Form Generations
Neil Band, Xuechen Li, Tengyu Ma et al.
LLM-Empowered State Representation for Reinforcement Learning
Boyuan Wang, Yun Qu, Yuhang Jiang et al.
Mixtures of Experts Unlock Parameter Scaling for Deep RL
Johan Obando Ceron, Ghada Sokar, Timon Willi et al.
Near-Optimal Regret in Linear MDPs with Aggregate Bandit Feedback
Asaf Cassel, Haipeng Luo, Aviv Rosenberg et al.
No-Regret Reinforcement Learning in Smooth MDPs
Davide Maran, Alberto Maria Metelli, Matteo Papini et al.
OMPO: A Unified Framework for RL under Policy and Dynamics Shifts
Yu Luo, Tianying Ji, Fuchun Sun et al.
On the Unexpected Effectiveness of Reinforcement Learning for Sequential Recommendation
Álvaro Labarca Silva, Denis Parra, Rodrigo A Toro Icarte
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
Chengjie Wu, Hao Hu, Yiqin Yang et al.
Policy-conditioned Environment Models are More Generalizable
Ruifeng Chen, Xiong-Hui Chen, Yihao Sun et al.
Position: Social Environment Design Should be Further Developed for AI-based Policy-Making
Edwin Zhang, Sadie Zhao, Tonghan Wang et al.
Principled Penalty-based Methods for Bilevel Reinforcement Learning and RLHF
Han Shen, Zhuoran Yang, Tianyi Chen
Probabilistic Constrained Reinforcement Learning with Formal Interpretability
Yanran Wang, Qiuchen Qian, David Boyle
Provably Efficient Reinforcement Learning for Adversarial Restless Multi-Armed Bandits with Unknown Transitions and Bandit Feedback
Guojun Xiong, Jian Li
Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent
Yingru Li, Jiawei Xu, Lei Han et al.
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design
Quan Nguyen, Adji Bousso Dieng
Random Latent Exploration for Deep Reinforcement Learning
Srinath Mahankali, Zhang-Wei Hong, Ayush Sekhari et al.
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber, Ana Busic, Jiamin Zhu
Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance
Jakub Svoboda, Suguman Bansal, Krishnendu Chatterjee