ICML "policy gradient methods" Papers
15 papers found
Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces
Ziyi Chen, Heng Huang
ICML 2024 poster
Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning
Yen-Ju Chen, Nai-Chieh Huang, Ching-pei Lee et al.
ICML 2024 poster arXiv:2310.11897
Do Transformer World Models Give Better Policy Gradients?
Michel Ma, Tianwei Ni, Clement Gehring et al.
ICML 2024 poster arXiv:2402.05290
GFlowNet Training by Policy Gradients
Puhua Niu, Shili Wu, Mingzhou Fan et al.
ICML 2024 poster
How to Explore with Belief: State Entropy Maximization in POMDPs
Riccardo Zamboni, Duilio Cirino, Marcello Restelli et al.
ICML 2024 poster arXiv:2406.02295
Learning Optimal Deterministic Policies with Stochastic Policy Gradients
Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli et al.
ICML 2024 spotlight arXiv:2405.02235
Major-Minor Mean Field Multi-Agent Reinforcement Learning
Kai Cui, Christian Fabian, Anam Tahir et al.
ICML 2024 poster
Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning
Kakei Yamamoto, Kazusato Oko, Zhuoran Yang et al.
ICML 2024 oral
Mollification Effects of Policy Gradient Methods
Tao Wang, Sylvia Herbert, Sicun Gao
ICML 2024 poster
Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation
Yudan Wang, Yue Wang, Yi Zhou et al.
ICML 2024 oral arXiv:2406.01762
Optimistic Multi-Agent Policy Gradient
Wenshuai Zhao, Yi Zhao, Zhiyuan Li et al.
ICML 2024 poster arXiv:2311.01953
Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient
Ju-Hyun Kim, Seungki Min
ICML 2024 poster
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla, Ananye Agarwal, Deepak Pathak
ICML 2024 poster arXiv:2407.20230
Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process
Xiangxin Zhou, Liang Wang, Yichi Zhou
ICML 2024 poster arXiv:2403.04154
Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles
Bhrij Patel, Wesley A. Suttle, Alec Koppel et al.
ICML 2024 poster arXiv:2403.11925