ICML "policy gradient methods" Papers

15 papers found

Accelerated Policy Gradient for s-rectangular Robust MDPs with Large State Spaces

Ziyi Chen, Heng Huang

ICML 2024, poster

Accelerated Policy Gradient: On the Convergence Rates of the Nesterov Momentum for Reinforcement Learning

Yen-Ju Chen, Nai-Chieh Huang, Ching-pei Lee et al.

ICML 2024, poster, arXiv:2310.11897

Do Transformer World Models Give Better Policy Gradients?

Michel Ma, Tianwei Ni, Clement Gehring et al.

ICML 2024, poster, arXiv:2402.05290

GFlowNet Training by Policy Gradients

Puhua Niu, Shili Wu, Mingzhou Fan et al.

ICML 2024, poster

How to Explore with Belief: State Entropy Maximization in POMDPs

Riccardo Zamboni, Duilio Cirino, Marcello Restelli et al.

ICML 2024, poster, arXiv:2406.02295

Learning Optimal Deterministic Policies with Stochastic Policy Gradients

Alessandro Montenegro, Marco Mussi, Alberto Maria Metelli et al.

ICML 2024, spotlight, arXiv:2405.02235

Major-Minor Mean Field Multi-Agent Reinforcement Learning

Kai Cui, Christian Fabian, Anam Tahir et al.

ICML 2024, poster

Mean Field Langevin Actor-Critic: Faster Convergence and Global Optimality beyond Lazy Learning

Kakei Yamamoto, Kazusato Oko, Zhuoran Yang et al.

ICML 2024, oral

Mollification Effects of Policy Gradient Methods

Tao Wang, Sylvia Herbert, Sicun Gao

ICML 2024, poster

Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

Yudan Wang, Yue Wang, Yi Zhou et al.

ICML 2024, oral, arXiv:2406.01762

Optimistic Multi-Agent Policy Gradient

Wenshuai Zhao, Yi Zhao, Zhiyuan Li et al.

ICML 2024, poster, arXiv:2311.01953

Risk-Sensitive Policy Optimization via Predictive CVaR Policy Gradient

Ju-Hyun Kim, Seungki Min

ICML 2024, poster

SAPG: Split and Aggregate Policy Gradients

Jayesh Singla, Ananye Agarwal, Deepak Pathak

ICML 2024, poster, arXiv:2407.20230

Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process

Xiangxin Zhou, Liang Wang, Yichi Zhou

ICML 2024, poster, arXiv:2403.04154

Towards Global Optimality for Practical Average Reward Reinforcement Learning without Mixing Time Oracles

Bhrij Patel, Wesley A. Suttle, Alec Koppel et al.

ICML 2024, poster, arXiv:2403.11925