Poster "reinforcement learning algorithms" Papers
4 papers found
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL
Jiarui Yao, Yifan Hao, Hanning Zhang et al.
NeurIPS 2025 poster, arXiv:2505.02391
11 citations
Position: Benchmarking is Limited in Reinforcement Learning Research
Scott Jordan, Adam White, Bruno da Silva et al.
ICML 2024 poster
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla, Ananye Agarwal, Deepak Pathak
ICML 2024 poster
Slow and Steady Wins the Race: Maintaining Plasticity with Hare and Tortoise Networks
Hojoon Lee, Hyeonseo Cho, Hyunseung Kim et al.
ICML 2024 poster