Poster "variance reduction techniques" Papers
3 papers found
Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition
Zhong Zheng, Haochen Zhang, Lingzhou Xue
ICLR 2025 poster, arXiv:2410.07574
9 citations
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang et al.
ICML 2024 poster
Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models
Tanmay Gautam, Youngsuk Park, Hao Zhou et al.
ICML 2024 poster