Papers by Shuohuan Wang
3 papers found
MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions
Yekun Chai, Haoran Sun, Huang Fang et al.
ICLR 2025, oral, arXiv:2410.02743
Mixture of Hidden-Dimensions: Not All Hidden-States’ Dimensions are Needed in Transformer
Yilong Chen, Junyuan Shang, Zhenyu Zhang et al.
ICML 2025, poster
Tool-Augmented Reward Modeling
Lei Li, Yekun Chai, Shuohuan Wang et al.
ICLR 2024, spotlight