Papers by Fuxiang Zhang
4 papers found
Incentivizing LLMs to Self-Verify Their Answers
Fuxiang Zhang, Jiacheng Xu, Chaojie Wang et al.
NeurIPS 2025 poster
1 citation
Multi-Agent Imitation by Learning and Sampling from Factorized Soft Q-Function
Yi-Chen Li, Zhongxiang Ling, Tao Jiang et al.
NeurIPS 2025 poster
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu et al.
ICLR 2025 poster
5 citations
Policy Rehearsing: Training Generalizable Policies for Reinforcement Learning
Chengxing Jia, Chen-Xiao Gao, Hao Yin et al.
ICLR 2024 poster