Papers by Chetan Bansal
4 papers found
AMPO: Active Multi Preference Optimization for Self-play Preference Selection
Taneesh Gupta, Rahul Madhavan, Xuchao Zhang et al.
ICML 2025 poster
Anyprefer: An Agentic Framework for Preference Data Synthesis
Yiyang Zhou, Zhaoyang Wang, Tianle Wang et al.
ICLR 2025 poster
CREAM: Consistency Regularized Self-Rewarding Language Models
Zhaoyang Wang, Weilei He, Zhiyuan Liang et al.
ICLR 2025 poster
Generative Caching for Structurally Similar Prompts and Responses
Sarthak Chakraborty, Suman Nath, Xuchao Zhang et al.
NeurIPS 2025 poster