"preference feedback" Papers
3 papers found
Non-Stationary Dueling Bandits Under a Weighted Borda Criterion
Joe Suk, Arpit Agarwal
ICLR 2025 poster, arXiv:2403.12950
2 citations
Reward Learning from Multiple Feedback Types
Yannick Metz, Andras Geiszl, Raphaël Baur et al.
ICLR 2025 poster, arXiv:2502.21038
4 citations
Coactive Learning for Large Language Models using Implicit User Feedback
Aaron D. Tucker, Kianté Brantley, Adam Cahall et al.
ICML 2024 poster