Rafael Rafailov
6 Papers
57 Total Citations
Papers (6)
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?
NeurIPS 2025, arXiv
57 citations
Diffusion Model Alignment Using Direct Preference Optimization
CVPR 2024
0 citations
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
ICML 2024
0 citations
Visual Adversarial Imitation Learning using Variational Models
NeurIPS 2021
0 citations
COMBO: Conservative Offline Model-Based Policy Optimization
NeurIPS 2021
0 citations
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
NeurIPS 2023
0 citations