"reinforcement learning fine-tuning" Papers
12 papers found
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao et al.
CVPR 2025, poster, arXiv:2411.18654
5 citations
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li et al.
ICLR 2025, poster, arXiv:2412.07762
27 citations
Moral Alignment for LLM Agents
Elizaveta Tennant, Stephen Hailes, Mirco Musolesi
ICLR 2025, oral, arXiv:2410.01639
25 citations
Regulatory DNA Sequence Design with Reinforcement Learning
Zhao Yang, Bing Su, Chuan Cao et al.
ICLR 2025, poster, arXiv:2503.07981
3 citations
ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning
Tonghe Zhang, Chao Yu, Sichang Su et al.
NeurIPS 2025, poster, arXiv:2505.22094
13 citations
SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation
Zhenyuan Qin, Xincheng Shuai, Henghui Ding
NeurIPS 2025, spotlight, arXiv:2511.16666
1 citation
ShiQ: Bringing back Bellman to LLMs
Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.
NeurIPS 2025, poster, arXiv:2505.11081
1 citation
Teaching Language Models to Reason with Tools
Chengpeng Li, Zhengyang Tang, Ziniu Li et al.
NeurIPS 2025, poster, arXiv:2510.20342
2 citations
VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding
Zongxia Li, Xiyang Wu, Guangyao Shi et al.
NeurIPS 2025, poster, arXiv:2505.01481
13 citations
Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration
Youngsoo Jang, Geon-Hyeong Kim, Byoungjip Kim et al.
ICML 2024, poster
Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving
Zhenghao Peng, Wenjie Luo, Yiren Lu et al.
ECCV 2024, poster, arXiv:2409.18343
23 citations
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang, Xiaoman Pan, Feng Luo et al.
ICML 2024, poster