"reinforcement learning fine-tuning" Papers

12 papers found

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

Haonan Han, Xiangzuo Wu, Huan Liao et al.

CVPR 2025 · poster · arXiv:2411.18654 · 5 citations

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Zhiyuan Zhou, Andy Peng, Qiyang Li et al.

ICLR 2025 · poster · arXiv:2412.07762 · 27 citations

Moral Alignment for LLM Agents

Elizaveta Tennant, Stephen Hailes, Mirco Musolesi

ICLR 2025 · oral · arXiv:2410.01639 · 25 citations

Regulatory DNA Sequence Design with Reinforcement Learning

Zhao Yang, Bing Su, Chuan Cao et al.

ICLR 2025 · poster · arXiv:2503.07981 · 3 citations

ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning

Tonghe Zhang, Chao Yu, Sichang Su et al.

NeurIPS 2025 · poster · arXiv:2505.22094 · 13 citations

SceneDesigner: Controllable Multi-Object Image Generation with 9-DoF Pose Manipulation

Zhenyuan Qin, Xincheng Shuai, Henghui Ding

NeurIPS 2025 · spotlight · arXiv:2511.16666 · 1 citation

ShiQ: Bringing back Bellman to LLMs

Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.

NeurIPS 2025 · poster · arXiv:2505.11081 · 1 citation

Teaching Language Models to Reason with Tools

Chengpeng Li, Zhengyang Tang, Ziniu Li et al.

NeurIPS 2025 · poster · arXiv:2510.20342 · 2 citations

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding

Zongxia Li, Xiyang Wu, Guangyao Shi et al.

NeurIPS 2025 · poster · arXiv:2505.01481 · 13 citations

Degeneration-free Policy Optimization: RL Fine-Tuning for Language Models without Degeneration

Youngsoo Jang, Geon-Hyeong Kim, Byoungjip Kim et al.

ICML 2024 · poster

Improving Agent Behaviors with RL Fine-tuning for Autonomous Driving

Zhenghao Peng, Wenjie Luo, Yiren Lu et al.

ECCV 2024 · poster · arXiv:2409.18343 · 23 citations

Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment

Rui Yang, Xiaoman Pan, Feng Luo et al.

ICML 2024 · poster