ICLR Poster "reasoning benchmarks" Papers
2 papers found
As Simple as Fine-tuning: LLM Alignment via Bidirectional Negative Feedback Loss
Xin Mao, Huimin Xu, Feng-Lin Li et al.
ICLR 2025 poster, arXiv:2410.04834
3 citations
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu et al.
ICLR 2025 poster, arXiv:2308.09583
637 citations