2025 "process reward models" Papers
9 papers found
DreamPRM: Domain-reweighted Process Reward Model for Multimodal Reasoning
Qi Cao, Ruiyi Wang, Ruiyi Zhang et al.
NEURIPS 2025 | poster | arXiv:2505.20241
5 citations
Know What You Don't Know: Uncertainty Calibration of Process Reward Models
Young-Jin Park, Kristjan Greenewald, Kaveh Alimohammadi et al.
NEURIPS 2025 | poster | arXiv:2506.09338
4 citations
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs
Jiaru Zou, Ling Yang, Jingwen Gu et al.
NEURIPS 2025 | poster | arXiv:2506.18896
22 citations
Reasoning Is Not a Race: When Stopping Early Beats Going Deeper
Mohan Zhang, Jiaxuan Gao, Shusheng Xu et al.
NEURIPS 2025 | poster
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning
Jiaqi Chen, Bang Zhang, Ruotian Ma et al.
NEURIPS 2025 | poster | arXiv:2504.19162
21 citations
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
Ruilin Luo, Zhuofan Zheng, Lei Wang et al.
NEURIPS 2025 | poster | arXiv:2501.04686
29 citations
Value-Guided Search for Efficient Chain-of-Thought Reasoning
Kaiwen Wang, Jin Zhou, Jonathan Chang et al.
NEURIPS 2025 | poster | arXiv:2505.17373
7 citations
VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models
Jiacheng Ruan, Wenzhen Yuan, Xian Gao et al.
ICCV 2025 | poster | arXiv:2503.07478
15 citations
Web-Shepherd: Advancing PRMs for Reinforcing Web Agents
Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.
NEURIPS 2025 | spotlight | arXiv:2505.15277
8 citations