NEURIPS 2025 "process supervision" Papers
2 papers found
Large language models can learn and generalize steganographic chain-of-thought under process supervision
Robert McCarthy, Joey Skaf, Luis Ibanez-Lissen et al.
NEURIPS 2025 · poster · arXiv:2506.01926
11 citations
Unlocking Multimodal Mathematical Reasoning via Process Reward Model
Ruilin Luo, Zhuofan Zheng, Lei Wang et al.
NEURIPS 2025 · poster · arXiv:2501.04686
29 citations