Poster "mathematical reasoning" Papers

51 papers found • Page 1 of 2

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.

NEURIPS 2025arXiv:2505.20686
12
citations

Advancing LLM Reasoning Generalists with Preference Trees

Lifan Yuan, Ganqu Cui, Hanbin Wang et al.

ICLR 2025arXiv:2404.02078
183
citations

Advancing Mathematical Reasoning in Language Models: The Impact of Problem-Solving Data, Data Synthesis Methods, and Training Stages

Zui Chen, Tianqiao Liu, Tongqing et al.

ICLR 2025arXiv:2501.14002
12
citations

Analyzing the Power of Chain of Thought through Memorization Capabilities

Lijia Yu, Xiao-Shan Gao, Lijun Zhang

NEURIPS 2025arXiv:2511.01190

Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models

Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.

ICLR 2025arXiv:2410.18252
43
citations

Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning

Jiayu Wang, Yifei Ming, Zixuan Ke et al.

NEURIPS 2025arXiv:2506.04723
2
citations

Can LLMs Solve Longer Math Word Problems Better?

Xin Xu, Tong Xiao, Zitong Chao et al.

ICLR 2025arXiv:2405.14804
26
citations

Chain of Execution Supervision Promotes General Reasoning in Large Language Models

Nuo Chen, Zehua Li, Keqin Bao et al.

NEURIPS 2025arXiv:2510.23629

ChatVLA-2: Vision-Language-Action Model with Open-World Reasoning

Zhongyi Zhou, Yichen Zhu, Xiaoyu Liu et al.

NEURIPS 2025

CLAWS:Creativity detection for LLM-generated solutions using Attention Window of Sections

Keuntae Kim, Eunhye Jeong, Sehyeon Lee et al.

NEURIPS 2025

Conformal Language Model Reasoning with Coherent Factuality

Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji et al.

ICLR 2025arXiv:2505.17126
6
citations

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization

Gang Li, Ming Lin, Tomer Galanti et al.

NEURIPS 2025arXiv:2505.12366
12
citations

Don’t Think Longer, Think Wisely: Optimizing Thinking Dynamics for Large Reasoning Models

Sohyun An, Ruochen Wang, Tianyi Zhou et al.

NEURIPS 2025
11
citations

Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling

Xinglin Wang, Yiwei Li, Shaoxiong Feng et al.

NEURIPS 2025arXiv:2506.15707
5
citations

GRIP: A Graph-Based Reasoning Instruction Producer

Jiankang Wang, Jianjun Xu, Xiaorui Wang et al.

NEURIPS 2025arXiv:2412.08864
2
citations

Know What You Don't Know: Uncertainty Calibration of Process Reward Models

Young-Jin Park, Kristjan Greenewald, Kaveh Alimohammadi et al.

NEURIPS 2025arXiv:2506.09338
4
citations

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj et al.

ICLR 2025arXiv:2410.01335
15
citations

LeanAgent: Lifelong Learning for Formal Theorem Proving

Adarsh Kumarappan, Mohit Tiwari, Peiyang Song et al.

ICLR 2025arXiv:2410.06209
12
citations

Learning to Focus: Causal Attention Distillation via Gradient‐Guided Token Pruning

Yiju Guo, Wenkai Yang, Zexu Sun et al.

NEURIPS 2025arXiv:2506.07851
4
citations

Lookahead Routing for Large Language Models

Canbin Huang, Tianyuan Shi, Yuhua Zhu et al.

NEURIPS 2025arXiv:2510.19506

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025arXiv:2410.08196
28
citations

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Yicheng Xiao, Lin Song, Yukang Chen et al.

NEURIPS 2025arXiv:2505.13031
20
citations

Mixture of Inputs: Text Generation Beyond Discrete Token Sampling

Yufan Zhuang, Liyuan Liu, Chandan Singh et al.

NEURIPS 2025

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver

Zhenting Qi, Mingyuan MA, Jiahang Xu et al.

ICLR 2025arXiv:2408.06195
129
citations

OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization

Yiyou Sun, Shawn Hu, Georgia Zhou et al.

NEURIPS 2025arXiv:2506.18880
32
citations

On Extending Direct Preference Optimization to Accommodate Ties

Jinghong Chen, Guangyu Yang, Weizhe Lin et al.

NEURIPS 2025arXiv:2409.17431
7
citations

OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling

Zhicheng YANG, Yiwei Wang, Yinya Huang et al.

ICLR 2025arXiv:2407.09887
31
citations

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Jiarui Yao, Yifan Hao, Hanning Zhang et al.

NEURIPS 2025arXiv:2505.02391
13
citations

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Tian Ye, Zicheng Xu, Yuanzhi Li et al.

ICLR 2025arXiv:2407.20311
100
citations

Preference Optimization for Reasoning with Pseudo Feedback

Fangkai Jiao, Geyang Guo, Xingxing Zhang et al.

ICLR 2025arXiv:2411.16345
35
citations

Progress or Regress? Self-Improvement Reversal in Post-training

Ting Wu, Xuefeng Li, Pengfei Liu

ICLR 2025arXiv:2407.05013
19
citations

Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning

Zenan Li, Zhaoyu Li, Wen Tang et al.

ICLR 2025arXiv:2502.13834
14
citations

RaSA: Rank-Sharing Low-Rank Adaptation

Zhiwei He, Zhaopeng Tu, Xing Wang et al.

ICLR 2025arXiv:2503.12576
5
citations

RAST: Reasoning Activation in LLMs via Small-model Transfer

Siru Ouyang, Xinyu Zhu, Zilin Xiao et al.

NEURIPS 2025arXiv:2506.15710
2
citations

ReMA: Learning to Meta-Think for LLMs with Multi-agent Reinforcement Learning

Ziyu Wan, Yunxiang Li, Xiaoyu Wen et al.

NEURIPS 2025arXiv:2503.09501
40
citations

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning

Jiaqi Chen, Bang Zhang, Ruotian Ma et al.

NEURIPS 2025arXiv:2504.19162
23
citations

SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction

Ling Yang, Zhaochen Yu, Tianjun Zhang et al.

ICLR 2025arXiv:2410.09008
15
citations

Teaching Language Models to Reason with Tools

Chengpeng Li, Zhengyang Tang, Ziniu Li et al.

NEURIPS 2025arXiv:2510.20342
2
citations

The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning

Xinyu Zhu, Mengzhou Xia, Zhepei Wei et al.

NEURIPS 2025arXiv:2506.01347
89
citations

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Zayne Sprague, Fangcong Yin, Juan Rodriguez et al.

ICLR 2025arXiv:2409.12183
250
citations

Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning

Wenkai Yang, Shuming Ma, Yankai Lin et al.

NEURIPS 2025arXiv:2502.18080
103
citations

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

Xiaoyuan Liu, Tian Liang, Zhiwei He et al.

NEURIPS 2025arXiv:2505.13445
18
citations

UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models

Xin Xu, Jiaxin ZHANG, Tianhao Chen et al.

ICLR 2025arXiv:2501.13766
14
citations

VinePPO: Refining Credit Assignment in RL Training of LLMs

Amirhossein Kazemnejad, Milad Aghajohari, Eva Portelance et al.

ICML 2025arXiv:2410.01679
56
citations

VLRMBench: A Comprehensive and Challenging Benchmark for Vision-Language Reward Models

JIACHENG RUAN, Wenzhen Yuan, Xian Gao et al.

ICCV 2025arXiv:2503.07478
15
citations

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Haipeng Luo, Qingfeng Sun, Can Xu et al.

ICLR 2025arXiv:2308.09583
655
citations

Improving Factuality and Reasoning in Language Models through Multiagent Debate

Yilun Du, Shuang Li, Antonio Torralba et al.

ICML 2024arXiv:2305.14325
1274
citations

In-Context Principle Learning from Mistakes

Tianjun Zhang, Aman Madaan, Luyu Gao et al.

ICML 2024arXiv:2402.05403
40
citations

Interpreting and Improving Large Language Models in Arithmetic Calculation

Wei Zhang, Wan Chaoqun, Yonggang Zhang et al.

ICML 2024arXiv:2409.01659
42
citations

MathScale: Scaling Instruction Tuning for Mathematical Reasoning

Zhengyang Tang, Xingxing Zhang, Benyou Wang et al.

ICML 2024arXiv:2403.02884
146
citations
PreviousNext