From Self-Check to Consensus: Bayesian Strategic Decoding in Large Language Models

NeurIPS 2025

Abstract

Large Language Models exhibit logical inconsistency across multi-turn inference, undermining correctness on complex reasoning tasks. A central challenge is ensuring that outputs align with both factual correctness and human intent. Approaches such as single-agent reflection and multi-agent debate frequently prioritize consistency at the expense of accuracy. To address this problem, we propose a novel game-theoretic consensus mechanism that enables LLMs to self-check their outputs during decoding. Our method models the decoding process as a multi-stage Bayesian Decoding Game, in which strategic interactions dynamically converge to a consensus on the most reliable outputs without human feedback or additional training. Remarkably, the game design allows smaller models to outperform much larger ones (e.g., 78.1 for LLaMA-13B vs. 76.6 for PaLM-540B). As a model-agnostic method, our approach consistently improves even the latest models, enhancing DeepSeek-7B's performance on MMLU by 12.4%. Our framework effectively balances correctness and consistency, demonstrating that properly designed game-theoretic mechanisms can significantly enhance the self-verification capabilities of language models across various tasks and model architectures.
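
The abstract does not specify the game's payoff structure or update rule, so the following is only a minimal sketch of the general idea it describes: two decoding-time "players" (a generator proposing candidate answers and a verifier judging them) iteratively update their strategies until they agree on one candidate, with no extra training or human feedback. The function name `consensus_decode`, the multiplicative-weights-style update, and the toy scores are assumptions for illustration, not the paper's actual mechanism.

```python
import numpy as np

def consensus_decode(gen_logprobs, ver_logprobs, rounds=50, eta=0.1):
    """Toy consensus-style decoding over a fixed set of candidate answers.

    gen_logprobs: generator's log-score for each candidate answer.
    ver_logprobs: verifier's log-score that each candidate is correct.
    Returns the index of the candidate the two players converge on.
    """
    # Initial strategies: softmax of each player's own scores.
    p_gen = np.exp(gen_logprobs - np.max(gen_logprobs))
    p_gen /= p_gen.sum()
    p_ver = np.exp(ver_logprobs - np.max(ver_logprobs))
    p_ver /= p_ver.sum()

    for _ in range(rounds):
        # Each player is rewarded for agreeing with the other (a consensus
        # payoff) and updated multiplicatively; repeated rounds pull both
        # distributions toward a shared answer.
        p_gen *= np.exp(eta * p_ver)
        p_gen /= p_gen.sum()
        p_ver *= np.exp(eta * p_gen)
        p_ver /= p_ver.sum()

    return int(np.argmax(p_gen * p_ver))

# Usage: three candidate answers scored by the same model in two roles.
gen_scores = np.log(np.array([0.5, 0.3, 0.2]))  # generator prefers candidate 0
ver_scores = np.log(np.array([0.2, 0.7, 0.1]))  # verifier prefers candidate 1
print(consensus_decode(gen_scores, ver_scores))
```

Under these assumed payoffs the update is symmetric, so whichever candidate has the highest combined support dominates after a few rounds; the paper's actual game presumably uses richer, stage-dependent payoffs to balance correctness against consistency.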
