Mixture of Inputs: Text Generation Beyond Discrete Token Sampling

0 citations

#1334 of 5858 papers in NeurIPS 2025

Authors

Yufan Zhuang, Liyuan Liu, Chandan Singh, Jingbo Shang, Jianfeng Gao

Topics

autoregressive generation, token distribution preservation, Bayesian estimation, text generation, mathematical reasoning, code generation, PhD-level QA, continuous model input

Abstract

In standard autoregressive generation, an LLM predicts the next-token distribution, samples a discrete token, and then discards the distribution, passing only the sampled token as new input. To preserve this distribution’s rich information, we propose Mixture of Inputs (MoI), a training-free method for autoregressive generation. After generating a token following the standard paradigm, we construct a new input that blends the generated discrete token with the previously discarded token distribution. Specifically, we employ a Bayesian estimation method that treats the token distribution as the prior and the sampled token as the observation, and we replace the conventional one-hot vector with the continuous posterior expectation as the new model input. MoI allows the model to maintain a richer internal representation throughout the generation process, resulting in improved text quality and reasoning capabilities. On mathematical reasoning, code generation, and PhD-level QA tasks, MoI consistently improves performance across multiple models, including QwQ-32B, Nemotron-Super-49B, Gemma-3-27B, and DAPO-Qwen-32B, with no additional training and negligible computational overhead.
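The blending step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact estimator, so the code below assumes a Dirichlet prior whose concentration is proportional to the next-token distribution, updated with one observation of the sampled token. The names `mixture_of_inputs_weights`, `mixed_input_embedding`, `prior_strength`, and `obs_weight` are hypothetical and chosen only for illustration; the paper's actual parameterization may differ.

```python
import numpy as np


def mixture_of_inputs_weights(probs, sampled_id, prior_strength=1.0, obs_weight=1.0):
    """Posterior-mean mixing weights over the vocabulary.

    ASSUMPTION (not from the abstract): the prior is Dirichlet with
    concentration prior_strength * probs, and the sampled token counts as
    one observation with weight obs_weight.
    """
    probs = np.asarray(probs, dtype=np.float64)
    one_hot = np.zeros_like(probs)
    one_hot[sampled_id] = 1.0
    # Dirichlet posterior concentration: prior pseudo-counts plus the observation.
    posterior = prior_strength * probs + obs_weight * one_hot
    # The posterior expectation is the normalized concentration vector.
    return posterior / posterior.sum()


def mixed_input_embedding(weights, embedding_matrix):
    """Replace the one-hot lookup with a weighted average of token embeddings."""
    return weights @ embedding_matrix


# Toy example: a 3-token vocabulary with identity embeddings.
probs = np.array([0.5, 0.3, 0.2])
embeddings = np.eye(3)
weights = mixture_of_inputs_weights(probs, sampled_id=1)
mixed = mixed_input_embedding(weights, embeddings)
print(weights)  # [0.25 0.65 0.1 ]
```

Under these assumptions, a large `obs_weight` recovers the usual one-hot input, while a large `prior_strength` pushes the input toward the full predicted distribution.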
