"text generation" Papers
19 papers found
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time
Justin Deschenaux, Caglar Gulcehre
Chunk-Distilled Language Modeling
Yanhong Li, Karen Livescu, Jiawei Zhou
Concept Bottleneck Large Language Models
Chung-En Sun, Tuomas Oikarinen, Berk Ustun et al.
d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning
Siyan Zhao, Devaansh Gupta, Qinqing Zheng et al.
Fast Solvers for Discrete Diffusion Models: Theory and Applications of High-Order Algorithms
Yinuo Ren, Haoxuan Chen, Yuchen Zhu et al.
HaDeMiF: Hallucination Detection and Mitigation in Large Language Models
Xiaoling Zhou, Mingjie Zhang, Zhemg Lee et al.
Informed Correctors for Discrete Diffusion Models
Yixiu Zhao, Jiaxin Shi, Feng Chen et al.
Iterative Foundation Model Fine-Tuning on Multiple Rewards
Pouya M. Ghari, Simone Sciabola, Ye Wang
Mixture of Inputs: Text Generation Beyond Discrete Token Sampling
Yufan Zhuang, Liyuan Liu, Chandan Singh et al.
Optimal Control for Transformer Architectures: Enhancing Generalization, Robustness and Efficiency
Kelvin Kan, Xingjian Li, Benjamin Zhang et al.
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
Heming Xia, Yongqi Li, Jun Zhang et al.
Theoretical Benefit and Limitation of Diffusion Language Model
Guhao Feng, Yihan Geng, Jian Guan et al.
Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs
Minh Nguyen, Andrew Baker, Clement Neo et al.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.
A Touch, Vision, and Language Dataset for Multimodal Alignment
Letian Fu, Gaurav Datta, Huang Huang et al.
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
Aaron Lou, Chenlin Meng, Stefano Ermon
From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers
Muhammed Emrullah Ildiz, Yixiao Huang, Yingcong Li et al.
Principled Gradient-Based MCMC for Conditional Sampling of Text
Li Du, Afra Amini, Lucas Torroba Hennigen et al.
Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations
Kaiwen Xue, Yuhao Zhou, Shen Nie et al.