NeurIPS 2025 "speculative decoding" Papers
9 papers found
Approximately Aligned Decoding
Daniel Melcer, Sujan Kumar Gonugondla, Pramuditha Perera et al.
NeurIPS 2025 poster arXiv:2410.01103
2 citations
EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
Yize Wu, KE GAO, Ling Li et al.
NeurIPS 2025 poster arXiv:2502.02493
1 citation
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
Shijing Hu, Jingyang Li, Xingyu Xie et al.
NeurIPS 2025 poster arXiv:2502.11018
3 citations
SpecEM: Training-Free LLM Ensembling via Iterative Drafting, Verification, and Online Feedback
Bo Lv, Nayu Liu, Chen Tang et al.
NeurIPS 2025 poster
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning
Rui Pan, Yinwei Dai, Zhihao Zhang et al.
NeurIPS 2025 poster arXiv:2504.07891
35 citations
Speculative Jacobi-Denoising Decoding for Accelerating Autoregressive Text-to-image Generation
Yao Teng, Fu-Yun Wang, Xian Liu et al.
NeurIPS 2025 poster arXiv:2510.08994
Towards Better & Faster Autoregressive Image Generation: From the Perspective of Entropy
Xiaoxiao Ma, Feng Zhao, Pengyang Ling et al.
NeurIPS 2025 poster arXiv:2510.09012
3 citations
TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding
Shukai Gong, YIYANG FU, Fengyuan Ran et al.
NeurIPS 2025 oral arXiv:2507.09252
ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding
Jialiang Kang, Han Shu, Wenshuo Li et al.
NeurIPS 2025 poster arXiv:2509.15235
2 citations