NEURIPS Spotlight "inference acceleration" Papers
2 papers found
LeMiCa: Lexicographic Minimax Path Caching for Efficient Diffusion-Based Video Generation
Huanlin Gao, Ping Chen, Fuyuan Shi et al.
NEURIPS 2025 | spotlight | arXiv:2511.00090
1 citation
TransMLA: Migrating GQA Models to MLA with Full DeepSeek Compatibility and Speedup
Fanxu Meng, Pingzhi Tang, Zengwei Yao et al.
NEURIPS 2025 | spotlight