NeurIPS 2025 "gemm kernel optimization" Papers
2 papers found
CodeGEMM: A Codebook-Centric Approach to Efficient GEMM in Quantized LLMs
Gunho Park, Jeongin Bae, Byeongwook Kim et al.
NeurIPS 2025posterarXiv:2512.17970
NestedFP: High-Performance, Memory-Efficient Dual-Precision Floating Point Support for LLMs
Haeun Lee, Omin Kwon, Yeonhong Park et al.
NeurIPS 2025posterarXiv:2506.02024
1
citations