"fp8 quantization" Papers
2 papers found
FALQON: Accelerating LoRA Fine-tuning with Low-Bit Floating-Point Arithmetic
Kanghyun Choi, Hyeyoon Lee, Sunjong Park et al.
NeurIPS 2025 · arXiv:2510.24061
FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion
Akide Liu, Zeyu Zhang, Zhexin Li et al.
NeurIPS 2025 (Spotlight) · arXiv:2506.04648
8 citations