Poster Papers by Pengle Zhang
2 papers found
SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization
Jintao Zhang, Haofeng Huang, Pengle Zhang et al.
ICML 2025, poster, arXiv:2411.10958
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration
Jintao Zhang, Jia Wei, Pengle Zhang et al.
ICLR 2025, poster, arXiv:2410.02367