2025 "communication overhead" Papers
5 papers found
Accelerating Parallel Diffusion Model Serving with Residual Compression
Jiajun Luo, Yicheng Xiao, Jianru Xu et al.
NEURIPS 2025 (oral) · arXiv:2507.17511
Block-Diagonal LoRA for Eliminating Communication Overhead in Tensor Parallel LoRA Serving
Xinyu Wang, Jonas M. Kübler, Kailash Budhathoki et al.
NEURIPS 2025 (poster) · arXiv:2510.23346
Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism
Kunyun Wang, Bohan Li, Kai Yu et al.
NEURIPS 2025 (poster) · arXiv:2505.14741
1 citation
DUO: No Compromise to Accuracy Degradation
Jinda Jia, Cong Xie, Hanlin Lu et al.
NEURIPS 2025 (poster)
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
Gyudong Kim, Hyukju Na, Jin Kim et al.
NEURIPS 2025 (poster)