"tensor parallelism" Papers
2 papers found
EasySpec: Layer-Parallel Speculative Decoding for Efficient Multi-GPU Utilization
Yize Wu, Ke Gao, Ling Li et al.
NEURIPS 2025 poster, arXiv:2502.02493
1 citation
First Attentions Last: Better Exploiting First Attentions for Efficient Parallel Training
Gyudong Kim, Hyukju Na, Jin Kim et al.
NEURIPS 2025 poster