"token-level routing" Papers
2 papers found
Conference
CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing
Wenhao Zheng, Yixiao Chen, Weitong Zhang et al.
COLM 2025, arXiv:2502.01976
24 citations
HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts
Mengqi Liao, Wei Chen, Junfeng Shen et al.
ICLR 2025
8 citations