"language model efficiency" Papers
2 papers found
Jet-Nemotron: Efficient Language Model with Post Neural Architecture Search
Yuxian Gu, Qinghao Hu, Haocheng Xi et al.
NeurIPS 2025 poster, arXiv:2508.15884
15 citations
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
Mehul Damani, Idan Shenfeld, Andi Peng et al.
ICLR 2025 poster, arXiv:2410.04707
45 citations