Most Cited 2025 Highlight Papers by ZIJIA CHEN
3 papers found
#1
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong, Yonggan Fu, Shizhe Diao et al.
ICLR 2025, poster, arXiv:2411.13676
55 citations
#2
Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training
Shizhe Diao, Yu Yang, Yonggan Fu et al.
NeurIPS 2025, spotlight, arXiv:2504.13161
20 citations
#3
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning
Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan et al.
NeurIPS 2025, poster, arXiv:2504.11409
6 citations