Poster Papers by Xiaoqing Li
2 papers found
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization
Zhijian Zhuo, Yutao Zeng, Ya Wang et al.
NeurIPS 2025, poster
Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models
Zhijian Zhuo, Ya Wang, Yutao Zeng et al.
ICLR 2025, poster, arXiv:2411.03884, 5 citations