"model scalability" Papers
4 papers found
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
Ziteng Wang, Jun Zhu, Jianfei Chen
ICLR 2025, poster, arXiv:2412.14711
28 citations
Towards Neural Scaling Laws for Time Series Foundation Models
Qingren Yao, Chao-Han Huck Yang, Renhe Jiang et al.
ICLR 2025, poster, arXiv:2410.12360
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
Peihao Wang, Ruisi Cai, Yuehao Wang et al.
ICLR 2025, poster, arXiv:2501.00658
7 citations
On the Embedding Collapse when Scaling up Recommendation Models
Xingzhuo Guo, Junwei Pan, Ximei Wang et al.
ICML 2024, poster, arXiv:2310.04400