ICLR "transformer-based models" Papers
3 papers found
From Promise to Practice: Realizing High-performance Decentralized Training
Zesen Wang, Jiaojiao Zhang, Xuyang Wu et al.
ICLR 2025posterarXiv:2410.11998
2
citations
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.
ICLR 2025posterarXiv:2404.15574
140
citations
SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting
Hui Chen, Viet Luong, Lopamudra Mukherjee et al.
ICLR 2025oral