ICLR Poster "computational efficiency" Papers
13 papers found
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Zijia Zhao, Longteng Guo, Jie Cheng et al.
ICLR 2025posterarXiv:2410.10456
8
citations
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.
ICLR 2025posterarXiv:2410.18252
39
citations
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation
Tobias Leemann, Periklis Petridis, Giuseppe Vietri et al.
ICLR 2025posterarXiv:2410.03461
3
citations
Context-aware Dynamic Pruning for Speech Foundation Models
Masao Someki, Yifan Peng, Siddhant Arora et al.
ICLR 2025poster
7
citations
DUALFormer: Dual Graph Transformer
Zhuo Jiaming, Yuwei Liu, Yintong Lu et al.
ICLR 2025poster
3
citations
Dynamic Diffusion Transformer
Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.
ICLR 2025posterarXiv:2410.03456
34
citations
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
ICLR 2025posterarXiv:2502.20766
51
citations
Gradient descent with generalized Newton’s method
Zhiqi Bu, Shiyun Xu
ICLR 2025posterarXiv:2407.02772
6
citations
P-SPIKESSM: HARNESSING PROBABILISTIC SPIKING STATE SPACE MODELS FOR LONG-RANGE DEPENDENCY TASKS
Malyaban Bal, Abhronil Sengupta
ICLR 2025posterarXiv:2406.02923
11
citations
RegMix: Data Mixture as Regression for Language Model Pre-training
Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.
ICLR 2025posterarXiv:2407.01492
99
citations
Steering Large Language Models between Code Execution and Textual Reasoning
Yongchao Chen, Harsh Jhamtani, Srinagesh Sharma et al.
ICLR 2025posterarXiv:2410.03524
25
citations
The Journey Matters: Average Parameter Count over Pre-training Unifies Sparse and Dense Scaling Laws
Tian Jin, Ahmed Imtiaz Humayun, Utku Evci et al.
ICLR 2025posterarXiv:2501.12486
1
citations
Variational Bayesian Pseudo-Coreset
Hyungi Lee, Seungyoo Lee, Juho Lee
ICLR 2025posterarXiv:2502.21143