ICLR "rotary position embedding" Papers
3 papers found
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
ICLR 2025, poster, arXiv:2504.12637
2 citations
Wavelet-based Positional Representation for Long Context
Yui Oka, Taku Hasegawa, Kyosuke Nishida et al.
ICLR 2025, poster, arXiv:2502.02004
2 citations
Why RoPE Struggles to Maintain Long-Term Decay in Long Sequences?
Wei Shen, Chao Yin, Yuliang Liu et al.
ICLR 2025, poster