2025 "transformer models" Papers
14 papers found
A multiscale analysis of mean-field transformers in the moderate interaction regime
Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi
NeurIPS 2025 (oral) · arXiv:2509.25040
6 citations
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Qixun Wang, Yifei Wang, Xianghua Ying et al.
ICLR 2025 (poster) · arXiv:2410.09695
15 citations
Geometry of Decision Making in Language Models
Abhinav Joshi, Divyanshu Bhatt, Ashutosh Modi
NeurIPS 2025 (poster) · arXiv:2511.20315
Learning Randomized Algorithms with Transformers
Johannes von Oswald, Seijin Kobayashi, Yassir Akram et al.
ICLR 2025 (poster) · arXiv:2408.10818
1 citation
LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty
Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.
CVPR 2025 (poster) · arXiv:2503.18314
8 citations
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener et al.
ICLR 2025 (poster) · arXiv:2410.19034
14 citations
Multi-modal brain encoding models for multi-modal stimuli
Subba Reddy Oota, Khushbu Pahwa, Mounika Marreddy et al.
ICLR 2025 (poster) · arXiv:2505.20027
9 citations
Robust Message Embedding via Attention Flow-Based Steganography
Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.
CVPR 2025 (poster) · arXiv:2405.16414
5 citations
SelectFormer in Data Markets: Privacy-Preserving and Efficient Data Selection for Transformers with Multi-Party Computation
Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji
ICLR 2025 (poster)
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Zhongwei Yu, Wannian Xia, Xue Yan et al.
NeurIPS 2025 (poster) · arXiv:2510.12157
1 citation
StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs
Qijun Luo, Mengqi Li, Lei Zhao et al.
NeurIPS 2025 (poster) · arXiv:2506.03077
1 citation
Toward Understanding In-context vs. In-weight Learning
Bryan Chan, Xinyi Chen, Andras Gyorgy et al.
ICLR 2025 (poster) · arXiv:2410.23042
14 citations
TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding
Shukai Gong, Yiyang Fu, Fengyuan Ran et al.
NeurIPS 2025 (oral) · arXiv:2507.09252
Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity
Jiachen Jiang, Jinxin Zhou, Zhihui Zhu
ICLR 2025 (poster) · arXiv:2406.14479
16 citations