2025 "transformer models" Papers

14 papers found

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NeurIPS 2025 (oral) · arXiv:2509.25040
6 citations

Can In-context Learning Really Generalize to Out-of-distribution Tasks?

Qixun Wang, Yifei Wang, Xianghua Ying et al.

ICLR 2025 (poster) · arXiv:2410.09695
15 citations

Geometry of Decision Making in Language Models

Abhinav Joshi, Divyanshu Bhatt, Ashutosh Modi

NeurIPS 2025 (poster) · arXiv:2511.20315

Learning Randomized Algorithms with Transformers

Johannes von Oswald, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 (poster) · arXiv:2408.10818
1 citation

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.

CVPR 2025 (poster) · arXiv:2503.18314
8 citations

Mixture of Parrots: Experts improve memorization more than reasoning

Samy Jelassi, Clara Mohri, David Brandfonbrener et al.

ICLR 2025 (poster) · arXiv:2410.19034
14 citations

Multi-modal brain encoding models for multi-modal stimuli

Subba Reddy Oota, Khushbu Pahwa, Mounika Marreddy et al.

ICLR 2025 (poster) · arXiv:2505.20027
9 citations

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 (poster) · arXiv:2405.16414
5 citations

SelectFormer in Data Markets: Privacy-Preserving and Efficient Data Selection for Transformers with Multi-Party Computation

Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji

ICLR 2025 (poster)

Self-Verifying Reflection Helps Transformers with CoT Reasoning

Zhongwei Yu, Wannian Xia, Xue Yan et al.

NeurIPS 2025 (poster) · arXiv:2510.12157
1 citation

StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs

Qijun Luo, Mengqi Li, Lei Zhao et al.

NeurIPS 2025 (poster) · arXiv:2506.03077
1 citation

Toward Understanding In-context vs. In-weight Learning

Bryan Chan, Xinyi Chen, Andras Gyorgy et al.

ICLR 2025 (poster) · arXiv:2410.23042
14 citations

TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding

Shukai Gong, Yiyang Fu, Fengyuan Ran et al.

NeurIPS 2025 (oral) · arXiv:2507.09252

Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity

Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

ICLR 2025 (poster) · arXiv:2406.14479
16 citations