2025 "transformer models" Papers

14 papers found

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NeurIPS 2025 (oral) · arXiv:2509.25040
6 citations

Can In-context Learning Really Generalize to Out-of-distribution Tasks?

Qixun Wang, Yifei Wang, Xianghua Ying et al.

ICLR 2025 (poster) · arXiv:2410.09695
15 citations

Geometry of Decision Making in Language Models

Abhinav Joshi, Divyanshu Bhatt, Ashutosh Modi

NeurIPS 2025 (poster) · arXiv:2511.20315

Learning Randomized Algorithms with Transformers

Johannes von Oswald, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025 (poster) · arXiv:2408.10818
1 citation

LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty

Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.

CVPR 2025 (poster) · arXiv:2503.18314
8 citations

Mixture of Parrots: Experts improve memorization more than reasoning

Samy Jelassi, Clara Mohri, David Brandfonbrener et al.

ICLR 2025 (poster) · arXiv:2410.19034
14 citations

Multi-modal brain encoding models for multi-modal stimuli

Subba Reddy Oota, Khushbu Pahwa, Mounika Marreddy et al.

ICLR 2025 (poster) · arXiv:2505.20027
9 citations

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 (poster) · arXiv:2405.16414
5 citations

SelectFormer in Data Markets: Privacy-Preserving and Efficient Data Selection for Transformers with Multi-Party Computation

Xu Ouyang, Felix Xiaozhu Lin, Yangfeng Ji

ICLR 2025 (poster)

Self-Verifying Reflection Helps Transformers with CoT Reasoning

Zhongwei Yu, Wannian Xia, Xue Yan et al.

NeurIPS 2025 (poster) · arXiv:2510.12157
1 citation

StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs

Qijun Luo, Mengqi Li, Lei Zhao et al.

NeurIPS 2025 (poster) · arXiv:2506.03077
1 citation

Toward Understanding In-context vs. In-weight Learning

Bryan Chan, Xinyi Chen, Andras Gyorgy et al.

ICLR 2025 (poster) · arXiv:2410.23042
14 citations

TPP-SD: Accelerating Transformer Point Process Sampling with Speculative Decoding

Shukai Gong, Yiyang Fu, Fengyuan Ran et al.

NeurIPS 2025 (oral) · arXiv:2507.09252

Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity

Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

ICLR 2025 (poster) · arXiv:2406.14479
16 citations