Poster papers matching "transformer models"
12 papers found
Can In-context Learning Really Generalize to Out-of-distribution Tasks?
Qixun Wang, Yifei Wang, Xianghua Ying et al.
ICLR 2025 poster · arXiv:2410.09695 · 15 citations
Geometry of Decision Making in Language Models
Abhinav Joshi, Divyanshu Bhatt, Ashutosh Modi
NeurIPS 2025 poster · arXiv:2511.20315
LoTUS: Large-Scale Machine Unlearning with a Taste of Uncertainty
Christoforos N. Spartalis, Theodoros Semertzidis, Efstratios Gavves et al.
CVPR 2025 poster · arXiv:2503.18314 · 8 citations
Mixture of Parrots: Experts improve memorization more than reasoning
Samy Jelassi, Clara Mohri, David Brandfonbrener et al.
ICLR 2025 poster · arXiv:2410.19034 · 14 citations
Self-Verifying Reflection Helps Transformers with CoT Reasoning
Zhongwei Yu, Wannian Xia, Xue Yan et al.
NeurIPS 2025 poster · arXiv:2510.12157 · 1 citation
Case-Based or Rule-Based: How Do Transformers Do the Math?
Yi Hu, Xiaojuan Tang, Haotong Yang et al.
ICML 2024 poster
Delving into Differentially Private Transformer
Youlong Ding, Xueyang Wu, Yining Meng et al.
ICML 2024 poster
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu, Zhanpeng Zeng, Li Zhang et al.
ICML 2024 poster
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman, Andrew Lampinen, Lucas Dixon et al.
ICML 2024 poster
Learning Associative Memories with Gradient Descent
Vivien Cabannes, Berfin Simsek, Alberto Bietti
ICML 2024 poster
MoMo: Momentum Models for Adaptive Learning Rates
Fabian Schaipp, Ruben Ohana, Michael Eickenberg et al.
ICML 2024 poster
Towards an Understanding of Stepwise Inference in Transformers: A Synthetic Graph Navigation Model
Mikail Khona, Maya Okawa, Jan Hula et al.
ICML 2024 poster