"transformer architectures" Papers
14 papers found
Attention on the Sphere
Boris Bonev, Max Rietmann, Andrea Paris et al.
NeurIPS 2025 · poster · arXiv:2505.11157
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
CVPR 2025 · poster · arXiv:2503.03519
2 citations
EUGens: Efficient, Unified and General Dense Layers
Sang Min Kim, Byeongchan Kim, Arijit Sehanobish et al.
NeurIPS 2025 · poster
Learning in Compact Spaces with Approximately Normalized Transformer
Jörg Franke, Urs Spiegelhalter, Marianna Nezhurina et al.
NeurIPS 2025 · poster · arXiv:2505.22014
Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory
Nikola Zubic, Federico Soldà, Aurelio Sulser et al.
ICLR 2025 · poster · arXiv:2405.16674
17 citations
L-SWAG: Layer-Sample Wise Activation with Gradients Information for Zero-Shot NAS on Vision Transformers
Sofia Casarin, Sergio Escalera, Oswald Lanz
CVPR 2025 · poster · arXiv:2505.07300
2 citations
Optimal Brain Apoptosis
Mingyuan Sun, Zheng Fang, Jiaxu Wang et al.
ICLR 2025 · poster · arXiv:2502.17941
3 citations
Scaling and context steer LLMs along the same computational path as the human brain
Joséphine Raugel, Jérémy Rapin, Stéphane d'Ascoli et al.
NeurIPS 2025 · oral · arXiv:2512.01591
All-in-one simulation-based inference
Manuel Gloeckler, Michael Deistler, Christian Weilbach et al.
ICML 2024 · poster
Controllable Prompt Tuning For Balancing Group Distributional Robustness
Hoang Phan, Andrew Wilson, Qi Lei
ICML 2024 · poster
Improving Token-Based World Models with Parallel Observation Prediction
Lior Cohen, Kaixin Wang, Bingyi Kang et al.
ICML 2024 · poster
Loss Shaping Constraints for Long-Term Time Series Forecasting
Ignacio Hounie, Javier Porras-Valenzuela, Alejandro Ribeiro
ICML 2024 · poster
Outlier-aware Slicing for Post-Training Quantization in Vision Transformer
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
ICML 2024 · poster
Towards Interpretable Deep Local Learning with Successive Gradient Reconciliation
Yibo Yang, Xiaojie Li, Motasem Alfarra et al.
ICML 2024 · poster