2024 "transformer architecture" Papers
93 papers found • Page 2 of 2
OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction
Yini Fang, Jingling Yu, Haozheng Zhang et al.
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
PIDformer: Transformer Meets Control Theory
Tam Nguyen, Cesar Uribe, Tan Nguyen et al.
Polynomial-based Self-Attention for Table Representation Learning
Jayoung Kim, Yehjin Shin, Jeongwhan Choi et al.
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
Junfeng Chen, Kailiang Wu
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Lingfeng Shen, Aayush Mishra, Daniel Khashabi
Position: Stop Making Unscientific AGI Performance Claims
Patrick Altmeyer, Andrew Demetriou, Antony Bartlett et al.
Prompting a Pretrained Transformer Can Be a Universal Approximator
Aleksandar Petrov, Phil Torr, Adel Bibi
Prototypical Transformer As Unified Motion Learners
Cheng Han, Yawen Lu, Guohao Sun et al.
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
Royson Lee, Javier Fernandez-Marques, Xu Hu et al.
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade et al.
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Yi Ma, Jianye Hao, Hebin Liang et al.
Rethinking Transformers in Solving POMDPs
Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov et al.
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson, Stefan Baumann, Alex Birch et al.
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Yingyi Chen, Qinghua Tao, Francesco Tonin et al.
SelfPromer: Self-Prompt Dehazing Transformers with Depth-Consistency
Feiyu Zhu, Reid Simmons
SeTformer Is What You Need for Vision and Language
Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger et al.
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
Shanka Subhra Mondal, Jonathan Cohen, Taylor Webb
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You, Zekai Xu, Chen Nie et al.
Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces
Fang Wu, Stan Z. Li
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Byeongjun Park, Hyojun Go, Jin-Young Kim et al.
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.
The Illusion of State in State-Space Models
William Merrill, Jackson Petty, Ashish Sabharwal
The Pitfalls of Next-Token Prediction
Gregor Bachmann, Vaishnavh Nagarajan
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.
Towards Efficient Spiking Transformer: a Token Sparsification Framework for Training and Inference Acceleration
Zhengyang Zhuge, Peisong Wang, Xingting Yao et al.
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
Yufei Kuang, Jie Wang, Yuyan Zhou et al.
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie, Guy Gur-Ari, Zohar Ringel
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari, Marco Mondelli
Trainable Transformer in Transformer
Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia et al.
Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Jinsong Shi, Pan Gao, Jie Qin
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Tri Dao, Albert Gu
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim, Taiji Suzuki
Translation Equivariant Transformer Neural Processes
Matthew Ashman, Cristiana Diaconu, Junhyuck Kim et al.
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Haixu Wu, Huakun Luo, Haowen Wang et al.
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin, Weigao Sun, Dong Li et al.
Viewing Transformers Through the Lens of Long Convolutions Layers
Itamar Zimerman, Lior Wolf
VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning
Tangfei Liao, Xiaoqin Zhang, Li Zhao et al.
Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing
Haijin Zeng, Hiep Luong, Wilfried Philips
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen, Difan Zou
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You, Yichao Fu, Zheng Wang et al.
X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer
Linglin Jing, Ying Xue, Xu Yan et al.