Poster "transformer architecture" Papers
150 papers found • Page 3 of 3
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
I/O Complexity of Attention, or How Optimal is FlashAttention?
Barna Saha, Christopher Ye
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Junnan Liu, Qianren Mao, Weifeng Jiang et al.
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Zhentao Tan, Yadong Mu
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Guangyan Li, Yongqiang Tang, Wensheng Zhang
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang, Li Shen, Yong Luo et al.
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Hitesh Sapkota, Krishna Neupane, Qi Yu
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
Shuying Huang, Mingyang Ren, Yong Yang et al.
Modeling Language Tokens as Functionals of Semantic Fields
Zhengqi Pei, Anran Zhang, Shuhui Wang et al.
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
Pranav Singh Chib, Achintya Nath, Paritosh Kabra et al.
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Amutheezan Sivagnanam, Ava Pettet, Hunter Lee et al.
OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction
Yini Fang, Jingling Yu, Haozheng Zhang et al.
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
PIDformer: Transformer Meets Control Theory
Tam Nguyen, Cesar Uribe, Tan Nguyen et al.
Polynomial-based Self-Attention for Table Representation Learning
Jayoung Kim, Yehjin Shin, Jeongwhan Choi et al.
Positional Knowledge is All You Need: Position-induced Transformer (PiT) for Operator Learning
Junfeng Chen, Kailiang Wu
Position: Do pretrained Transformers Learn In-Context by Gradient Descent?
Lingfeng Shen, Aayush Mishra, Daniel Khashabi
Position: Stop Making Unscientific AGI Performance Claims
Patrick Altmeyer, Andrew Demetriou, Antony Bartlett et al.
Progressive Pretext Task Learning for Human Trajectory Prediction
Xiaotong Lin, Tianming Liang, Jian-Huang Lai et al.
Prompting a Pretrained Transformer Can Be a Universal Approximator
Aleksandar Petrov, Phil Torr, Adel Bibi
Prototypical Transformer As Unified Motion Learners
Cheng Han, Yawen Lu, Guohao Sun et al.
Recurrent Early Exits for Federated Learning with Heterogeneous Clients
Royson Lee, Javier Fernandez-Marques, Xu Hu et al.
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade et al.
Rethinking Decision Transformer via Hierarchical Reinforcement Learning
Yi Ma, Jianye Hao, Hebin Liang et al.
Rethinking Transformers in Solving POMDPs
Chenhao Lu, Ruizhe Shi, Yuyao Liu et al.
SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Romain Ilbert, Ambroise Odonnat, Vasilii Feofanov et al.
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson, Stefan Baumann, Alex Birch et al.
Self-Attention through Kernel-Eigen Pair Sparse Variational Gaussian Processes
Yingyi Chen, Qinghua Tao, Francesco Tonin et al.
Slot Abstractors: Toward Scalable Abstract Visual Reasoning
Shanka Subhra Mondal, Jonathan Cohen, Taylor Webb
SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution
Mingjun Zheng, Long Sun, Jiangxin Dong et al.
SpikeZIP-TF: Conversion is All You Need for Transformer-based SNN
Kang You, Zekai Xu, Chen Nie et al.
Surface-VQMAE: Vector-quantized Masked Auto-encoders on Molecular Surfaces
Fang Wu, Stan Z. Li
Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts
Byeongjun Park, Hyojun Go, Jin-Young Kim et al.
Text-Conditioned Resampler For Long Form Video Understanding
Bruno Korbar, Yongqin Xian, Alessio Tonioni et al.
The Illusion of State in State-Space Models
William Merrill, Jackson Petty, Ashish Sabharwal
The Pitfalls of Next-Token Prediction
Gregor Bachmann, Vaishnavh Nagarajan
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.
Towards Efficient Spiking Transformer: A Token Sparsification Framework for Training and Inference Acceleration
Zhengyang Zhuge, Peisong Wang, Xingting Yao et al.
Towards General Algorithm Discovery for Combinatorial Optimization: Learning Symbolic Branching Policy from Bipartite Graph
Yufei Kuang, Jie Wang, Yuyan Zhou et al.
Towards Understanding Inductive Bias in Transformers: A View From Infinity
Itay Lavie, Guy Gur-Ari, Zohar Ringel
Towards Understanding the Word Sensitivity of Attention Layers: A Study via Random Features
Simone Bombari, Marco Mondelli
Trainable Transformer in Transformer
Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia et al.
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Tri Dao, Albert Gu
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim, Taiji Suzuki
Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention
Zhen Qin, Weigao Sun, Dong Li et al.
Viewing Transformers Through the Lens of Long Convolutions Layers
Itamar Zimerman, Lior Wolf
Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing
Haijin Zeng, Hiep Luong, Wilfried Philips
What Can Transformer Learn with Varying Depth? Case Studies on Sequence Learning Tasks
Xingwu Chen, Difan Zou
When Fast Fourier Transform Meets Transformer for Image Restoration
Xingyu Jiang, Xiuhui Zhang, Ning Gao et al.
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You, Yichao Fu, Zheng Wang et al.