"transformer architecture" Papers
137 papers found • Page 2 of 3
A Comparative Study of Image Restoration Networks for General Backbone Network Design
Xiangyu Chen, Zheyuan Li, Yuandong Pu et al.
ALERT-Transformer: Bridging Asynchronous and Synchronous Machine Learning for Real-Time Event-based Spatio-Temporal Data
Carmen Martin-Turrero, Maxence Bouvier, Manuel Breitenstein et al.
An Incremental Unified Framework for Small Defect Inspection
Jiaqi Tang, Hao Lu, Xiaogang Xu et al.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification
Jiaer Xia, Lei Tan, Pingyang Dai et al.
Attention Meets Post-hoc Interpretability: A Mathematical Perspective
Gianluigi Lopardo, Frederic Precioso, Damien Garreau
Auctionformer: A Unified Deep Learning Algorithm for Solving Equilibrium Strategies in Auction Games
Kexin Huang, Ziqian Chen, Xue Wang et al.
AVSegFormer: Audio-Visual Segmentation with Transformer
Shengyi Gao, Zhe Chen, Guo Chen et al.
Breaking through the learning plateaus of in-context learning in Transformer
Jingwen Fu, Tao Yang, Yuwang Wang et al.
Bridging the Gap between 2D and 3D Visual Question Answering: A Fusion Approach for 3D VQA
Wentao Mo, Yang Liu
CarFormer: Self-Driving with Learned Object-Centric Representations
Shadi Hamdan, Fatma Guney
Converting Transformers to Polynomial Form for Secure Inference Over Homomorphic Encryption
Itamar Zimerman, Moran Baruch, Nir Drucker et al.
Correlation Matching Transformation Transformers for UHD Image Restoration
Cong Wang, Jinshan Pan, Wei Wang et al.
Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control
Zheng Xiong, Risto Vuorio, Jacob Beck et al.
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
Exploring Transformer Extrapolation
Zhen Qin, Yiran Zhong, Hui Deng
Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang, Bailin Wang, Yikang Shen et al.
GeoMFormer: A General Architecture for Geometric Molecular Representation Learning
Tianlang Chen, Shengjie Luo, Di He et al.
Graph External Attention Enhanced Transformer
Jianqing Liang, Min Chen, Jiye Liang
GridFormer: Point-Grid Transformer for Surface Reconstruction
Shengtao Li, Ge Gao, Yudong Liu et al.
HDformer: A Higher-Dimensional Transformer for Detecting Diabetes Utilizing Long-Range Vascular Signals
Ella Lan
How do Transformers Perform In-Context Autoregressive Learning?
Michael Sander, Raja Giryes, Taiji Suzuki et al.
How Smooth Is Attention?
Valérie Castin, Pierre Ablin, Gabriel Peyré
How to Protect Copyright Data in Optimization of Large Language Models?
Timothy Chu, Zhao Song, Chiwun Yang
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
HybridGait: A Benchmark for Spatial-Temporal Cloth-Changing Gait Recognition with Hybrid Explorations
Yilan Dong, Chunlin Yu, Ruiyang Ha et al.
Improving Transformers with Dynamically Composable Multi-Head Attention
Da Xiao, Qingye Meng, Shengping Li et al.
In-Context Freeze-Thaw Bayesian Optimization for Hyperparameter Optimization
Herilalaina Rakotoarison, Steven Adriaensen, Neeratyoy Mallik et al.
In-Context Language Learning: Architectures and Algorithms
Ekin Akyürek, Bailin Wang, Yoon Kim et al.
In-context Learning on Function Classes Unveiled for Transformers
Zhijie Wang, Bo Jiang, Shuai Li
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
I/O Complexity of Attention, or How Optimal is FlashAttention?
Barna Saha, Christopher Ye
Jointly Modeling Spatio-Temporal Features of Tactile Signals for Action Classification
Jimmy Lin, Junkai Li, Jiasi Gao et al.
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Junnan Liu, Qianren Mao, Weifeng Jiang et al.
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Zhentao Tan, Yadong Mu
Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer
Toru Shirakawa, Yi Li, Yulun Wu et al.
LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models
Guangyan Li, Yongqiang Tang, Wensheng Zhang
MASTER: Market-Guided Stock Transformer for Stock Price Forecasting
Tong Li, Zhaoyang Liu, Yanyan Shen et al.
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang, Li Shen, Yong Luo et al.
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Hitesh Sapkota, Krishna Neupane, Qi Yu
MFTN: A Multi-scale Feature Transfer Network Based on IMatchFormer for Hyperspectral Image Super-Resolution
Shuying Huang, Mingyang Ren, Yong Yang et al.
Modeling Language Tokens as Functionals of Semantic Fields
Zhengqi Pei, Anran Zhang, Shuhui Wang et al.
MS-TIP: Imputation Aware Pedestrian Trajectory Prediction
Pranav Singh Chib, Achintya Nath, Paritosh Kabra et al.
Multi-Agent Reinforcement Learning with Hierarchical Coordination for Emergency Responder Stationing
Amutheezan Sivagnanam, Ava Pettet, Hunter Lee et al.
Neural Reasoning about Agents’ Goals, Preferences, and Actions
Matteo Bortoletto, Lei Shi, Andreas Bulling
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision
Ishan Rajendrakumar Dave, Simon Jenni, Mubarak Shah
OAT: Object-Level Attention Transformer for Gaze Scanpath Prediction
Yini Fang, Jingling Yu, Haozheng Zhang et al.
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
PIDformer: Transformer Meets Control Theory
Tam Nguyen, Cesar Uribe, Tan Nguyen et al.
Polynomial-based Self-Attention for Table Representation Learning
Jayoung Kim, Yehjin Shin, Jeongwhan Choi et al.