Poster "attention mechanism" Papers
272 papers found • Page 3 of 6
MAESTRO: Masked Encoding Set Transformer with Self-Distillation
Matthew Lee, Jaesik Kim, Matei Ionita et al.
MambaIRv2: Attentive State Space Restoration
Hang Guo, Yong Guo, Yaohua Zha et al.
Mamba Modulation: On the Length Generalization of Mamba Models
Peng Lu, Jerry Huang, Qiuhao Zeng et al.
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu, Xinchao Wang
MeCeFO: Enhancing LLM Training Robustness via Fault-Tolerant Optimization
Rizhen Hu, Yutong He, Ran Yan et al.
MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models
Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.
Mimic In-Context Learning for Multimodal Tasks
Yuchu Jiang, Jiale Fu, Chenduo Hao et al.
Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules
Yueqi Zhang, Peiwen Yuan, Yiwei Li et al.
Mixture of Attentions For Speculative Decoding
Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.
MoEdit: On Learning Quantity Perception for Multi-object Image Editing
Yanfeng Li, Ka-Hou Chan, Yue Sun et al.
MoE-Gyro: Self-Supervised Over-Range Reconstruction and Denoising for MEMS Gyroscopes
Feiyang Pan, Shenghe Zheng, Chunyan Yin et al.
MoFo: Empowering Long-term Time Series Forecasting with Periodic Pattern Modeling
Jiaming Ma, Binwu Wang, Qihe Huang et al.
MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models
Yifan Liu, Keyu Fan, Weihao Yu et al.
Multi-Kernel Correlation-Attention Vision Transformer for Enhanced Contextual Understanding and Multi-Scale Integration
Hongkang Zhang, Shao-Lun Huang, Ercan Kuruoglu et al.
Multi-party Collaborative Attention Control for Image Customization
Han Yang, Chuanguang Yang, Qiuli Wang et al.
Multipole Attention for Efficient Long Context Reasoning
Coleman Hooper, Sebastian Zhao, Luca Manolache et al.
Multi-turn Consistent Image Editing
Zijun Zhou, Yingying Deng, Xiangyu He et al.
MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation
Aviral Chharia, Wenbo Gou, Haoye Dong
MVTrajecter: Multi-View Pedestrian Tracking with Trajectory Motion Cost and Trajectory Appearance Cost
Taiga Yamane, Ryo Masumura, Satoshi Suzuki et al.
Neural Attention Search
Difan Deng, Marius Lindauer
Neural networks on Symmetric Spaces of Noncompact Type
Xuan Son Nguyen, Yang, Aymeric Histace
O-MaMa: Learning Object Mask Matching between Egocentric and Exocentric Views
Lorenzo Mur-Labadia, Maria Santos-Villafranca, Jesus Bermudez-cameo et al.
One-Minute Video Generation with Test-Time Training
Jiarui Xu, Shihao Han, Karan Dalal et al.
On the Optimization and Generalization of Multi-head Attention
Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.
PatchGuard: Adversarially Robust Anomaly Detection and Localization through Vision Transformers and Pseudo Anomalies
Mojtaba Nafez, Amirhossein Koochakian, Arad Maleki et al.
Pinpointing Attention-Causal Communication in Language Models
Gabriel Franco, Mark Crovella
PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Kwanyoung Kim, Byeongsu Sim
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng, Yadan Luo, Xin Li et al.
Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity
Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.
Principles of Visual Tokens for Efficient Video Understanding
Xinyue Hao, Li, Shreyank Gowda et al.
RANK++LETR: Learn to Rank and Optimize Candidates for Line Segment Detection
Xin Tong, Baojie Tian, Yufei Guo et al.
RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling
Xiuying Wei, Anunay Yadav, Razvan Pascanu et al.
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Jinhong Deng, Yuhang Yang, Wen Li et al.
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.
Rethinking the role of frames for SE(3)-invariant crystal structure modeling
Yusei Ito, Tatsunori Taniai, Ryo Igarashi et al.
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
Di Liu, Meng Chen, Baotong Lu et al.
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.
Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology
Wenhao Tang, Rong Qin, Heng Fang et al.
Robust Message Embedding via Attention Flow-Based Steganography
Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.
SAS: Simulated Attention Score
Chuanyang Zheng, Jiankai Sun, Yihang Gao et al.
Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.
SeerAttention: Self-distilled Attention Gating for Efficient Long-context Prefilling
Yizhao Gao, Zhichen Zeng, Dayou Du et al.
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang, Jinyeong Kim, Junhyeok Kim et al.
Selective Attention Improves Transformer
Yaniv Leviathan, Matan Kalman, Yossi Matias
Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens
Qihang Fan, Huaibo Huang, Mingrui Chen et al.
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.
SpiLiFormer: Enhancing Spiking Transformers with Lateral Inhibition
Zeqi Zheng, Yanchen Huang, Yingchao Yu et al.
STAA-SNN: Spatial-Temporal Attention Aggregator for Spiking Neural Networks
Tianqing Zhang, Kairong Yu, Xian Zhong et al.
StarTrail: Concentric Ring Sequence Parallelism for Efficient Near-Infinite-Context Transformer Model Training
Ziming Liu, Shaoyu Wang, Shenggan Cheng et al.