Poster Papers: "attention mechanism"
281 papers found • Page 5 of 6
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition
Feng Lu, Xiangyuan Lan, Lijun Zhang et al.
Delving into Differentially Private Transformer
Youlong Ding, Xueyang Wu, Yining Meng et al.
DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification
Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.
Do text-free diffusion models learn discriminative visual representations?
Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.
DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization
Jisu Nam, Heesu Kim, DongJae Lee et al.
DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion
Liao Shen, Tianqi Liu, Huiqiang Sun et al.
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
Peiqi Chen, Lei Yu, Yi Wan et al.
Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models
Yang Zhang, Tze Tzun Teoh, Wei Hern Lim et al.
EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning
Jongsuk Kim, Hyeongkeun Lee, Kyeongha Rho et al.
Eureka-Moments in Transformers: Multi-Step Tasks Reveal Softmax Induced Optimization Problems
David T. Hoffmann, Simon Schrodi, Jelena Bratulić et al.
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
Yuwen Pan, Rui Sun, Naisong Luo et al.
FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance
Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.
Free-Editor: Zero-shot Text-driven 3D Scene Editing
Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.
GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction
Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang et al.
Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates
Zhenqiao Song, Yunlong Zhao, Wenxian Shi et al.
Graph External Attention Enhanced Transformer
Jianqing Liang, Min Chen, Jiye Liang
Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning
Pengyu Li, Biao Wang, Tianchu Guo et al.
Grounded Text-to-Image Synthesis with Attention Refocusing
Quynh Phung, Songwei Ge, Jia-Bin Huang
HandDAGT: A Denoising Adaptive Graph Transformer for 3D Hand Pose Estimation
Wencan Cheng, Eun-Ji Kim, Jong Hwan Ko
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, Zhiwei Lin, Xinhao Wang et al.
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
Yu Dai, Junchen Shen, Zijie Zhai et al.
How Smooth Is Attention?
Valérie Castin, Pierre Ablin, Gabriel Peyré
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
In-context Convergence of Transformers
Yu Huang, Yuan Cheng, Yingbin Liang
In-Context Language Learning: Architectures and Algorithms
Ekin Akyürek, Bailin Wang, Yoon Kim et al.
In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation
Shiqi Chen, Miao Xiong, Junteng Liu et al.
InfoNet: Neural Estimation of Mutual Information without Test-Time Optimization
Zhengyang Hu, Song Kang, Qunsong Zeng et al.
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang Zhao
I/O Complexity of Attention, or How Optimal is FlashAttention?
Barna Saha, Christopher Ye
Iterative Search Attribution for Deep Neural Networks
Zhiyu Zhu, Huaming Chen, Xinyi Wang et al.
KnowFormer: Revisiting Transformers for Knowledge Graph Reasoning
Junnan Liu, Qianren Mao, Weifeng Jiang et al.
Language-Assisted Skeleton Action Understanding for Skeleton-Based Temporal Action Segmentation
Haoyu Ji, Bowen Chen, Xinglong Xu et al.
Large Motion Model for Unified Multi-Modal Motion Generation
Mingyuan Zhang, Daisheng Jin, Chenyang Gu et al.
LeaPformer: Enabling Linear Transformers for Autoregressive and Simultaneous Tasks via Learned Proportions
Victor Agostinelli III, Sanghyun Hong, Lizhong Chen
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
Zhentao Tan, Yadong Mu
Learning with Unmasked Tokens Drives Stronger Vision Learners
Taekyung Kim, Sanghyuk Chun, Byeongho Heo et al.
MagicEraser: Erasing Any Objects via Semantics-Aware Control
Fan Li, Zixiao Zhang, Yi Huang et al.
Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression
Dingyuan Zhang, Dingkang Liang, Zichang Tan et al.
Memory Efficient Neural Processes via Constant Memory Attention Block
Leo Feng, Frederick Tung, Hossein Hajimirsadeghi et al.
Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas
Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli et al.
Meta Evidential Transformer for Few-Shot Open-Set Recognition
Hitesh Sapkota, Krishna Neupane, Qi Yu
Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers
Zhiyu Yao, Jian Wang, Haixu Wu et al.
MultiMax: Sparse and Multi-Modal Attention Learning
Yuxuan Zhou, Mario Fritz, Margret Keuper
Multi-Person Pose Forecasting with Individual Interaction Perceptron and Prior Learning
Peng Xiao, Yi Xie, Xuemiao Xu et al.
OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework
Wanyun Li, Pinxue Guo, Xinyu Zhou et al.
Outlier-Efficient Hopfield Layers for Large Transformer-Based Models
Jerry Yao-Chieh Hu, Pei-Hsuan Chang, Haozheng Luo et al.
Parameter-Efficient Fine-Tuning with Controls
Chi Zhang, Jingpu Cheng, Yanyu Xu et al.
Paying More Attention to Images: A Training-Free Method for Alleviating Hallucination in LVLMs
Shi Liu, Kecheng Zheng, Wei Chen
PIDformer: Transformer Meets Control Theory
Tam Nguyen, Cesar Uribe, Tan Nguyen et al.
PinNet: Pinpoint Instructive Information for Retrieval Augmented Code-to-Text Generation
Han Fu, Jian Tan, Pinhan Zhang et al.