Poster Papers: "attention mechanism"

272 papers found • Page 4 of 6

StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text

Roberto Henschel, Levon Khachatryan, Hayk Poghosyan et al.

CVPR 2025 • poster • arXiv:2403.14773
157 citations

Stronger, Steadier & Superior: Geometric Consistency in Depth VFM Forges Domain Generalized Semantic Segmentation

Siyu Chen, Ting Han, Changshe Zhang et al.

ICCV 2025 • poster • arXiv:2504.12753
1 citation

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 • poster • arXiv:2502.06415
15 citations

TAB: Transformer Attention Bottlenecks enable User Intervention and Debugging in Vision-Language Models

Pooyan Rahmanzadehgervi, Hung Nguyen, Rosanne Liu et al.

ICCV 2025 • poster • arXiv:2412.18675
1 citation

Text to Sketch Generation with Multi-Styles

Tengjie Li, Shikui Tu, Lei Xu

NEURIPS 2025 • poster • arXiv:2511.04123

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.

NEURIPS 2025 • poster • arXiv:2503.14376
8 citations

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem et al.

ICLR 2025 • poster • arXiv:2410.23168
10 citations

Token Statistics Transformer: Linear-Time Attention via Variational Rate Reduction

Ziyang Wu, Tianjiao Ding, Yifu Lu et al.

ICLR 2025 • poster • arXiv:2412.17810
26 citations

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NEURIPS 2025 • poster • arXiv:2505.17771
4 citations

Towards Interpretability Without Sacrifice: Faithful Dense Layer Decomposition with Mixture of Decoders

James Oldfield, Shawn Im, Sharon Li et al.

NEURIPS 2025 • poster • arXiv:2505.21364

Training-free and Adaptive Sparse Attention for Efficient Long Video Generation

Yifei Xia, Suhan Ling, Fangcheng Fu et al.

ICCV 2025 • poster • arXiv:2502.21079
32 citations

Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM

Yizhou Huang, Yihua Cheng, Kezhi Wang

CVPR 2025 • poster • arXiv:2503.10898
9 citations

Transformer Learns Optimal Variable Selection in Group-Sparse Classification

Chenyang Zhang, Xuran Meng, Yuan Cao

ICLR 2025 • poster • arXiv:2504.08638
4 citations

Transformers Learn Faster with Semantic Focus

Parikshit Ram, Kenneth Clarkson, Tim Klinger et al.

NEURIPS 2025 • poster • arXiv:2506.14095

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

Jianhao Huang, Zixuan Wang, Jason Lee

ICLR 2025 • poster • arXiv:2502.21212
18 citations

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Zhu Xu, Ting Lei, Zhimin Li et al.

ICCV 2025 • poster • arXiv:2508.04943

Tropical Attention: Neural Algorithmic Reasoning for Combinatorial Algorithms

Baran Hashemi, Kurt Pasque, Chris Teska et al.

NEURIPS 2025 • poster • arXiv:2505.17190
4 citations

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

Haoxuan Wang, Jinlong Peng, Qingdong He et al.

ICCV 2025 • poster • arXiv:2503.09277
16 citations

Unleashing Foundation Vision Models: Adaptive Transfer for Diverse Data-Limited Scientific Domains

Qiankun Li, Feng He, Huabao Chen et al.

NEURIPS 2025 • poster • arXiv:2512.22664
1 citation

URWKV: Unified RWKV Model with Multi-state Perspective for Low-light Image Restoration

Rui Xu, Yuzhen Niu, Yuezhou Li et al.

CVPR 2025 • poster • arXiv:2505.23068
4 citations

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025 • poster • arXiv:2508.05211
3 citations

VideoGrain: Modulating Space-Time Attention for Multi-Grained Video Editing

Xiangpeng Yang, Linchao Zhu, Hehe Fan et al.

ICLR 2025 • poster • arXiv:2502.17258
31 citations

Video Motion Transfer with Diffusion Transformers

Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov et al.

CVPR 2025 • poster • arXiv:2412.07776
18 citations

Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models

Jinhui Yi, Syed Talal Wasim, Yanan Luo et al.

CVPR 2025 • poster • arXiv:2412.18609
1 citation

Video Summarization with Large Language Models

Min Jung Lee, Dayoung Gong, Minsu Cho

CVPR 2025 • poster • arXiv:2504.11199
8 citations

ViewPoint: Panoramic Video Generation with Pretrained Diffusion Models

Zixun Fang, Kai Zhu, Zhiheng Liu et al.

NEURIPS 2025 • poster • arXiv:2506.23513

ViSpec: Accelerating Vision-Language Models with Vision-Aware Speculative Decoding

Jialiang Kang, Han Shu, Wenshuo Li et al.

NEURIPS 2025 • poster • arXiv:2509.15235
2 citations

ViT-EnsembleAttack: Augmenting Ensemble Models for Stronger Adversarial Transferability in Vision Transformers

Hanwen Cao, Haobo Lu, Xiaosen Wang et al.

ICCV 2025 • poster • arXiv:2508.12384
1 citation

What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

Weronika Ormaniec, Felix Dangel, Sidak Pal Singh

ICLR 2025 • poster • arXiv:2410.10986
10 citations

What We Miss Matters: Learning from the Overlooked in Point Cloud Transformers

Yi Wang, Jiaze Wang, Ziyu Guo et al.

NEURIPS 2025 • poster

Why RoPE Struggles to Maintain Long-Term Decay in Long Sequences?

Wei Shen, Chao Yin, Yuliang Liu et al.

ICLR 2025 • poster

ADMap: Anti-disturbance Framework for Vectorized HD Map Construction

Haotian Hu, Fanyi Wang, Yaonong Wang et al.

ECCV 2024 • poster
5 citations

A Fixed-Point Approach for Causal Generative Modeling

Meyer Scetbon, Joel Jennings, Agrin Hilmkil et al.

ICML 2024 • poster • arXiv:2404.06969

Agent Attention: On the Integration of Softmax and Linear Attention

Dongchen Han, Tianzhu Ye, Yizeng Han et al.

ECCV 2024 • poster • arXiv:2312.08874
206 citations

Algorithm and Hardness for Dynamic Attention Maintenance in Large Language Models

Jan van den Brand, Zhao Song, Tianyi Zhou

ICML 2024 • poster • arXiv:2304.02207

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

Wei Chen, Long Chen, Yu Wu

ECCV 2024 • poster • arXiv:2408.01120
16 citations

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Liang Chen, Haozhe Zhao, Tianyu Liu et al.

ECCV 2024 • poster • arXiv:2403.06764
343 citations

Anytime Continual Learning for Open Vocabulary Classification

Zhen Zhu, Yiming Gong, Derek Hoiem

ECCV 2024 • poster • arXiv:2409.08518
8 citations

Attention Meets Post-hoc Interpretability: A Mathematical Perspective

Gianluigi Lopardo, Frederic Precioso, Damien Garreau

ICML 2024 • poster • arXiv:2402.03485

AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers

Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer et al.

ICML 2024 • poster • arXiv:2402.05602

AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios

Zhongzhan Huang, Mingfu Liang, Shanshan Zhong et al.

ICML 2024 • poster • arXiv:2302.10184

Bifurcated Attention for Single-Context Large-Batch Sampling

Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda et al.

ICML 2024 • poster

CHAI: Clustered Head Attention for Efficient LLM Inference

Saurabh Agarwal, Bilge Acun, Basil Hosmer et al.

ICML 2024 • poster • arXiv:2403.08058

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Shenhao Zhu, Junming Chen, Zuozhuo Dai et al.

ECCV 2024 • poster • arXiv:2403.14781
233 citations

Compositional Capabilities of Autoregressive Transformers: A Study on Synthetic, Interpretable Tasks

Rahul Ramesh, Ekdeep Singh Lubana, Mikail Khona et al.

ICML 2024 • poster • arXiv:2311.12997

CountFormer: Multi-View Crowd Counting Transformer

Hong Mo, Xiong Zhang, Jianchao Tan et al.

ECCV 2024 • poster • arXiv:2407.02047
9 citations

Delving into Differentially Private Transformer

Youlong Ding, Xueyang Wu, Yining Meng et al.

ICML 2024 • poster • arXiv:2405.18194

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.

ECCV 2024 • poster • arXiv:2407.03575
24 citations

Do text-free diffusion models learn discriminative visual representations?

Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.

ECCV 2024 • poster • arXiv:2311.17921
26 citations

DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion

Liao Shen, Tianqi Liu, Huiqiang Sun et al.

ECCV 2024 • poster • arXiv:2409.09605
13 citations