ECCV "attention mechanism" Papers

30 papers found

ADMap: Anti-disturbance Framework for Vectorized HD Map Construction

Haotian Hu, Fanyi Wang, Yaonong Wang et al.

ECCV 2024poster
5
citations

Agent Attention: On the Integration of Softmax and Linear Attention

Dongchen Han, Tianzhu Ye, Yizeng Han et al.

ECCV 2024posterarXiv:2312.08874
206
citations

An Efficient and Effective Transformer Decoder-Based Framework for Multi-Task Visual Grounding

Wei Chen, Long Chen, Yu Wu

ECCV 2024posterarXiv:2408.01120
16
citations

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Liang Chen, Haozhe Zhao, Tianyu Liu et al.

ECCV 2024posterarXiv:2403.06764
343
citations

Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Shenhao Zhu, Junming Chen, Zuozhuo Dai et al.

ECCV 2024posterarXiv:2403.14781
233
citations

CountFormer: Multi-View Crowd Counting Transformer

Hong Mo, Xiong Zhang, Jianchao Tan et al.

ECCV 2024posterarXiv:2407.02047
9
citations

DGR-MIL: Exploring Diverse Global Representation in Multiple Instance Learning for Whole Slide Image Classification

Wenhui Zhu, Xiwen Chen, Peijie Qiu et al.

ECCV 2024posterarXiv:2407.03575
24
citations

Do text-free diffusion models learn discriminative visual representations?

Soumik Mukhopadhyay, Matthew Gwilliam, Yosuke Yamaguchi et al.

ECCV 2024posterarXiv:2311.17921
26
citations

DreamMover: Leveraging the Prior of Diffusion Models for Image Interpolation with Large Motion

Liao Shen, Tianqi Liu, Huiqiang Sun et al.

ECCV 2024posterarXiv:2409.09605
13
citations

Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models

Yang Zhang, Tze Tzun Teoh, Wei Hern Lim et al.

ECCV 2024posterarXiv:2403.06381
19
citations

Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation

Yuwen Pan, Rui Sun, Naisong Luo et al.

ECCV 2024posterarXiv:2408.13838
5
citations

FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance

Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.

ECCV 2024posterarXiv:2407.05578
7
citations

Free-Editor: Zero-shot Text-driven 3D Scene Editing

Md Nazmul Karim, Hasan Iqbal, Umar Khalid et al.

ECCV 2024posterarXiv:2312.13663
14
citations

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction

Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang et al.

ECCV 2024posterarXiv:2405.17429
95
citations

Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning

Pengyu Li, Biao Wang, Tianchu Guo et al.

ECCV 2024poster

HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras

Zhongyu Xia, ZhiWei Lin, Xinhao Wang et al.

ECCV 2024posterarXiv:2404.02517
19
citations

InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping

Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang ZHAO

ECCV 2024posterarXiv:2308.08543
18
citations

Large Motion Model for Unified Multi-Modal Motion Generation

Mingyuan Zhang, Daisheng Jin, Chenyang Gu et al.

ECCV 2024posterarXiv:2404.01284
61
citations

Learning with Unmasked Tokens Drives Stronger Vision Learners

Taekyung Kim, Sanghyuk Chun, Byeongho Heo et al.

ECCV 2024posterarXiv:2310.13593
3
citations

Make Your ViT-based Multi-view 3D Detectors Faster via Token Compression

Dingyuan Zhang, Dingkang Liang, Zichang Tan et al.

ECCV 2024posterarXiv:2409.00633
4
citations

Merging and Splitting Diffusion Paths for Semantically Coherent Panoramas

Fabio Quattrini, Vittorio Pippi, Silvia Cascianelli et al.

ECCV 2024posterarXiv:2408.15660
6
citations

OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework

Wanyun Li, Pinxue Guo, Xinyu Zhou et al.

ECCV 2024posterarXiv:2403.08682
11
citations

PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

Tongkun Guan, Chengyu Lin, Wei Shen et al.

ECCV 2024posterarXiv:2407.07764
16
citations

Relation DETR: Exploring Explicit Position Relation Prior for Object Detection

Xiuquan Hou, Meiqin Liu, Senlin Zhang et al.

ECCV 2024posterarXiv:2407.11699
61
citations

RPBG: Towards Robust Neural Point-based Graphics in the Wild

Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng et al.

ECCV 2024posterarXiv:2405.05663
5
citations

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Tim Salzmann, Markus Ryll, Alex Bewley et al.

ECCV 2024posterarXiv:2403.14270
8
citations

SpecFormer: Guarding Vision Transformer Robustness via Maximum Singular Value Penalization

Xixu Hu, Runkai Zheng, Jindong Wang et al.

ECCV 2024posterarXiv:2402.03317
5
citations

Stripe Observation Guided Inference Cost-free Attention Mechanism

Zhongzhan Huang, Shanshan Zhong, Wushao Wen et al.

ECCV 2024poster
1
citations

TexGen: Text-Guided 3D Texture Generation with Multi-view Sampling and Resampling

Dong Huo, Zixin Guo, Xinxin Zuo et al.

ECCV 2024posterarXiv:2408.01291
19
citations

Wavelength-Embedding-guided Filter-Array Transformer for Spectral Demosaicing

haijin zeng, Hiep Luong, Wilfried Philips

ECCV 2024poster
1
citations