Xinliang Wang
3 papers
131 total citations
papers (3)
ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions
CVPR 2024, arXiv
131 citations
LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs
ICCV 2025, arXiv
0 citations
Lane Detection Transformer Based on Multi-Frame Horizontal and Vertical Attention and Visual Transformer Module
ECCV 2022
0 citations