Lin Song

Papers: 14

Total Citations: 6

Papers (14)

HaploVL: A Single-Transformer Baseline for Multi-Modal Understanding

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs

UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection

Learning Dynamic Routing for Semantic Segmentation

End-to-End Object Detection With Fully Convolutional Network

BoxSnake: Polygonal Instance Segmentation with Box Supervision

YOLO-World: Real-Time Open-Vocabulary Object Detection

Learnable Tree Filter for Structure-preserving Feature Transform

Rethinking Learnable Tree Filter for Generic Feature Transform

Fine-Grained Dynamic Head for Object Detection

Dynamic Grained Encoder for Vision Transformers

Meta-Adapter: An Online Few-shot Learner for Vision-Language Model

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction