Jianwei Yang
11
Papers
562
Total Citations
Papers (11)
Segment and Recognize Anything at Any Granularity
ECCV 2024, arXiv
226
citations
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
ECCV 2024, arXiv
114
citations
Matryoshka Multimodal Models
ICLR 2025, arXiv
58
citations
Visual In-Context Prompting
CVPR 2024
52
citations
VCoder: Versatile Vision Encoders for Multimodal Large Language Models
CVPR 2024
48
citations
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding
ICML 2025
44
citations
Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion
CVPR 2025
15
citations
Pix2Gif: Motion-Guided Diffusion for GIF Generation
ECCV 2024, arXiv
5
citations
SITE: towards Spatial Intelligence Thorough Evaluation
ICCV 2025
0
citations
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
CVPR 2025
0
citations
Magma: A Foundation Model for Multimodal AI Agents
CVPR 2025
0
citations