Li Dong

19
Papers
1,376
Total Citations
1
Affiliation

Affiliations

Microsoft Research

Papers (19)

Grounding Multimodal Large Language Models to the World

ICLR 2024
1,032
citations

BioCLIP: A Vision Foundation Model for the Tree of Life

CVPR 2024
165
citations

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought

ICML 2025
115
citations

Think Only When You Need with Large Hybrid-Reasoning Models

NeurIPS 2025, arXiv
35
citations

Self-Boosting Large Language Models with Synthetic Preference Data

ICLR 2025
29
citations

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

ECCV 2020
0
citations

Learning Robust Image Watermarking with Lossless Cover Recovery

ICCV 2025
0
citations

Swin Transformer V2: Scaling Up Capacity and Resolution

CVPR 2022, arXiv
0
citations

Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks

CVPR 2023
0
citations

Generic-to-Specific Distillation of Masked Autoencoders

CVPR 2023, arXiv
0
citations

Non-Contrastive Learning Meets Language-Image Pre-Training

CVPR 2023, arXiv
0
citations

Unified Language Model Pre-training for Natural Language Understanding and Generation

NeurIPS 2019
0
citations

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

NeurIPS 2020
0
citations

VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts

NeurIPS 2022
0
citations

On the Representation Collapse of Sparse Mixture of Experts

NeurIPS 2022
0
citations

Extensible Prompts for Language Models on Zero-shot Language Style Customization

NeurIPS 2023
0
citations

Optimizing Prompts for Text-to-Image Generation

NeurIPS 2023
0
citations

Language Is Not All You Need: Aligning Perception with Language Models

NeurIPS 2023
0
citations

Augmenting Language Models with Long-Term Memory

NeurIPS 2023
0
citations