Li Dong

Papers: 19

Total Citations: 1,376

Affiliations: 1

Affiliations

Microsoft Research

Papers (19)

Grounding Multimodal Large Language Models to the World

BioCLIP: A Vision Foundation Model for the Tree of Life

Imagine While Reasoning in Space: Multimodal Visualization-of-Thought

Think Only When You Need with Large Hybrid-Reasoning Models

NeurIPS 2025 (arXiv)

Self-Boosting Large Language Models with Synthetic Preference Data

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

Learning Robust Image Watermarking with Lossless Cover Recovery

Swin Transformer V2: Scaling Up Capacity and Resolution

Image as a Foreign Language: BEiT Pretraining for Vision and Vision-Language Tasks

Generic-to-Specific Distillation of Masked Autoencoders

Non-Contrastive Learning Meets Language-Image Pre-Training

Unified Language Model Pre-training for Natural Language Understanding and Generation

MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers

VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts

On the Representation Collapse of Sparse Mixture of Experts

Extensible Prompts for Language Models on Zero-shot Language Style Customization

Optimizing Prompts for Text-to-Image Generation

Language Is Not All You Need: Aligning Perception with Language Models

Augmenting Language Models with Long-Term Memory