Dawei Leng

Papers: 6

Total Citations: 46

Papers (6)

WISA: World simulator assistant for physics-aware text-to-video generation

NeurIPS 2025, arXiv

PlanGen: Towards Unified Layout Planning and Image Generation in Auto-Regressive Vision Language Models

Bridge Diffusion Model: Bridge Chinese Text-to-Image Diffusion Model with English Communities

Prompt as Knowledge Bank: Boost Vision-language model via Structural Representation for zero-shot medical detection

LMM-Det: Make Large Multimodal Models Excel in Object Detection

IAA: Inner-Adaptor Architecture Empowers Frozen Large Language Model with Multimodal Capabilities