Xu Sun

Papers: 4

Total Citations: 424

Papers (4)

TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

VITATECS: A Diagnostic Dataset for Temporal Concept Understanding of Video-Language Models

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

VidTwin: Video VAE with Decoupled Structure and Dynamics