Xie Chen
9
Papers
173
Total Citations
Papers (9)
ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering
AAAI 2025
64
citations
MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
NeurIPS 2025, arXiv
52
citations
Language Model Can Listen While Speaking
AAAI 2025
47
citations
Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration
AAAI 2025
10
citations
Bitrate-Controlled Diffusion for Disentangling Motion and Content in Video
ICCV 2025
0
citations
BAT: Learning to Reason about Spatial Sounds with Large Language Models
ICML 2024
0
citations
Word-Level Emotional Expression Control in Zero-Shot Text-to-Speech Synthesis
NeurIPS 2025, arXiv
0
citations
VQTalker: Towards Multilingual Talking Avatars Through Facial Motion Tokenization
AAAI 2025
0
citations
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
AAAI 2024, arXiv
0
citations