Xiao Chen

Papers: 12

Total Citations: 90

Papers (12)

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

From One to More: Contextual Part Latents for 3D Generation

FIRM: Flexible Interactive Reflection ReMoval

Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation

Robust Landmark-Based Stent Tracking in X-Ray Fluoroscopy

DynaBERT: Dynamic BERT with Adaptive Width and Depth

M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus