Xiao Chen

8 Papers

92 Total Citations

Papers (8)

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

From One to More: Contextual Part Latents for 3D Generation

FIRM: Flexible Interactive Reflection ReMoval

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Label-Efficient Data Augmentation with Video Diffusion Models for Guidewire Segmentation in Cardiac Fluoroscopy

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes