Xiaodan Liang
23
Papers
255
Total Citations
Papers (23)
DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection
CVPR 2024
45
citations
EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions
CVPR 2025, arXiv
44
citations
Making Large Language Models Better Planners with Reasoning-Decision Alignment
ECCV 2024
35
citations
WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation
NeurIPS 2025
33
citations
AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis
CVPR 2024
20
citations
MLP Can Be A Good Transformer Learner
CVPR 2024
20
citations
FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model
CVPR 2025
15
citations
HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models
CVPR 2025
13
citations
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
CVPR 2025
12
citations
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving
ICCV 2025
11
citations
PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition
AAAI 2024
3
citations
S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking
ICML 2025
2
citations
Monocular 3D Hand Mesh Recovery via Dual Noise Estimation
AAAI 2024, arXiv
2
citations
Affordances-Oriented Planning Using Foundation Models for Continuous Vision-Language Navigation
AAAI 2025
0
citations
RoboPearls: Editable Video Simulation for Robot Manipulation
ICCV 2025
0
citations
MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval
AAAI 2025
0
citations
3D Visibility-Aware Generalizable Neural Radiance Fields for Interacting Hands
AAAI 2024
0
citations
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
AAAI 2024
0
citations
DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder
AAAI 2025
0
citations
RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation
ICCV 2025
0
citations
A₀: An Affordance-Aware Hierarchical Model for General Robotic Manipulation
ICCV 2025
0
citations
Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models
CVPR 2024
0
citations
BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving
AAAI 2025
0
citations