Xiaodan Liang

23
Papers
255
Total Citations

Papers (23)

DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection

CVPR 2024
45
citations

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

CVPR 2025, arXiv
44
citations

Making Large Language Models Better Planners with Reasoning-Decision Alignment

ECCV 2024
35
citations

WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation

NeurIPS 2025
33
citations

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis

CVPR 2024
20
citations

MLP Can Be A Good Transformer Learner

CVPR 2024
20
citations

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

CVPR 2025
15
citations

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

CVPR 2025
13
citations

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

CVPR 2025
12
citations

RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving

ICCV 2025
11
citations

PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition

AAAI 2024
3
citations

S2-Track: A Simple yet Strong Approach for End-to-End 3D Multi-Object Tracking

ICML 2025
2
citations

Monocular 3D Hand Mesh Recovery via Dual Noise Estimation

AAAI 2024, arXiv
2
citations

Affordances-Oriented Planning Using Foundation Models for Continuous Vision-Language Navigation

AAAI 2025
0
citations

RoboPearls: Editable Video Simulation for Robot Manipulation

ICCV 2025
0
citations

MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval

AAAI 2025
0
citations

3D Visibility-Aware Generalizable Neural Radiance Fields for Interacting Hands

AAAI 2024
0
citations

Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model

AAAI 2024
0
citations

DreamFit: Garment-Centric Human Generation via a Lightweight Anything-Dressing Encoder

AAAI 2025
0
citations

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

ICCV 2025
0
citations

A₀: An Affordance-Aware Hierarchical Model for General Robotic Manipulation

ICCV 2025
0
citations

Holistic Autonomous Driving Understanding by Bird’s-Eye-View Injected Multi-Modal Large Models

CVPR 2024
0
citations

BEV-TSR: Text-Scene Retrieval in BEV Space for Autonomous Driving

AAAI 2025
0
citations