Xiu Li

30 Papers
629 Total Citations

Papers (30)

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos

AAAI 2024 · arXiv
276 citations

Taming Rectified Flow for Inversion and Editing

ICML 2025 · arXiv
110 citations

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

CVPR 2024 · arXiv
55 citations

MultiBooth: Towards Generating All Your Concepts in an Image from Text

AAAI 2025 · arXiv
46 citations

Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

CVPR 2025 · arXiv
45 citations

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

CVPR 2025
20 citations

Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation

AAAI 2025 · arXiv
20 citations

MagicArticulate: Make Your 3D Models Articulation-Ready

CVPR 2025
16 citations

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

CVPR 2025 · arXiv
12 citations

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

ICCV 2025 · arXiv
10 citations

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

NeurIPS 2025 · arXiv
8 citations

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

CVPR 2025 · arXiv
4 citations

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

NeurIPS 2025 · arXiv
2 citations

InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild

ICCV 2025 · arXiv
2 citations

FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation

ECCV 2024 · arXiv
2 citations

A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions

ICCV 2025 · arXiv
1 citation

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

ICCV 2025
0 citations

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

ICCV 2025
0 citations

Hunyuan-Portrait: Implicit Condition Control for Enhanced Portrait Animation

CVPR 2025
0 citations

MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

CVPR 2025 · arXiv
0 citations

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

AAAI 2024 · arXiv
0 citations

Cross-Modal Match for Language Conditioned 3D Object Grounding

AAAI 2024
0 citations

Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

CVPR 2024 · arXiv
0 citations

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

CVPR 2025 · arXiv
0 citations

PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation

ICML 2024
0 citations

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

ICML 2024
0 citations

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

AAAI 2025
0 citations

Cross-Domain Policy Adaptation by Capturing Representation Mismatch

ICML 2024
0 citations

Exploration and Anti-Exploration with Distributional Random Network Distillation

ICML 2024
0 citations

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

CVPR 2024
0 citations