Xiu Li

54
Papers
1,000
Total Citations

Papers (54)

Disentangled Non-local Neural Networks

ECCV 2020
366
citations

Follow Your Pose: Pose-Guided Text-to-Video Generation Using Pose-Free Videos

AAAI 2024, arXiv
276
citations

Taming Rectified Flow for Inversion and Editing

ICML 2025
110
citations

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives

CVPR 2024
55
citations

MultiBooth: Towards Generating All Your Concepts in an Image from Text

AAAI 2025
46
citations

Dora: Sampling and Benchmarking for 3D Shape Variational Auto-Encoders

CVPR 2025
45
citations

Densely Connected Parameter-Efficient Tuning for Referring Image Segmentation

AAAI 2025
20
citations

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

CVPR 2025
20
citations

MagicArticulate: Make Your 3D Models Articulation-Ready

CVPR 2025
16
citations

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

CVPR 2025
12
citations

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

ICCV 2025
10
citations

SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning

NeurIPS 2025
8
citations

AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward

CVPR 2025, arXiv
5
citations

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

CVPR 2025
4
citations

InterSyn: Interleaved Learning for Dynamic Motion Synthesis in the Wild

ICCV 2025
2
citations

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

NeurIPS 2025
2
citations

FAFA: Frequency-Aware Flow-Aided Self-Supervision for Underwater Object Pose Estimation

ECCV 2024
2
citations

A Plug-and-Play Physical Motion Restoration Approach for In-the-Wild High-Difficulty Motions

ICCV 2025
1
citation

A Self-Boosting Framework for Automated Radiographic Report Generation

CVPR 2021
0
citations

FLAG3D: A 3D Fitness Activity Dataset With Language Instruction

CVPR 2023, arXiv
0
citations

Camouflaged Object Detection With Feature Decomposition and Edge Reconstruction

CVPR 2023
0
citations

Neighborhood Preserving Hashing for Scalable Video Retrieval

ICCV 2019
0
citations

Frequency-Aware Spatiotemporal Transformers for Video Inpainting Detection

ICCV 2021
0
citations

Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution

ICCV 2021, arXiv
0
citations

Degradation-Resistant Unfolding Network for Heterogeneous Image Fusion

ICCV 2023
0
citations

FineDance: A Fine-grained Choreography Dataset for 3D Full Body Dance Generation

ICCV 2023, arXiv
0
citations

BoxSnake: Polygonal Instance Segmentation with Box Supervision

ICCV 2023, arXiv
0
citations

Neural Capture of Animatable 3D Human from Monocular Video

ECCV 2022
0
citations

ScalableViT: Rethinking the Context-Oriented Generalization of Vision Transformer

ECCV 2022
0
citations

4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras

CVPR 2020, arXiv
0
citations

MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

CVPR 2025
0
citations

Hunyuan-Portrait: Implicit Condition Control for Enhanced Portrait Animation

CVPR 2025
0
citations

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

ICCV 2025
0
citations

REPARO: Compositional 3D Assets Generation with Differentiable 3D Layout Alignment

ICCV 2025
0
citations

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

AAAI 2025
0
citations

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

AAAI 2024, arXiv
0
citations

Cross-Modal Match for Language Conditioned 3D Object Grounding

AAAI 2024
0
citations

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

CVPR 2024
0
citations

Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

CVPR 2024
0
citations

Cross-Domain Policy Adaptation by Capturing Representation Mismatch

ICML 2024
0
citations

Exploration and Anti-Exploration with Distributional Random Network Distillation

ICML 2024
0
citations

PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation

ICML 2024
0
citations

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

ICML 2024
0
citations

Joint Training of Cascaded CNN for Face Detection

CVPR 2016
0
citations

Scale-Aware Face Detection

CVPR 2017, arXiv
0
citations

Structure From Recurrent Motion: From Rigidity to Recurrency

CVPR 2018, arXiv
0
citations

Self-Supervised Video Hashing via Bidirectional Transformers

CVPR 2021
0
citations

Mildly Conservative Q-Learning for Offline Reinforcement Learning

NeurIPS 2022
0
citations

OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression

NeurIPS 2022
0
citations

Double Check Your State Before Trusting It: Confidence-Aware Bidirectional Offline Model-Based Imagination

NeurIPS 2022
0
citations

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

NeurIPS 2023
0
citations

Weakly-Supervised Concealed Object Segmentation with SAM-based Pseudo Labeling and Multi-scale Feature Grouping

NeurIPS 2023
0
citations

MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy

NeurIPS 2023
0
citations

GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction

NeurIPS 2023
0
citations