Xi Wang
15 Papers
96 Total Citations
Papers (15)
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
CVPR 2025 (arXiv)
41 citations
Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction Anticipation
CVPR 2024
22 citations
Real Appearance Modeling for More General Deepfake Detection
ECCV 2024
12 citations
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
NeurIPS 2025
8 citations
SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering
AAAI 2025
7 citations
LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model
CVPR 2025
4 citations
Scale-invariant attention
NeurIPS 2025
2 citations
Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description
ICCV 2025
0 citations
DCTMamba: Advancing JPEG Image Restoration Through Long-Sequence Modeling and Adaptive Frequency Strategy
AAAI 2025
0 citations
AKiRa: Augmentation Kit on Rays for Optical Video Generation
CVPR 2025
0 citations
WANDR: Intention-guided Human Motion Generation
CVPR 2024
0 citations
What Do You See in Vehicle? Comprehensive Vision Solution for In-Vehicle Gaze Estimation
CVPR 2024
0 citations
Long-Tail Class Incremental Learning via Independent Sub-prototype Construction
CVPR 2024
0 citations
Understanding Museum Exhibits using Vision-Language Reasoning
ICCV 2025
0 citations
Exploration-Driven Generative Interactive Environments
CVPR 2025
0 citations