Siyuan Huang
27
Papers
364
Total Citations
Papers (27)
GUIOdyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices
ICCV 2025
96
citations
Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance
CVPR 2024
78
citations
EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation
NeurIPS 2025
34
citations
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
ICLR 2025
26
citations
Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation
ICCV 2025 (arXiv)
24
citations
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
ECCV 2024
22
citations
Decompositional Neural Scene Reconstruction with Generative Diffusion Prior
CVPR 2025
18
citations
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
CVPR 2025
17
citations
Neural-Symbolic Recursive Machine for Systematic Generalization
ICLR 2024
14
citations
TACO: Taming Diffusion for in-the-wild Video Amodal Completion
ICCV 2025
9
citations
SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
NeurIPS 2025
8
citations
Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing
ICCV 2025
7
citations
InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
CVPR 2025
6
citations
Object-Centric Prompt-Driven Vision-Language-Action Model for Robotic Manipulation
CVPR 2025 (arXiv)
4
citations
PrimHOI: Compositional Human-Object Interaction via Reusable Primitives
ICCV 2025
1
citation
Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding
CVPR 2025
0
citations
Scaling Up Dynamic Human-Scene Interaction Modeling
CVPR 2024
0
citations
PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI
CVPR 2024
0
citations
An Embodied Generalist Agent in 3D World
ICML 2024
0
citations
ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning
CVPR 2025
0
citations
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
ICML 2024
0
citations
GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill
CVPR 2025
0
citations
Dynamic Motion Blending for Versatile Motion Editing
CVPR 2025
0
citations
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
ICCV 2025
0
citations
METASCENES: Towards Automated Replica Creation for Real-world 3D Scans
CVPR 2025
0
citations
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes
CVPR 2025
0
citations
AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents
CVPR 2024
0
citations