Zhenfei Yin
6 Papers
179 Total Citations
Papers (6)
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
ECCV 2024, arXiv
92 citations
MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception
CVPR 2024
76 citations
RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints
ICCV 2025
11 citations
SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Models
CVPR 2025
0 citations
VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior
ICCV 2025
0 citations
B-VLLM: A Vision Large Language Model with Balanced Spatio-Temporal Tokens
ICCV 2025
0 citations