Ziyu Guo
10 Papers
204 Total Citations
Papers (10)
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency
ICML 2025
88
citations
Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
AAAI 2024, arXiv
58
citations
Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos
NeurIPS 2025, arXiv
29
citations
No Time to Train: Empowering Non-Parametric Networks for Few-shot 3D Scene Segmentation
CVPR 2024
27
citations
StyleMotif: Multi-Modal Motion Stylization using Style-Content Cross Fusion
ICCV 2025
2
citations
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
AAAI 2025
0
citations
Let's Verify and Reinforce Image Generation Step by Step
CVPR 2025
0
citations
Less is More: Improving Motion Diffusion Models with Sparse Keyframes
ICCV 2025
0
citations
EchoTraffic: Enhancing Traffic Anomaly Understanding with Audio-Visual Insights
CVPR 2025
0
citations
LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding
AAAI 2025
0
citations