Zihan Wang
12
Papers
963
Total Citations
Papers (12)
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
858
citations
Re-thinking Temporal Search for Long-Form Video Understanding
CVPR 2025
36
citations
Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank
ICLR 2024
23
citations
Reducing Tool Hallucination via Reliability Alignment
ICML 2025
19
citations
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks
CVPR 2025
8
citations
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
NeurIPS 2025
7
citations
Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge
NeurIPS 2025
7
citations
MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion
ICCV 2025
4
citations
Auxiliary Prompt Tuning of Vision-Language Models for Few-Shot Out-of-Distribution Detection
ICCV 2025
1
citations
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation
CVPR 2024
0
citations
Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition
CVPR 2024
0
citations
CogAgent: A Visual Language Model for GUI Agents
CVPR 2024
0
citations