Zihan Wang
17
Papers
963
Total Citations
Papers (17)
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
CVPR 2025
858
citations
Re-thinking Temporal Search for Long-Form Video Understanding
CVPR 2025
36
citations
Implicit bias of SGD in $L_2$-regularized linear DNNs: One-way jumps from high to low rank
ICLR 2024
23
citations
Reducing Tool Hallucination via Reliability Alignment
ICML 2025
19
citations
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks
CVPR 2025
8
citations
Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge
NeurIPS 2025
7
citations
Variational Regularized Unbalanced Optimal Transport: Single Network, Least Action
NeurIPS 2025
7
citations
MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion
ICCV 2025
4
citations
Auxiliary Prompt Tuning of Vision-Language Models for Few-Shot Out-of-Distribution Detection
ICCV 2025
1
citations
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation
CVPR 2024
0
citations
Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition
CVPR 2024
0
citations
CogAgent: A Visual Language Model for GUI Agents
CVPR 2024
0
citations
KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation
CVPR 2023
0
citations
PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
CVPR 2023
0
citations
GridMM: Grid Memory Map for Vision-and-Language Navigation
ICCV 2023
0
citations
M$^4$I: Multi-modal Models Membership Inference
NeurIPS 2022
0
citations
Breadcrumbs to the Goal: Goal-Conditioned Exploration from Human-in-the-Loop Feedback
NeurIPS 2023
0
citations