Ning Zhang
Papers: 7
Total Citations: 22
Papers (7)
M2Doc: A Multi-Modal Fusion Approach for Document Layout Analysis
AAAI 2024
14
citations
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
CVPR 2025
7
citations
Deep Video Inverse Tone Mapping Based on Temporal Clues
CVPR 2024
1
citation
Apollo: An Exploration of Video Understanding in Large Multimodal Models
CVPR 2025
0
citations
Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation
CVPR 2025
0
citations
Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction
CVPR 2025
0
citations
Planning, Fast and Slow: Online Reinforcement Learning with Action-Free Offline Data via Multiscale Planners
ICML 2024
0
citations