Boqiang Zhang
5
Papers
45
Total Citations
Papers (5)
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
40
citations
CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness
NeurIPS 2025
3
citations
SynTab-LLaVA: Enhancing Multimodal Table Understanding with Decoupled Synthesis
CVPR 2025
2
citations
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
CVPR 2025
0
citations
Choose What You Need: Disentangled Representation Learning for Scene Text Recognition Removal and Editing
CVPR 2024
0
citations