Bo Zhang

23
Papers
723
Total Citations
2
Affiliations

Affiliations

Xiaomi; Meituan
Shanghai AI Laboratory

Papers (23)

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

ICCV 2025
247
citations

MLVU: Benchmarking Multi-task Long Video Understanding

CVPR 2025, arXiv
93
citations

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

ICML 2025
88
citations

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

ICCV 2025
52
citations

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

ICLR 2025, arXiv
48
citations

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

AAAI 2024, arXiv
47
citations

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

CVPR 2025, arXiv
42
citations

LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection

ICLR 2024
32
citations

Language-Driven Anchors for Zero-Shot Adversarial Robustness

CVPR 2024
21
citations

Shadow Generation for Composite Image Using Diffusion Model

CVPR 2024
18
citations

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

CVPR 2025, arXiv
13
citations

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

AAAI 2025
9
citations

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

CVPR 2024
9
citations

ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion

ECCV 2024
2
citations

JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data

CVPR 2025
2
citations

On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm

ICML 2024
0
citations

Chimera: Improving Generalist Model with Domain-Specific Experts

ICCV 2025
0
citations

Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation

ICCV 2025
0
citations

DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving

ICCV 2025
0
citations

LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

AAAI 2025
0
citations

What Is a Good Question? Assessing Question Quality via Meta-Fact Checking

AAAI 2025
0
citations

Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models

AAAI 2024
0
citations

A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation

CVPR 2025
0
citations