Bo Zhang

74

Papers

1,436

Total Citations

2

Affiliations

Xiaomi; Meituan

Shanghai AI Laboratory

Papers (74)

Triple Generative Adversarial Nets

NeurIPS 2017; arXiv

Fair DARTS: Eliminating Unfair Advantages in Differentiable Architecture Search

R1-Onevision: Advancing Generalized Multimodal Reasoning through Cross-Modal Formalization

MLVU: Benchmarking Multi-task Long Video Understanding

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

LogFormer: A Pre-train and Tuning Pipeline for Log Anomaly Detection

LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection

Language-Driven Anchors for Zero-Shot Adversarial Robustness

Shadow Generation for Composite Image Using Diffusion Model

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

ComFusion: Enhancing Personalized Generation by Instance-Scene Compositing and Fusion

JiSAM: Alleviate Labeling Burden and Corner Case Problems in Autonomous Driving via Minimal Real-World Data

Bringing Old Photos Back to Life

Cross-Domain Correspondence Learning for Exemplar-Based Image Translation

MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes

Prototypical Pseudo Label Denoising and Target Structure Learning for Domain Adaptive Semantic Segmentation

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion

StyleSwin: Transformer-Based GAN for High-Resolution Image Generation

Adversarial Texture for Fooling Person Detectors in the Physical World

Bringing Old Films Back to Life

Vector Quantized Diffusion Model for Text-to-Image Synthesis

Delving Into Shape-Aware Zero-Shot Semantic Segmentation

Paint by Example: Exemplar-Based Image Editing With Diffusion Models

RODIN: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

MetaPortrait: Identity-Preserving Talking Head Generation With Fast Personalized Adaptation

Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection

Bi3D: Bi-Domain Active Learning for Cross-Domain 3D Object Detection

Generative Diffusion Prior for Unified Image Restoration and Enhancement

Physically Realizable Natural-Looking Clothing Textures Evade Person Detectors via 3D Modeling

Image Cropping With Spatial-Aware Feature and Rank Consistency

RIDE: Reversal Invariant Descriptor Enhancement

FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search

Let's See Clearly: Contaminant Artifact Removal for Moving Cameras

Make-It-3D: High-fidelity 3D Creation from A Single Image with Diffusion Prior

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation

MixPath: A Unified Approach for One-shot Neural Architecture Search

Foreground Object Search by Distilling Composite Image Feature

UMC: A Unified Bandwidth-efficient and Multi-resolution based Collaborative Perception Framework

Fine-grained Visible Watermark Removal

Training Interpretable Convolutional Neural Networks by Differentiating Class-specific Filters

Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation

Human-Centric Image Cropping with Partition-Aware and Content-Preserving Features

Real-Time Neural Character Rendering with Pose-Guided Multiplane Images

Max-Margin Deep Generative Models

Convolutional Neural Networks with Intra-Layer Recurrent Connections for Scene Labeling

DriveX: Omni Scene Modeling for Learning Generalizable World Knowledge in Autonomous Driving

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations

Chimera: Improving Generalist Model with Domain-Specific Experts

Temporal Overlapping Prediction: A Self-supervised Pre-training Method for LiDAR Moving Object Segmentation

A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation

LiON: Learning Point-Wise Abstaining Penalty for LiDAR Outlier DetectioN Using Diverse Synthetic Data

What Is a Good Question? Assessing Question Quality via Meta-Fact Checking

Norm Tweaking: High-Performance Low-Bit Quantization of Large Language Models

On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm

Improving Interpretability of Deep Neural Networks With Semantic Information

Textbook Question Answering Under Instructor Guidance With Memory Networks

Smooth Neighbors on Teacher Graphs for Semi-Supervised Learning

Interpret Neural Networks by Identifying Critical Data Routing Paths

Blind Geometric Distortion Correction on Images Through Deep Learning

Deep Exemplar-Based Video Colorization

Semi-crowdsourced Clustering with Deep Generative Models

DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning

Graphical Generative Adversarial Networks

Multi-objects Generation with Amortized Structural Regularization

Bi-level Score Matching for Learning Energy-based Latent Variable Models

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Twins: Revisiting the Design of Spatial Attention in Vision Transformers

AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset

Learning to Generate with Memory

Message Passing Stein Variational Gradient Descent