Boqing Gong

51
Papers
630
Total Citations

Papers (51)

Language Model Beats Diffusion - Tokenizer is key to visual generation

ICLR 2024
525
citations

Improved Dropout for Shallow and Deep Learning

NeurIPS 2016 (arXiv)
83
citations

Distilling Vision-Language Models on Millions of Videos

CVPR 2024
20
citations

HypDAE: Hyperbolic Diffusion Autoencoders for Hierarchical Few-shot Image Generation

ICCV 2025
1
citation

BabyVLM: Data-Efficient Pretraining of VLMs Inspired by Infant Learning

ICCV 2025
1
citation

VideoPrism: A Foundational Visual Encoder for Video Understanding

ICML 2024
0
citations

Learning Attributes Equals Multi-Source Domain Generalization

CVPR 2016
0
citations

Synthesized Classifiers for Zero-Shot Learning

CVPR 2016
0
citations

Fast Zero-Shot Image Tagging

CVPR 2016
0
citations

Query-Focused Video Summarization: Dataset, Evaluation, and a Memory Network Based Approach

CVPR 2017 (arXiv)
0
citations

Improving Facial Attribute Prediction Using Semantic Segmentation

CVPR 2017 (arXiv)
0
citations

Geometry Guided Convolutional Neural Networks for Self-Supervised Video Representation Learning

CVPR 2018
0
citations

Deep Face Detector Adaptation Without Negative Transfer or Catastrophic Forgetting

CVPR 2018
0
citations

End-to-End Learning of Motion Representation for Video Understanding

CVPR 2018 (arXiv)
0
citations

Large-Scale Long-Tailed Recognition in an Open World

CVPR 2019
0
citations

Not All Frames Are Equal: Weakly-Supervised Video Grounding With Contextual Similarity and Visual Clustering Losses

CVPR 2019
0
citations

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model

CVPR 2020 (arXiv)
0
citations

Adversarial Examples Improve Image Recognition

CVPR 2020 (arXiv)
0
citations

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective

CVPR 2020 (arXiv)
0
citations

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

CVPR 2020 (arXiv)
0
citations

Open Compound Domain Adaptation

CVPR 2020 (arXiv)
0
citations

Ranking Neural Checkpoints

CVPR 2021 (arXiv)
0
citations

Robust and Accurate Object Detection via Adversarial Learning

CVPR 2021 (arXiv)
0
citations

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

CVPR 2021 (arXiv)
0
citations

Adversarially Adaptive Normalization for Single Domain Generalization

CVPR 2021 (arXiv)
0
citations

Spatiotemporal Contrastive Video Representation Learning

CVPR 2021 (arXiv)
0
citations

MoViNets: Mobile Video Networks for Efficient Video Recognition

CVPR 2021 (arXiv)
0
citations

Contextualized Spatio-Temporal Contrastive Learning With Self-Supervision

CVPR 2022 (arXiv)
0
citations

On Calibrating Semantic Segmentation Models: Analyses and an Algorithm

CVPR 2023 (arXiv)
0
citations

VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation

ICCV 2017 (arXiv)
0
citations

Curriculum Domain Adaptation for Semantic Segmentation of Urban Scenes

ICCV 2017 (arXiv)
0
citations

Domain Randomization and Pyramid Consistency: Simulation-to-Real Generalization Without Accessing Target Domain Data

ICCV 2019
0
citations

A Fast and Accurate One-Stage Approach to Visual Grounding

ICCV 2019
0
citations

Constructing Self-Motivated Pyramid Curriculums for Cross-Domain Semantic Segmentation: A Non-Adversarial Approach

ICCV 2019
0
citations

A Lazy Approach to Long-Horizon Gradient-Based Meta-Learning

ICCV 2021
0
citations

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection

ICCV 2021 (arXiv)
0
citations

Unified Visual Relationship Detection with Vision and Language Models

ICCV 2023 (arXiv)
0
citations

Improving Object Detection with Selective Self-Supervised Self-Training

ECCV 2020
0
citations

Anti-Neuron Watermarking: Protecting Personal Data against Unauthorized Neural Networks

ECCV 2022
0
citations

LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds

ECCV 2022
0
citations

Contrastive Learning for Label Efficient Semantic Segmentation

ICCV 2021 (arXiv)
0
citations

Attention to Neural Plagiarism: Diffusion Models Can Plagiarize Your Copyrighted Images!

ICCV 2025
0
citations

VideoAds for Fast-Paced Video Understanding

ICCV 2025
0
citations

SITE: towards Spatial Intelligence Thorough Evaluation

ICCV 2025
0
citations

On Discrete Prompt Optimization for Diffusion Models

ICML 2024
0
citations

Synthesized Policies for Transfer and Adaptation across Tasks and Environments

NeurIPS 2018
0
citations

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation

NeurIPS 2021
0
citations

VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

NeurIPS 2021
0
citations

Video Timeline Modeling For News Story Understanding

NeurIPS 2023
0
citations

Module-wise Adaptive Distillation for Multimodality Foundation Models

NeurIPS 2023
0
citations

NATTACK: Learning the Distributions of Adversarial Examples for an Improved Black-Box Attack on Deep Neural Networks

ICML 2019
0
citations