Yue Cao

41
Papers
375
Total Citations

Papers (41)

Disentangled Non-local Neural Networks

ECCV 2020
366
citations

SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

CVPR 2025
9
citations

CapsFusion: Rethinking Image-Text Data at Scale

CVPR 2024
0
citations

Deep Visual-Semantic Quantization for Efficient Image Retrieval

CVPR 2017
0
citations

Deep Cauchy Hashing for Hamming Space Retrieval

CVPR 2018
0
citations

HashGAN: Deep Learning to Hash With Pair Conditional Wasserstein GAN

CVPR 2018
0
citations

Memory Enhanced Global-Local Aggregation for Video Object Detection

CVPR 2020 (arXiv)
0
citations

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

CVPR 2021 (arXiv)
0
citations

Cross-Iteration Batch Normalization

CVPR 2021 (arXiv)
0
citations

Learning Physics-Based Full-Body Human Reaching and Grasping from Brief Walking References

CVPR 2025
0
citations

SimMIM: A Simple Framework for Masked Image Modeling

CVPR 2022 (arXiv)
0
citations

Incorporating Semi-Supervised and Positive-Unlabeled Learning for Boosting Full Reference Image Quality Assessment

CVPR 2022 (arXiv)
0
citations

Correlation-Aware Deep Tracking

CVPR 2022 (arXiv)
0
citations

Video Swin Transformer

CVPR 2022 (arXiv)
0
citations

On Data Scaling in Masked Image Modeling

CVPR 2023 (arXiv)
0
citations

All Are Worth Words: A ViT Backbone for Diffusion Models

CVPR 2023 (arXiv)
0
citations

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

CVPR 2023 (arXiv)
0
citations

Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography

CVPR 2023
0
citations

Revealing the Dark Secrets of Masked Image Modeling

CVPR 2023 (arXiv)
0
citations

Images Speak in Images: A Generalist Painter for In-Context Visual Learning

CVPR 2023 (arXiv)
0
citations

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition

CVPR 2023
0
citations

Spatial-Temporal Relation Networks for Multi-Object Tracking

ICCV 2019
0
citations

Maximum-Margin Hamming Hashing

ICCV 2019
0
citations

Group-Free 3D Object Detection via Transformers

ICCV 2021 (arXiv)
0
citations

Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows

ICCV 2021 (arXiv)
0
citations

SegGPT: Towards Segmenting Everything in Context

ICCV 2023
0
citations

Deep Incubation: Training Large Models by Divide-and-Conquering

ICCV 2023 (arXiv)
0
citations

Improving CLIP Fine-tuning Performance

ICCV 2023
0
citations

Unpaired Learning of Deep Image Denoising

ECCV 2020
0
citations

Negative Margin Matters: Understanding Margin in Few-shot Classification

ECCV 2020
0
citations

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

ECCV 2020
0
citations

A Simple Approach and Benchmark for 21,000-Category Object Detection

ECCV 2022
0
citations

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model

ECCV 2022
0
citations

Swin Transformer V2: Scaling Up Capacity and Resolution

CVPR 2022 (arXiv)
0
citations

DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches

NeurIPS 2025 (arXiv)
0
citations

Learning to Optimize in Swarms

NeurIPS 2019
0
citations

RepPoints v2: Verification Meets Regression for Object Detection

NeurIPS 2020
0
citations

Parametric Instance Classification for Unsupervised Visual Feature Learning

NeurIPS 2020
0
citations

Bootstrap Your Object Detector via Mixed Training

NeurIPS 2021
0
citations

Could Giant Pre-trained Image Models Extract Universal Representations?

NeurIPS 2022
0
citations

Learning Transferable Features with Deep Adaptation Networks

ICML 2015
0
citations