Zheng Zhang

80 Papers
628 Total Citations

Papers (80)

Disentangled Non-local Neural Networks

ECCV 2020
366
citations

Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting

ECCV 2024
96
citations

Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

CVPR 2024
41
citations

PolaFormer: Polarity-aware Linear Attention for Vision Transformers

ICLR 2025 (arXiv)
36
citations

Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems

AAAI 2025
28
citations

Masked Structural Growth for 2x Faster Language Model Pre-training

ICLR 2024
27
citations

Saliency-based Sequential Image Attention with Multiset Prediction

NeurIPS 2017 (arXiv)
23
citations

Learning to Complement and to Defer to Multiple Users

ECCV 2024
7
citations

Projection Pursuit Density Ratio Estimation

ICML 2025
3
citations

Intent Oriented Contrastive Learning for Sequential Recommendation

AAAI 2025
1
citation

BiPFT: Binary Pre-trained Foundation Transformer with Low-Rank Estimation of Binarization Residual Polynomials

AAAI 2024 (arXiv)
0
citations

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

CVPR 2024
0
citations

Adaptive Slot Attention: Object Discovery with Dynamic Slot Number

CVPR 2024
0
citations

Segment and Caption Anything

CVPR 2024
0
citations

Language-Driven Cross-Modal Classifier for Zero-Shot Multi-Label Image Recognition

ICML 2024
0
citations

GroupCover: A Secure, Efficient and Scalable Inference Framework for On-device Model Protection based on TEEs

ICML 2024
0
citations

SpikeLM: Towards General Spike-Driven Language Modeling via Elastic Bi-Spiking Mechanisms

ICML 2024
0
citations

The Application of Two-Level Attention Models in Deep Convolutional Neural Network for Fine-Grained Image Classification

CVPR 2015
0
citations

Symmetry-Based Text Line Detection in Natural Scenes

CVPR 2015
0
citations

Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis

CVPR 2016
0
citations

Multi-Oriented Text Detection With Fully Convolutional Networks

CVPR 2016
0
citations

Relation Networks for Object Detection

CVPR 2018 (arXiv)
0
citations

Attentive Region Embedding Network for Zero-Shot Learning

CVPR 2019
0
citations

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

CVPR 2021 (arXiv)
0
citations

Prototype-Supervised Adversarial Network for Targeted Attack of Deep Hashing

CVPR 2021 (arXiv)
0
citations

Swin Transformer V2: Scaling Up Capacity and Resolution

CVPR 2022 (arXiv)
0
citations

SimMIM: A Simple Framework for Masked Image Modeling

CVPR 2022 (arXiv)
0
citations

Video Swin Transformer

CVPR 2022 (arXiv)
0
citations

TinyMIM: An Empirical Study of Distilling MIM Pre-Trained Models

CVPR 2023 (arXiv)
0
citations

On Data Scaling in Masked Image Modeling

CVPR 2023 (arXiv)
0
citations

Side Adapter Network for Open-Vocabulary Semantic Segmentation

CVPR 2023 (arXiv)
0
citations

Revealing the Dark Secrets of Masked Image Modeling

CVPR 2023 (arXiv)
0
citations

iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-Training for Visual Recognition

CVPR 2023
0
citations

Multiple Granularity Descriptors for Fine-Grained Categorization

ICCV 2015
0
citations

Local Relation Networks for Image Recognition

ICCV 2019
0
citations

Spatial-Temporal Relation Networks for Multi-Object Tracking

ICCV 2019
0
citations

An Empirical Study of Spatial Attention Mechanisms in Deep Networks

ICCV 2019
0
citations

Learning Hierarchical Graph Neural Networks for Image Clustering

ICCV 2021 (arXiv)
0
citations

Group-Free 3D Object Detection via Transformers

ICCV 2021 (arXiv)
0
citations

Semantics Disentangling for Generalized Zero-Shot Learning

ICCV 2021 (arXiv)
0
citations

Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows

ICCV 2021 (arXiv)
0
citations

End-to-End Semi-Supervised Object Detection With Soft Teacher

ICCV 2021 (arXiv)
0
citations

A Semantic Knowledge Complementarity based Decoupling Framework for Semi-supervised Class-imbalanced Medical Image Segmentation

CVPR 2025
0
citations

KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection

ICCV 2023 (arXiv)
0
citations

Object-Centric Multiple Object Tracking

ICCV 2023 (arXiv)
0
citations

Unsupervised Open-Vocabulary Object Localization in Videos

ICCV 2023 (arXiv)
0
citations

DETR Does Not Need Multi-Scale or Locality Design

ICCV 2023
0
citations

All in Tokens: Unifying Output Space of Visual Tasks via Soft Token

ICCV 2023 (arXiv)
0
citations

Dual Learning with Dynamic Knowledge Distillation for Partially Relevant Video Retrieval

ICCV 2023
0
citations

Improving CLIP Fine-tuning Performance

ICCV 2023
0
citations

Coarse-to-Fine Amodal Segmentation with Shape Prior

ICCV 2023 (arXiv)
0
citations

Spatially Adaptive Inference with Stochastic Feature Sampling and Interpolation

ECCV 2020
0
citations

Negative Margin Matters: Understanding Margin in Few-shot Classification

ECCV 2020
0
citations

Region Graph Embedding Network for Zero-Shot Learning

ECCV 2020
0
citations

A Closer Look at Local Aggregation Operators in Point Cloud Analysis

ECCV 2020
0
citations

A Simple Approach and Benchmark for 21,000-Category Object Detection

ECCV 2022
0
citations

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model

ECCV 2022
0
citations

PSS: Progressive Sample Selection for Open-World Visual Representation Learning

ECCV 2022
0
citations

Rethinking Amodal Video Segmentation from Learning Supervised Signals with Object-centric Representation

ICCV 2023 (arXiv)
0
citations

Optimal Transport-Guided Source-Free Adaptation for Face Anti-Spoofing

CVPR 2025
0
citations

StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth

ICCV 2025
0
citations

Portcullis: A Scalable and Verifiable Privacy Gateway for Third-Party LLM Inference

AAAI 2025
0
citations

OT-StainNet: Optimal Transport Driven Semantic Matching for Weakly Paired H&E-to-IHC Stain Transfer

AAAI 2025
0
citations

Transferable Adversarial Face Attack with Text Controlled Attribute

AAAI 2025
0
citations

Distribution-Driven Dense Retrieval: Modeling Many-to-One Query-Document Relationship

AAAI 2025
0
citations

FlightBERT++: A Non-autoregressive Multi-Horizon Flight Trajectory Prediction Framework

AAAI 2024
0
citations

Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision

AAAI 2024
0
citations

CONSIDER: Commonalities and Specialties Driven Multilingual Code Retrieval Framework

AAAI 2024
0
citations

Loss Functions for Multiset Prediction

NeurIPS 2018
0
citations

RepPoints v2: Verification Meets Regression for Object Detection

NeurIPS 2020
0
citations

Parametric Instance Classification for Unsupervised Visual Feature Learning

NeurIPS 2020
0
citations

Representation Learning on Spatial Networks

NeurIPS 2021
0
citations

Bootstrap Your Object Detector via Mixed Training

NeurIPS 2021
0
citations

GRIN: Generative Relation and Intention Network for Multi-agent Trajectory Prediction

NeurIPS 2021
0
citations

Self-supervised Amodal Video Object Segmentation

NeurIPS 2022
0
citations

Could Giant Pre-trained Image Models Extract Universal Representations?

NeurIPS 2022
0
citations

Learning Enhanced Representation for Tabular Data via Neighborhood Propagation

NeurIPS 2022
0
citations

Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning

NeurIPS 2022
0
citations

Curriculum Learning for Graph Neural Networks: Which Edges Should We Learn First

NeurIPS 2023
0
citations

Evaluating Open-QA Evaluation

NeurIPS 2023
0
citations