Yongdong Zhang
55
Papers
62
Total Citations
Papers (55)
DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection
CVPR 2024, arXiv
36
citations
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
CVPR 2025, arXiv
12
citations
Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
CVPR 2025
8
citations
AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation
ECCV 2024, arXiv
5
citations
ELDER: Enhancing Lifelong Model Editing with Mixture-of-LoRA
AAAI 2025, arXiv
1
citations
CLIP-Adapted Region-to-Text Learning for Generative Open-Vocabulary Semantic Segmentation
ICCV 2025
0
citations
IGD: Instructional Graphic Design with Multimodal Layer Generation
ICCV 2025
0
citations
Forensic-MoE: Exploring Comprehensive Synthetic Image Detection Traces with Mixture of Experts
ICCV 2025
0
citations
Diffusion-based Source-biased Model for Single Domain Generalized Object Detection
ICCV 2025
0
citations
Exploring Reliable Matching with Phase Enhancement for Night-time Semantic Segmentation
ECCV 2024, arXiv
0
citations
Task-Adaptive Prompted Transformer for Cross-Domain Few-Shot Learning
AAAI 2024
0
citations
Bootstrapping Large Language Models for Radiology Report Generation
AAAI 2024
0
citations
RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization
CVPR 2024, arXiv
0
citations
OTE: Exploring Accurate Scene Text Recognition Using One Token
CVPR 2024
0
citations
AnyScene: Customized Image Synthesis with Composited Foreground
CVPR 2024
0
citations
DEADiff: An Efficient Stylization Diffusion Model with Disentangled Representations
CVPR 2024, arXiv
0
citations
Reinforcement Learning within Tree Search for Fast Macro Placement
ICML 2024
0
citations
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
ICML 2024, arXiv
0
citations
A Circuit Domain Generalization Framework for Efficient Logic Synthesis in Chip Design
ICML 2024, arXiv
0
citations
A Hierarchical Adaptive Multi-Task Reinforcement Learning Framework for Multiplier Circuit Design
ICML 2024
0
citations
Graph Structured Network for Image-Text Matching
CVPR 2020, arXiv
0
citations
ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection
CVPR 2020, arXiv
0
citations
Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning
CVPR 2020, arXiv
0
citations
Multi-Modality Cross Attention Network for Image and Sentence Matching
CVPR 2020
0
citations
Self-Supervised Domain-Aware Generative Network for Generalized Zero-Shot Learning
CVPR 2020
0
citations
Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer
CVPR 2021, arXiv
0
citations
Lesion-Aware Transformers for Diabetic Retinopathy Grading
CVPR 2021
0
citations
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
CVPR 2021, arXiv
0
citations
Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection
CVPR 2021, arXiv
0
citations
Action Unit Memory Network for Weakly Supervised Temporal Action Localization
CVPR 2021, arXiv
0
citations
Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection
CVPR 2021
0
citations
Partial Class Activation Attention for Semantic Segmentation
CVPR 2022
0
citations
Motion-Modulated Temporal Fragment Alignment Network for Few-Shot Action Recognition
CVPR 2022
0
citations
Negative-Aware Attention Framework for Image-Text Matching
CVPR 2022
0
citations
Towards Accurate Image Coding: Improved Autoregressive Image Generation With Dynamic Vector Quantization
CVPR 2023, arXiv
0
citations
Learning Semantic Relationship Among Instances for Image-Text Matching
CVPR 2023
0
citations
Not All Image Regions Matter: Masked Vector Quantization for Autoregressive Image Generation
CVPR 2023, arXiv
0
citations
Learning Orthogonal Prototypes for Generalized Few-Shot Semantic Segmentation
CVPR 2023
0
citations
Proposal-Based Multiple Instance Learning for Weakly-Supervised Temporal Action Localization
CVPR 2023
0
citations
Dynamic Generative Targeted Attacks With Pattern Injection
CVPR 2023
0
citations
Crossing the Gap: Domain Generalization for Image Captioning
CVPR 2023
0
citations
Foreground Activation Maps for Weakly Supervised Object Localization
ICCV 2021
0
citations
Explainable Person Re-Identification With Attribute-Guided Metric Distillation
ICCV 2021, arXiv
0
citations
Meta-Attack: Class-Agnostic and Model-Agnostic Physical Adversarial Attack
ICCV 2021
0
citations
From Two to One: A New Scene Text Recognizer With Visual Language Modeling Network
ICCV 2021, arXiv
0
citations
Task-Aware Part Mining Network for Few-Shot Learning
ICCV 2021
0
citations
Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval
ICCV 2023
0
citations
Adaptive Template Transformer for Mitochondria Segmentation in Electron Microscopy Images
ICCV 2023
0
citations
Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval
ECCV 2022
0
citations
Cross-Modality Transformer for Visible-Infrared Person Re-identification
ECCV 2022
0
citations
Detecting Tampered Scene Text in the Wild
ECCV 2022
0
citations
Hierarchical Granularity Transfer Learning
NeurIPS 2020
0
citations
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
NeurIPS 2022, arXiv
0
citations
A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability
NeurIPS 2023, arXiv
0
citations
MomentDiff: Generative Video Moment Retrieval from Random to Real
NeurIPS 2023, arXiv
0
citations