Dongdong Chen
70 Papers
92 Total Citations
Papers (70)
OmniViD: A Generative Framework for Universal Video Understanding
CVPR 2024
29
citations
Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation
ECCV 2024 (arXiv)
17
citations
VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
ICCV 2025
15
citations
SmartEraser: Remove Anything from Images using Masked-Region Guidance
CVPR 2025
12
citations
FreeFlux: Understanding and Exploiting Layer-Specific Roles in RoPE-Based MMDiT for Versatile Image Editing
ICCV 2025
12
citations
UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery
CVPR 2025
3
citations
Olympus: A Universal Task Router for Computer Vision Tasks
CVPR 2025
3
citations
Exploring Invariance in Images through One-way Wave Equations
ICML 2025
1
citations
Bringing Old Photos Back to Life
CVPR 2020 (arXiv)
0
citations
Robust Superpixel-Guided Attentional Adversarial Attack
CVPR 2020
0
citations
Dynamic Convolution: Attention Over Convolution Kernels
CVPR 2020 (arXiv)
0
citations
Self-Robust 3D Point Recognition via Gather-Vector Guidance
CVPR 2020
0
citations
Density-Aware Graph for Deep Semi-Supervised Visual Recognition
CVPR 2020 (arXiv)
0
citations
Unsupervised Pre-Training for Person Re-Identification
CVPR 2021 (arXiv)
0
citations
Diverse Semantic Image Synthesis via Probability Distribution Modeling
CVPR 2021 (arXiv)
0
citations
Dynamic Head: Unifying Object Detection Heads With Attentions
CVPR 2021 (arXiv)
0
citations
Improved Image Matting via Real-Time User Clicks and Uncertainty Estimation
CVPR 2021 (arXiv)
0
citations
Multi-Attentional Deepfake Detection
CVPR 2021 (arXiv)
0
citations
Mobile-Former: Bridging MobileNet and Transformer
CVPR 2022
0
citations
CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields
CVPR 2022
0
citations
CSWin Transformer: A General Vision Transformer Backbone With Cross-Shaped Windows
CVPR 2022 (arXiv)
0
citations
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
CVPR 2022 (arXiv)
0
citations
Large-Scale Pre-Training for Person Re-Identification With Noisy Labels
CVPR 2022 (arXiv)
0
citations
BEVT: BERT Pretraining of Video Transformers
CVPR 2022 (arXiv)
0
citations
Shape-Invariant 3D Adversarial Point Clouds
CVPR 2022 (arXiv)
0
citations
HairCLIP: Design Your Hair by Text and Reference Image
CVPR 2022 (arXiv)
0
citations
Bringing Old Films Back to Life
CVPR 2022 (arXiv)
0
citations
Robust Equivariant Imaging: A Fully Unsupervised Framework for Learning To Image From Noisy and Partial Measurements
CVPR 2022 (arXiv)
0
citations
General Facial Representation Learning in a Visual-Linguistic Manner
CVPR 2022 (arXiv)
0
citations
Vector Quantized Diffusion Model for Text-to-Image Synthesis
CVPR 2022 (arXiv)
0
citations
Protecting Celebrities From DeepFake With Identity Consistency Transformer
CVPR 2022 (arXiv)
0
citations
Diversity-Aware Meta Visual Prompting
CVPR 2023 (arXiv)
0
citations
Look Before You Match: Instance Understanding Matters in Video Object Segmentation
CVPR 2023 (arXiv)
0
citations
Masked Video Distillation: Rethinking Masked Feature Modeling for Self-Supervised Video Representation Learning
CVPR 2023 (arXiv)
0
citations
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
CVPR 2023 (arXiv)
0
citations
Streaming Video Model
CVPR 2023 (arXiv)
0
citations
Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles
CVPR 2023 (arXiv)
0
citations
Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
CVPR 2023 (arXiv)
0
citations
Coherent Online Video Style Transfer
ICCV 2017 (arXiv)
0
citations
Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once
ICCV 2019
0
citations
Learning With Noisy Labels for Robust Point Cloud Segmentation
ICCV 2021 (arXiv)
0
citations
High-Fidelity Pluralistic Image Completion With Transformers
ICCV 2021 (arXiv)
0
citations
Equivariant Imaging: Learning Beyond the Range Space
ICCV 2021 (arXiv)
0
citations
MicroNet: Improving Image Recognition With Extremely Low FLOPs
ICCV 2021 (arXiv)
0
citations
Improve Unsupervised Pretraining for Few-Label Transfer
ICCV 2021 (arXiv)
0
citations
Improving Adversarial Robustness of Masked Autoencoders via Test-time Frequency-domain Prompting
ICCV 2023 (arXiv)
0
citations
HairCLIPv2: Unifying Hair Editing via Proxy Feature Blending
ICCV 2023
0
citations
AvatarCraft: Transforming Text into Neural Human Avatars with Parameterized Shape and Pose Control
ICCV 2023 (arXiv)
0
citations
Dynamic ReLU
ECCV 2020
0
citations
DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search
ECCV 2020
0
citations
Deep Decomposition Learning for Inverse Imaging Problems
ECCV 2020
0
citations
Should All Proposals Be Treated Equally in Object Detection?
ECCV 2022
0
citations
Bootstrapped Masked Autoencoders for Vision BERT Pretraining
ECCV 2022
0
citations
LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks
CVPR 2020
0
citations
Show and Segment: Universal Medical Image Segmentation via In-Context Learning
CVPR 2025
0
citations
I2V3D: Controllable Image-to-video Generation with 3D Guidance
ICCV 2025
0
citations
Equivariant Multi-Modality Image Fusion
CVPR 2024
0
citations
Towards More Unified In-context Visual Understanding
CVPR 2024
0
citations
Image Fusion via Vision-Language Model
ICML 2024
0
citations
StyleBank: An Explicit Representation for Neural Image Style Transfer
CVPR 2017 (arXiv)
0
citations
Stereoscopic Neural Style Transfer
CVPR 2018 (arXiv)
0
citations
Transductive Zero-Shot Learning with Visual Structure Constraint
NeurIPS 2019
0
citations
GreedyFool: Distortion-Aware Sparse Adversarial Attack
NeurIPS 2020
0
citations
Passport-aware Normalization for Deep Model Protection
NeurIPS 2020
0
citations
Stronger NAS with Weaker Predictors
NeurIPS 2021
0
citations
Unsupervised Learning From Incomplete Measurements for Inverse Problems
NeurIPS 2022
0
citations
OmniVL: One Foundation Model for Image-Language and Video-Language Tasks
NeurIPS 2022
0
citations
REVIVE: Regional Visual Representation Matters in Knowledge-Based Visual Question Answering
NeurIPS 2022
0
citations
Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models
NeurIPS 2023
0
citations
Learning from Rich Semantics and Coarse Locations for Long-tailed Object Detection
NeurIPS 2023
0
citations