Chen Chen
113 Papers
835 Total Citations
Papers (113)
ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
ECCV 2024
146
citations
Compositional Text-to-Image Synthesis with Attention Map Control of Diffusion Models
AAAI 2024 (arXiv)
92
citations
Detecting, Explaining, and Mitigating Memorization in Diffusion Models
ICLR 2024
83
citations
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
NeurIPS 2025
81
citations
SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
ICLR 2024
45
citations
Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges
CVPR 2024
44
citations
BAMM: Bidirectional Autoregressive Motion Model
ECCV 2024
41
citations
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
ICLR 2024
36
citations
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
ICLR 2024
32
citations
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling
ICLR 2025
27
citations
Skeleton-in-Context: Unified Skeleton Sequence Modeling with In-Context Learning
CVPR 2024
26
citations
GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models
AAAI 2025
22
citations
GCNext: Towards the Unity of Graph Convolutions for Human Motion Prediction
AAAI 2024 (arXiv)
21
citations
Adaptive FSS: A Novel Few-Shot Segmentation Framework via Prototype Enhancement
AAAI 2024 (arXiv)
21
citations
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025
20
citations
FedMef: Towards Memory-efficient Federated Dynamic Pruning
CVPR 2024
18
citations
A Simple Background Augmentation Method for Object Detection with Diffusion Model
ECCV 2024
15
citations
MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
ICCV 2025
12
citations
SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing
ICCV 2025
12
citations
Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models
CVPR 2025
6
citations
Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding
CVPR 2025
6
citations
DA-VPT: Semantic-Guided Visual Prompt Tuning for Vision Transformers
CVPR 2025
5
citations
Revisiting Graph Contrastive Learning on Anomaly Detection: A Structural Imbalance Perspective
AAAI 2025
5
citations
SimMLM: A Simple Framework for Multi-modal Learning with Missing Modality
ICCV 2025
4
citations
Exploit Gradient Skewness to Circumvent Byzantine Defenses for Federated Learning
AAAI 2025
3
citations
SemStereo: Semantic-Constrained Stereo Matching Network for Remote Sensing
AAAI 2025
3
citations
TLB-VFI: Temporal-Aware Latent Brownian Bridge Diffusion for Video Frame Interpolation
ICCV 2025
3
citations
SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World
ICCV 2025
3
citations
BrainMAP: Learning Multiple Activation Pathways in Brain Networks
AAAI 2025
1
citations
Fusion Meets Diverse Conditions: A High-diversity Benchmark and Baseline for UAV-based Multimodal Object Detection with Condition Cues
ICCV 2025
1
citations
Out-of-Distribution Generalization on Graphs via Progressive Inference
AAAI 2025
1
citations
Real-World Anomaly Detection in Surveillance Videos
CVPR 2018 (arXiv)
0
citations
Boosting Local Shape Matching for Dense 3D Face Correspondence
CVPR 2019
0
citations
Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction
CVPR 2020
0
citations
Multi-Scale Progressive Fusion Network for Single Image Deraining
CVPR 2020 (arXiv)
0
citations
Learning Normal Dynamics in Videos With Meta Prototype Network
CVPR 2021 (arXiv)
0
citations
VIGOR: Cross-View Image Geo-Localization Beyond One-to-One Retrieval
CVPR 2021 (arXiv)
0
citations
TransGeo: Transformer Is All You Need for Cross-View Image Geo-Localization
CVPR 2022 (arXiv)
0
citations
SPAct: Self-Supervised Privacy Preservation for Action Recognition
CVPR 2022 (arXiv)
0
citations
Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning
CVPR 2022 (arXiv)
0
citations
PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation
CVPR 2023 (arXiv)
0
citations
FeatER: An Efficient Network for Human Reconstruction via Feature Map-Based TransformER
CVPR 2023 (arXiv)
0
citations
TopNet: Transformer-Based Object Placement Network for Image Compositing
CVPR 2023 (arXiv)
0
citations
MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID
CVPR 2023 (arXiv)
0
citations
Dynamic Graph Learning With Content-Guided Spatial-Frequency Relation Reasoning for Deepfake Detection
CVPR 2023
0
citations
TimeBalance: Temporally-Invariant and Temporally-Distinctive Video Representations for Semi-Supervised Action Recognition
CVPR 2023 (arXiv)
0
citations
Private Image Generation With Dual-Purpose Auxiliary Classifier
CVPR 2023
0
citations
R2Former: Unified Retrieval and Reranking Transformer for Place Recognition
CVPR 2023
0
citations
POTTER: Pooling Attention Transformer for Efficient Human Mesh Recovery
CVPR 2023 (arXiv)
0
citations
Robust Image Segmentation Using Contour-Guided Color Palettes
ICCV 2015
0
citations
Tube Convolutional Neural Network (T-CNN) for Action Detection in Videos
ICCV 2017
0
citations
Seeing Motion in the Dark
ICCV 2019
0
citations
3D Human Pose Estimation With Spatial and Temporal Transformers
ICCV 2021 (arXiv)
0
citations
Pseudo-label Alignment for Semi-supervised Instance Segmentation
ICCV 2023 (arXiv)
0
citations
FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning
ICCV 2023 (arXiv)
0
citations
PGFed: Personalize Each Client's Global Objective for Federated Learning
ICCV 2023
0
citations
AlignDet: Aligning Pre-training and Fine-tuning in Object Detection
ICCV 2023 (arXiv)
0
citations
A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
ICCV 2023 (arXiv)
0
citations
When Do Curricula Work in Federated Learning?
ICCV 2023 (arXiv)
0
citations
RenderIH: A Large-Scale Synthetic Dataset for 3D Interacting Hand Pose Estimation
ICCV 2023 (arXiv)
0
citations
Source-free Domain Adaptive Human Pose Estimation
ICCV 2023 (arXiv)
0
citations
Towards Geospatial Foundation Models via Continual Pretraining
ICCV 2023 (arXiv)
0
citations
Reconciling Object-Level and Global-Level Objectives for Long-Tail Detection
ICCV 2023
0
citations
Multi-view Self-supervised Disentanglement for General Image Denoising
ICCV 2023 (arXiv)
0
citations
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
ECCV 2020
0
citations
Self-supervision with Superpixels: Training Few-shot Medical Image Segmentation without Annotation
ECCV 2020
0
citations
Unstructured Feature Decoupling for Vehicle Re-identification
ECCV 2022
0
citations
Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation
ECCV 2022
0
citations
GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing
ECCV 2022
0
citations
GAMa: Cross-view Video Geo-localization
ECCV 2022
0
citations
TARGET: Federated Class-Continual Learning via Exemplar-Free Distillation
ICCV 2023 (arXiv)
0
citations
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level
CVPR 2025
0
citations
UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification
CVPR 2025
0
citations
Argus: A Compact and Versatile Foundation Model for Vision
CVPR 2025
0
citations
Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-based Action Recognition
ICCV 2025
0
citations
X2I: Seamless Integration of Multimodal Understanding into Diffusion Transformer via Attention Distillation
ICCV 2025
0
citations
MixA: A Mixed Attention approach with Stable Lightweight Linear Attention to enhance Efficiency of Vision Transformers at the Edge
ICCV 2025
0
citations
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
ICCV 2025
0
citations
TARFVAE: Efficient One-Step Generative Time Series Forecasting via TARFLOW based VAE
NeurIPS 2025
0
citations
SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation
AAAI 2025
0
citations
Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery
AAAI 2025
0
citations
GenHMR: Generative Human Mesh Recovery
AAAI 2025
0
citations
From Coarse to Fine: A Matching and Alignment Framework for Unsupervised Cross-View Geo-Localization
AAAI 2025
0
citations
ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data
AAAI 2025
0
citations
Virtual Nodes Can Help: Tackling Distribution Shifts in Federated Graph Learning
AAAI 2025
0
citations
Certified Causal Defense with Generalizable Robustness
AAAI 2025
0
citations
Towards Improved Proxy-Based Deep Metric Learning via Data-Augmented Domain Adaptation
AAAI 2024 (arXiv)
0
citations
Decouple Content and Motion for Conditional Image-to-Video Generation
AAAI 2024
0
citations
Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection
CVPR 2024
0
citations
Multi-View Attentive Contextualization for Multi-View 3D Object Detection
CVPR 2024
0
citations
MMM: Generative Masked Motion Model
CVPR 2024
0
citations
OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
CVPR 2024
0
citations
Towards Memorization-Free Diffusion Models
CVPR 2024
0
citations
A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation
CVPR 2024
0
citations
How to Trace Latent Generative Model Generated Images without Artificial Watermark?
ICML 2024
0
citations
COALA: A Practical and Vision-Centric Federated Learning Platform
ICML 2024
0
citations
Deep Sparse Representation for Robust Image Registration
CVPR 2015
0
citations
Binary Coding for Partial Action Analysis With Limited Observation Ratios
CVPR 2017
0
citations
Cross-View Image Matching for Geo-Localization in Urban Environments
CVPR 2017 (arXiv)
0
citations
Semantic Image Inpainting With Deep Generative Models
CVPR 2017 (arXiv)
0
citations
Learning to See in the Dark
CVPR 2018 (arXiv)
0
citations
GradAug: A New Regularization Method for Deep Neural Networks
NeurIPS 2020
0
citations
CalFAT: Calibrated Federated Adversarial Training with Label Skewness
NeurIPS 2022
0
citations
Nonnegative Tensor Completion via Integer Optimization
NeurIPS 2022
0
citations
Plan To Predict: Learning an Uncertainty-Foreseeing Model For Model-Based Reinforcement Learning
NeurIPS 2022
0
citations
DENSE: Data-Free One-Shot Federated Learning
NeurIPS 2022
0
citations
Graph Few-shot Learning with Task-specific Structures
NeurIPS 2022
0
citations
Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks
NeurIPS 2023
0
citations
Is Heterogeneity Notorious? Taming Heterogeneity to Handle Test-Time Shift in Federated Learning
NeurIPS 2023
0
citations
A Single 2D Pose with Context is Worth Hundreds for 3D Human Pose Estimation
NeurIPS 2023
0
citations
Supported Value Regularization for Offline Reinforcement Learning
NeurIPS 2023
0
citations
Where Did I Come From? Origin Attribution of AI-Generated Images
NeurIPS 2023
0
citations
Counterfactual Conservative Q Learning for Offline Multi-agent Reinforcement Learning
NeurIPS 2023
0
citations