Rui Zhao
71
Papers
227
Total Citations
Papers (71)
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
CVPR 2024
63
citations
GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing
NeurIPS 2025
60
citations
Sparse Global Matching for Video Frame Interpolation with Large Motion
CVPR 2024
27
citations
Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations
CVPR 2024
26
citations
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation
ICCV 2025
17
citations
Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning
CVPR 2024
14
citations
KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy
AAAI 2025
13
citations
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
CVPR 2025
4
citations
Re-Aligning Language to Visual Objects with an Agentic Workflow
ICLR 2025
3
citations
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model
CVPR 2024
0
citations
Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions
CVPR 2024
0
citations
Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach
ICML 2024
0
citations
Gradient-based Visual Explanation for Transformer-based CLIP
ICML 2024
0
citations
Saliency Detection by Multi-Context Deep Learning
CVPR 2015
0
citations
Facial Expression Intensity Estimation Using Ordinal Information
CVPR 2016
0
citations
A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation
CVPR 2018
0
citations
Attention-Aware Compositional Network for Person Re-Identification
CVPR 2018
0
citations
Bilateral Ordinal Relevance Multi-Instance Regression for Facial Action Unit Intensity Estimation
CVPR 2018
0
citations
Bayesian Hierarchical Dynamic Model for Human Action Recognition
CVPR 2019
0
citations
P2SGrad: Refined Gradients for Optimizing Deep Face Models
CVPR 2019
0
citations
AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations
CVPR 2019
0
citations
Generalizing Eye Tracking With Bayesian Adversarial Learning
CVPR 2019
0
citations
COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification
CVPR 2020
0
citations
Density-Aware Feature Embedding for Face Clustering
CVPR 2020
0
citations
Bayesian Adversarial Human Motion Synthesis
CVPR 2020
0
citations
Learning to Cluster Faces via Confidence and Connectivity Estimation
CVPR 2020
0
citations
Uni6D: A Unified CNN Framework Without Projection Breakdown for 6D Pose Estimation
CVPR 2022
0
citations
Optical Flow Estimation for Spiking Camera
CVPR 2022
0
citations
Feature Erasing and Diffusion Network for Occluded Person Re-Identification
CVPR 2022
0
citations
Align Representations With Base: A New Approach to Self-Supervised Learning
CVPR 2022
0
citations
Revisiting the Transferability of Supervised Pretraining: An MLP Perspective
CVPR 2022
0
citations
UniVIP: A Unified Framework for Self-Supervised Visual Pre-Training
CVPR 2022
0
citations
Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels
CVPR 2022
0
citations
CORA: Adapting CLIP for Open-Vocabulary Detection With Region Prompting and Anchor Pre-Matching
CVPR 2023
0
citations
Balancing Logit Variation for Long-Tailed Semantic Segmentation
CVPR 2023
0
citations
Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation
CVPR 2023
0
citations
HumanBench: Towards General Human-Centric Perception With Projector Assisted Pretraining
CVPR 2023
0
citations
Memory-Based Neighbourhood Embedding for Visual Recognition
ICCV 2019
0
citations
Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition
ICCV 2019
0
citations
Progressive Correspondence Pruning by Consensus Learning
ICCV 2021
0
citations
Human Preference Score: Better Aligning Text-to-Image Models with Human Preference
ICCV 2023
0
citations
Advancing Referring Expression Segmentation Beyond Single Image
ICCV 2023
0
citations
SparseMAE: Sparse Training Meets Masked Autoencoders
ICCV 2023
0
citations
Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
ECCV 2020
0
citations
RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax
ECCV 2020
0
citations
Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection
ECCV 2022
0
citations
Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification
ECCV 2022
0
citations
Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective
ECCV 2022
0
citations
Relative Contrastive Loss for Unsupervised Representation Learning
ECCV 2022
0
citations
Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains
ECCV 2022
0
citations
UniHCP: A Unified Model for Human-Centric Perceptions
CVPR 2023
0
citations
ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning
ICCV 2025
0
citations
SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition
ICCV 2025
0
citations
CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model
NeurIPS 2025
0
citations
RemDet: Rethinking Efficient Model Design for UAV Object Detection
AAAI 2025
0
citations
TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment
AAAI 2025
0
citations
Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment
AAAI 2024
0
citations
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing
CVPR 2024
0
citations
Self-Supervised Representation Learning from Arbitrary Scenarios
CVPR 2024
0
citations
Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID
NeurIPS 2020
0
citations
MST: Masked Self-Supervised Transformer for Visual Representation
NeurIPS 2021
0
citations
Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks
NeurIPS 2022
0
citations
Learning from Future: A Novel Self-Training Framework for Semantic Segmentation
NeurIPS 2022
0
citations
Learning Optical Flow from Continuous Spike Streams
NeurIPS 2022
0
citations
Unsupervised Object Detection Pretraining with Joint Object Priors Generation and Detector Learning
NeurIPS 2022
0
citations
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
NeurIPS 2023
0
citations
Unsupervised Optical Flow Estimation with Dynamic Timing Representation for Spike Camera
NeurIPS 2023
0
citations
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models
NeurIPS 2023
0
citations
MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy
NeurIPS 2023
0
citations
Described Object Detection: Liberating Object Detection with Flexible Expressions
NeurIPS 2023
0
citations
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
ICML 2019
0
citations