Rui Zhao

71
Papers
227
Total Citations

Papers (71)

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

CVPR 2024
63
citations

GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing

NeurIPS 2025
60
citations

Sparse Global Matching for Video Frame Interpolation with Large Motion

CVPR 2024
27
citations

Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations

CVPR 2024
26
citations

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

ICCV 2025
17
citations

Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

CVPR 2024
14
citations

KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy

AAAI 2025
13
citations

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

CVPR 2025
4
citations

Re-Aligning Language to Visual Objects with an Agentic Workflow

ICLR 2025
3
citations

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

CVPR 2024
0
citations

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

CVPR 2024
0
citations

Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach

ICML 2024
0
citations

Gradient-based Visual Explanation for Transformer-based CLIP

ICML 2024
0
citations

Saliency Detection by Multi-Context Deep Learning

CVPR 2015
0
citations

Facial Expression Intensity Estimation Using Ordinal Information

CVPR 2016
0
citations

A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation

CVPR 2018
0
citations

Attention-Aware Compositional Network for Person Re-Identification

CVPR 2018
0
citations

Bilateral Ordinal Relevance Multi-Instance Regression for Facial Action Unit Intensity Estimation

CVPR 2018
0
citations

Bayesian Hierarchical Dynamic Model for Human Action Recognition

CVPR 2019
0
citations

P2SGrad: Refined Gradients for Optimizing Deep Face Models

CVPR 2019
0
citations

AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations

CVPR 2019
0
citations

Generalizing Eye Tracking With Bayesian Adversarial Learning

CVPR 2019
0
citations

COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification

CVPR 2020
0
citations

Density-Aware Feature Embedding for Face Clustering

CVPR 2020
0
citations

Bayesian Adversarial Human Motion Synthesis

CVPR 2020
0
citations

Learning to Cluster Faces via Confidence and Connectivity Estimation

CVPR 2020
0
citations

Uni6D: A Unified CNN Framework Without Projection Breakdown for 6D Pose Estimation

CVPR 2022
0
citations

Optical Flow Estimation for Spiking Camera

CVPR 2022
0
citations

Feature Erasing and Diffusion Network for Occluded Person Re-Identification

CVPR 2022
0
citations

Align Representations With Base: A New Approach to Self-Supervised Learning

CVPR 2022
0
citations

Revisiting the Transferability of Supervised Pretraining: An MLP Perspective

CVPR 2022
0
citations

UniVIP: A Unified Framework for Self-Supervised Visual Pre-Training

CVPR 2022
0
citations

Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels

CVPR 2022
0
citations

CORA: Adapting CLIP for Open-Vocabulary Detection With Region Prompting and Anchor Pre-Matching

CVPR 2023
0
citations

Balancing Logit Variation for Long-Tailed Semantic Segmentation

CVPR 2023
0
citations

Zero-Shot Text-to-Parameter Translation for Game Character Auto-Creation

CVPR 2023
0
citations

HumanBench: Towards General Human-Centric Perception With Projector Assisted Pretraining

CVPR 2023
0
citations

Memory-Based Neighbourhood Embedding for Visual Recognition

ICCV 2019
0
citations

Bayesian Graph Convolution LSTM for Skeleton Based Action Recognition

ICCV 2019
0
citations

Progressive Correspondence Pruning by Consensus Learning

ICCV 2021
0
citations

Human Preference Score: Better Aligning Text-to-Image Models with Human Preference

ICCV 2023
0
citations

Advancing Referring Expression Segmentation Beyond Single Image

ICCV 2023
0
citations

SparseMAE: Sparse Training Meets Masked Autoencoders

ICCV 2023
0
citations

Self-supervising Fine-grained Region Similarities for Large-scale Image Localization

ECCV 2020
0
citations

RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax

ECCV 2020
0
citations

Scale-Aware Spatio-Temporal Relation Learning for Video Anomaly Detection

ECCV 2022
0
citations

Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification

ECCV 2022
0
citations

Unifying Visual Contrastive Learning for Object Recognition from a Graph Perspective

ECCV 2022
0
citations

Relative Contrastive Loss for Unsupervised Representation Learning

ECCV 2022
0
citations

Domain Invariant Masked Autoencoders for Self-Supervised Learning from Multi-Domains

ECCV 2022
0
citations

UniHCP: A Unified Model for Human-Centric Perceptions

CVPR 2023
0
citations

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

ICCV 2025
0
citations

SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

ICCV 2025
0
citations

CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model

NeurIPS 2025
0
citations

RemDet: Rethinking Efficient Model Design for UAV Object Detection

AAAI 2025
0
citations

TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

AAAI 2025
0
citations

Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment

AAAI 2024
0
citations

DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing

CVPR 2024
0
citations

Self-Supervised Representation Learning from Arbitrary Scenarios

CVPR 2024
0
citations

Self-paced Contrastive Learning with Hybrid Memory for Domain Adaptive Object Re-ID

NeurIPS 2020
0
citations

MST: Masked Self-Supervised Transformer for Visual Representation

NeurIPS 2021
0
citations

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

NeurIPS 2022
0
citations

Learning from Future: A Novel Self-Training Framework for Semantic Segmentation

NeurIPS 2022
0
citations

Learning Optical Flow from Continuous Spike Streams

NeurIPS 2022
0
citations

Unsupervised Object Detection Pretraining with Joint Object Priors Generation and Detector Learning

NeurIPS 2022
0
citations

Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models

NeurIPS 2023
0
citations

Unsupervised Optical Flow Estimation with Dynamic Timing Representation for Spike Camera

NeurIPS 2023
0
citations

DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models

NeurIPS 2023
0
citations

MeGraph: Capturing Long-Range Interactions by Alternating Local and Hierarchical Aggregation on Multi-Scaled Graph Hierarchy

NeurIPS 2023
0
citations

Described Object Detection: Liberating Object Detection with Flexible Expressions

NeurIPS 2023
0
citations

Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

ICML 2019
0
citations