Rui Zhao

Papers: 21

Total Citations: 227

Papers (21)

VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence

GoT: Unleashing Reasoning Capability of MLLM for Visual Generation and Editing

Sparse Global Matching for Video Frame Interpolation with Large Motion

Boosting Spike Camera Image Reconstruction from a Perspective of Dealing with Spike Fluctuations

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Re-Aligning Language to Visual Objects with an Agentic Workflow

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

Sequential Asynchronous Action Coordination in Multi-Agent Systems: A Stackelberg Decision Transformer Approach

Gradient-based Visual Explanation for Transformer-based CLIP

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

SAMPLE: Semantic Alignment through Temporal-Adaptive Multimodal Prompt Learning for Event-Based Open-Vocabulary Action Recognition

CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model

NeurIPS 2025

RemDet: Rethinking Efficient Model Design for UAV Object Detection

TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

Conditional Variational Autoencoder for Sign Language Translation with Cross-Modal Alignment

DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing

Self-Supervised Representation Learning from Arbitrary Scenarios