Xiaokang Yang

91
Papers
285
Total Citations

Papers (91)

VidToMe: Video Token Merging for Zero-Shot Video Editing

CVPR 2024
89
citations

Discrete Hyper-Graph Matching

CVPR 2015
77
citations

Domain-Controlled Prompt Learning

AAAI 2024, arXiv
30
citations

Domain Prompt Learning with Quaternion Networks

CVPR 2024
22
citations

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

ICCV 2025, arXiv
16
citations

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

ICCV 2025, arXiv
9
citations

Monocular Identity-Conditioned Facial Reflectance Reconstruction

CVPR 2024
7
citations

PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing

ICLR 2025
7
citations

Partial Label Learning with a Partner

AAAI 2024
6
citations

Disentangled World Models: Learning to Transfer Semantic Knowledge from Distracting Videos for Reinforcement Learning

ICCV 2025
4
citations

Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation

ECCV 2024
3
citations

Disentangled Clothed Avatar Generation with Layered Representation

ICCV 2025
3
citations

AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction

ICLR 2025
3
citations

Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

ICCV 2025
2
citations

Rethinking Classifier Re-Training in Long-Tailed Recognition: Label Over-Smooth Can Balance

ICLR 2025
2
citations

Latent Intuitive Physics: Learning to Transfer Hidden Physics from A 3D Video

ICLR 2024
2
citations

POMP: Physics-constrainable Motion Generative Model through Phase Manifolds

CVPR 2025
1
citation

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

ICCV 2025, arXiv
1
citation

HyperET: Efficient Training in Hyperbolic Space for Multi-modal Large Language Models

NeurIPS 2025
1
citation

Long-Term Correlation Tracking

CVPR 2015
0
citations

Factors in Finetuning Deep Model for Object Detection With Long-Tail Distribution

CVPR 2016
0
citations

Progressively Parsing Interactional Objects for Fine Grained Action Detection

CVPR 2016
0
citations

Cascaded Interactional Targeting Network for Egocentric Video Analysis

CVPR 2016
0
citations

Temporal Action Localization With Pyramid of Score Distribution Features

CVPR 2016
0
citations

Video Segmentation via Multiple Granularity Analysis

CVPR 2017
0
citations

Recurrent Modeling of Interaction Context for Collective Activity Recognition

CVPR 2017
0
citations

Structure Preserving Video Prediction

CVPR 2018
0
citations

Multiple Granularity Group Interaction Prediction

CVPR 2018
0
citations

Crowd Counting via Adversarial Cross-Scale Consistency Pursuit

CVPR 2018
0
citations

Fine-Grained Video Captioning for Sports Narrative

CVPR 2018
0
citations

Learning Context Graph for Person Search

CVPR 2019
0
citations

Deep Kinematics Analysis for Monocular 3D Human Pose Estimation

CVPR 2020
0
citations

IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

CVPR 2021, arXiv
0
citations

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

CVPR 2021
0
citations

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

CVPR 2021, arXiv
0
citations

Combinatorial Learning of Graph Edit Distance via Dynamic Embedding

CVPR 2021, arXiv
0
citations

Learning Invisible Markers for Hidden Codes in Offline-to-Online Photography

CVPR 2022
0
citations

Exploring Frequency Adversarial Attacks for Face Forgery Detection

CVPR 2022, arXiv
0
citations

Continual Predictive Learning From Videos

CVPR 2022, arXiv
0
citations

Align Representations With Base: A New Approach to Self-Supervised Learning

CVPR 2022
0
citations

End-to-End Reconstruction-Classification Learning for Face Forgery Detection

CVPR 2022
0
citations

NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds

CVPR 2023, arXiv
0
citations

Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

CVPR 2023, arXiv
0
citations

3D-Aware Face Swapping

CVPR 2023
0
citations

Deep Learning of Partial Graph Matching via Differentiable Top-K

CVPR 2023
0
citations

Improving Fairness in Facial Albedo Estimation via Visual-Textual Cues

CVPR 2023
0
citations

Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm

CVPR 2023, arXiv
0
citations

A Matrix Decomposition Perspective to Multiple Graph Matching

ICCV 2015
0
citations

Hierarchical Convolutional Features for Visual Tracking

ICCV 2015
0
citations

S^3-Face: SSS-Compliant Facial Reflectance Estimation via Diffusion Priors

CVPR 2025
0
citations

Variational Few-Shot Learning

ICCV 2019
0
citations

Learning Combinatorial Embedding Networks for Deep Graph Matching

ICCV 2019
0
citations

Learning To Track Objects From Unlabeled Videos

ICCV 2021, arXiv
0
citations

Self-Supervised Character-to-Character Distillation for Text Recognition

ICCV 2023, arXiv
0
citations

Dual Aggregation Transformer for Image Super-Resolution

ICCV 2023, arXiv
0
citations

ROME: Robustifying Memory-Efficient NAS via Topology Disentanglement and Gradient Accumulation

ICCV 2023, arXiv
0
citations

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

ICCV 2023, arXiv
0
citations

Layered Neighborhood Expansion for Incremental Multiple Graph Matching

ECCV 2020
0
citations

Hierarchical Style-based Networks for Motion Synthesis

ECCV 2020
0
citations

Robust Tracking against Adversarial Attacks

ECCV 2020
0
citations

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering

ECCV 2020
0
citations

EAutoDet: Efficient Architecture Search for Object Detection

ECCV 2022
0
citations

Self-Supervised Learning of Visual Graph Matching

ECCV 2022
0
citations

Performance Guaranteed Network Acceleration via High-Order Residual Quantization

ICCV 2017, arXiv
0
citations

OSDFace: One-Step Diffusion Model for Face Restoration

CVPR 2025
0
citations

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

CVPR 2025
0
citations

Star with Bilinear Mapping

CVPR 2025
0
citations

Domain Generalization in CLIP via Learning with Diverse Text Prompts

CVPR 2025
0
citations

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

CVPR 2025
0
citations

Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations

ICCV 2025
0
citations

QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

ICCV 2025
0
citations

A Token-level Text Image Foundation Model for Document Understanding

ICCV 2025
0
citations

HAODiff: Human-Aware One-Step Diffusion via Dual-Prompt Guidance

NeurIPS 2025
0
citations

DAWP: A framework for global observation forecasting via Data Assimilation and Weather Prediction in satellite observation space

NeurIPS 2025
0
citations

FATE: Feature-Adapted Parameter Tuning for Vision-Language Models

AAAI 2025
0
citations

SAM-PARSER: Fine-Tuning SAM Efficiently by Parameter Space Reconstruction

AAAI 2024, arXiv
0
citations

LERE: Learning-Based Low-Rank Matrix Recovery with Rank Estimation

AAAI 2024
0
citations

Inter-X: Towards Versatile Human-Human Interaction Analysis

CVPR 2024
0
citations

ReGenNet: Towards Human Action-Reaction Synthesis

CVPR 2024
0
citations

CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling

ICML 2024
0
citations

Cross-Scene Crowd Counting via Deep Convolutional Neural Networks

CVPR 2015
0
citations

Motion Part Regularization: Improving Action Recognition via Trajectory Selection

CVPR 2015
0
citations

Video Prediction via Selective Sampling

NeurIPS 2018
0
citations

Graduated Assignment for Joint Multi-Graph Matching and Clustering with Application to Unsupervised Graph Matching Network Learning

NeurIPS 2020
0
citations

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

NeurIPS 2021
0
citations

Perceptual Attacks of No-Reference Image Quality Models with Human-in-the-Loop

NeurIPS 2022
0
citations

ZARTS: On Zero-order Optimization for Neural Architecture Search

NeurIPS 2022
0
citations

Iso-Dream: Isolating and Leveraging Noncontrollable Visual Dynamics in World Models

NeurIPS 2022
0
citations

CageNeRF: Cage-based Neural Radiance Field for Generalized 3D Deformation and Animation

NeurIPS 2022
0
citations

Adv-Attribute: Inconspicuous and Transferable Adversarial Attack on Face Recognition

NeurIPS 2022
0
citations

NeRF-IBVS: Visual Servo Based on NeRF for Visual Localization and Navigation

NeurIPS 2023
0
citations