Xu Yang

40 Papers
425 Total Citations

Papers (40)

Learning Progressive Joint Propagation for Human Motion Prediction

ECCV 2020
187
citations

Finding It at Another Side: A Viewpoint-Adapted Matching Encoder for Change Captioning

ECCV 2020
64
citations

Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient

AAAI 2025
49
citations

Texture-Preserving Diffusion Models for High-Fidelity Virtual Try-On

CVPR 2024
37
citations

How to Configure Good In-Context Sequence for Visual Question Answering

CVPR 2024
36
citations

KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

NeurIPS 2025
23
citations

MemoNav: Working Memory Model for Visual Navigation

CVPR 2024
10
citations

Mimic In-Context Learning for Multimodal Tasks

CVPR 2025
8
citations

Unveiling the Unknown: Unleashing the Power of Unknown to Known in Open-Set Source-Free Domain Adaptation

CVPR 2024
6
citations

Building Variable-Sized Models via Learngene Pool

AAAI 2024
5
citations

VinT-6D: A Large-Scale Object-in-hand Dataset from Vision, Touch and Proprioception

ICML 2024
0
citations

Vision Transformers as Probabilistic Expansion from Learngene

ICML 2024
0
citations

One Meta-tuned Transformer is What You Need for Few-shot Learning

ICML 2024
0
citations

Auto-Encoding Scene Graphs for Image Captioning

CVPR 2019
0
citations

Multi-Scale Fusion Subspace Clustering Using Similarity Constraint

CVPR 2020
0
citations

SelfSAGCN: Self-Supervised Semantic Alignment for Graph Convolution Network

CVPR 2021
0
citations

Nearest Neighbor Matching for Deep Clustering

CVPR 2021
0
citations

Causal Attention for Vision-Language Tasks

CVPR 2021
0
citations

Siamese Contrastive Embedding Network for Compositional Zero-Shot Learning

CVPR 2022
0
citations

Not Just Selection, but Exploration: Online Class-Incremental Continual Learning via Dual View Consistency

CVPR 2022
0
citations

Show, Deconfound and Tell: Image Captioning With Causal Inference

CVPR 2022
0
citations

EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

CVPR 2022
0
citations

Learning to Collocate Neural Modules for Image Captioning

ICCV 2019
0
citations

Weakly Aligned Cross-Modal Learning for Multispectral Pedestrian Detection

ICCV 2019
0
citations

Unpaired Image Captioning via Scene Graph Alignments

ICCV 2019
0
citations

Auto-Parsing Network for Image Captioning and Visual Question Answering

ICCV 2021
0
citations

Learning Trajectory-Word Alignments for Video-Language Tasks

ICCV 2023
0
citations

Deep Spectral Clustering Using Dual Autoencoder Network

CVPR 2019
0
citations

Number it: Temporal Grounding Videos like Flipping Manga

CVPR 2025
0
citations

Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation

CVPR 2025
0
citations

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens

CVPR 2025
0
citations

Democratizing High-Fidelity Co-Speech Gesture Video Generation

ICCV 2025
0
citations

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

AAAI 2025
0
citations

Inheriting Generalized Learngene for Efficient Knowledge Transfer across Multiple Tasks

AAAI 2025
0
citations

Transformer as Linear Expansion of Learngene

AAAI 2024
0
citations

A Versatile Framework for Continual Test-Time Domain Adaptation: Balancing Discriminability and Generalizability

CVPR 2024
0
citations

Long-Tail Class Incremental Learning via Independent Sub-prototype Construction

CVPR 2024
0
citations

Adversarial Learning for Robust Deep Clustering

NeurIPS 2020
0
citations

Exploring Diverse In-Context Configurations for Image Captioning

NeurIPS 2023
0
citations

Learning From Biased Soft Labels

NeurIPS 2023
0
citations