Xiaohui Shen

53
Papers
732
Total Citations

Papers (53)

Learning Progressive Joint Propagation for Human Motion Prediction

ECCV 2020
187
citations

Matching-CNN Meets KNN: Quasi-Parametric Human Parsing

CVPR 2015
168
citations

SURGE: Surface Regularized Geometry Estimation from a Single Image

NeurIPS 2016
98
citations

Predicting Scene Parsing and Motion Dynamics in the Future

NeurIPS 2017 (arXiv)
78
citations

MaskBit: Embedding-free Image Generation via Bit Tokens

ICLR 2025
72
citations

Regional Homogeneity: Towards Learning Transferable Universal Adversarial Perturbations Against Defenses

ECCV 2020
58
citations

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation

ICCV 2025 (arXiv)
49
citations

COCONut: Modernizing COCO Segmentation

CVPR 2024
22
citations

Reversible Recursive Instance-Level Object Segmentation

CVPR 2016
0
citations

A Multi-Level Contextual Model For Person Recognition in Photo Albums

CVPR 2016
0
citations

Shortlist Selection With Residual-Aware Distance Estimator for K-Nearest Neighbor Search

CVPR 2016
0
citations

Automatic Content-Aware Color and Tone Stylization

CVPR 2016
0
citations

Semantic Object Parsing With Local-Global Long Short-Term Memory

CVPR 2016
0
citations

Event-Specific Image Importance

CVPR 2016
0
citations

Unconstrained Salient Object Detection via Proposal Subset Optimization

CVPR 2016
0
citations

Look Into Person: Self-Supervised Structure-Sensitive Learning and a New Benchmark for Human Parsing

CVPR 2017 (arXiv)
0
citations

Interpretable Structure-Evolving LSTM

CVPR 2017 (arXiv)
0
citations

Deep Image Harmonization

CVPR 2017 (arXiv)
0
citations

Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition

CVPR 2017 (arXiv)
0
citations

MAttNet: Modular Attention Network for Referring Expression Comprehension

CVPR 2018 (arXiv)
0
citations

Good View Hunting: Learning Photo Composition From Dense View Pairs

CVPR 2018
0
citations

Generative Image Inpainting With Contextual Attention

CVPR 2018 (arXiv)
0
citations

Learning to Understand Image Blur

CVPR 2018
0
citations

Graphonomy: Universal Human Parsing via Graph Transfer Learning

CVPR 2019
0
citations

Semantic Component Decomposition for Face Attribute Manipulation

CVPR 2019
0
citations

Fashion Editing With Adversarial Parsing Learning

CVPR 2020 (arXiv)
0
citations

SemanticStyleGAN: Learning Compositional Generative Priors for Controllable Image Synthesis and Editing

CVPR 2022 (arXiv)
0
citations

R2Former: Unified Retrieval and Reranking Transformer for Place Recognition

CVPR 2023
0
citations

Deep Multi-Patch Aggregation Network for Image Style, Aesthetics, and Quality Estimation

ICCV 2015
0
citations

Human Parsing With Contextualized Convolutional Neural Network

ICCV 2015
0
citations

Minimum Barrier Salient Object Detection at 80 FPS

ICCV 2015
0
citations

Joint Object and Part Segmentation Using Deep Learned Potentials

ICCV 2015
0
citations

Personalized Image Aesthetics

ICCV 2017
0
citations

FoveaNet: Perspective-Aware Urban Scene Parsing

ICCV 2017 (arXiv)
0
citations

Recurrent Multimodal Interaction for Referring Image Segmentation

ICCV 2017 (arXiv)
0
citations

Scene Parsing With Global Context Embedding

ICCV 2017 (arXiv)
0
citations

D-Attn: Decomposed Attention for Large Vision-and-Language Model

ICCV 2025
0
citations

FW-GAN: Flow-Navigated Warping GAN for Video Virtual Try-On

ICCV 2019
0
citations

Free-Form Image Inpainting With Gated Convolution

ICCV 2019
0
citations

Towards Multi-Pose Guided Virtual Try-On Network

ICCV 2019
0
citations

Towards Interpretable Face Recognition

ICCV 2019
0
citations

A Unified 3D Human Motion Synthesis Model via Conditional Variational Auto-Encoder

ICCV 2021
0
citations

Video Object Detection via Object-level Temporal Aggregation

ECCV 2020
0
citations

Video Scene Parsing With Predictive Feature Learning

ICCV 2017 (arXiv)
0
citations

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens

ICCV 2025
0
citations

Randomized Autoregressive Visual Generation

ICCV 2025
0
citations

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

CVPR 2024
0
citations

ViTamin: Designing Scalable Vision Models in the Vision-Language Era

CVPR 2024
0
citations

Towards Unified Depth and Semantic Prediction From a Single Image

CVPR 2015
0
citations

Salient Object Subitizing

CVPR 2015
0
citations

A Convolutional Neural Network Cascade for Face Detection

CVPR 2015
0
citations

Sequence-to-Segment Networks for Segment Detection

NeurIPS 2018
0
citations

Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP

NeurIPS 2023
0
citations