Shuo Yang

38
Papers
113
Total Citations

Papers (38)

MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation

AAAI 2025
50
citations

WorldModelBench: Judging Video Generation Models As World Models

NeurIPS 2025
31
citations

Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning

NeurIPS 2025
11
citations

HashAttention: Semantic Sparsity for Faster Inference

ICML 2025
11
citations

L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models

NeurIPS 2025
3
citations

CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step

NeurIPS 2025
3
citations

BOOD: Boundary-based Out-Of-Distribution Data Generation

ICML 2025
2
citations

LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching

ICCV 2025
1
citation

Neural Networks on Symmetric Spaces of Noncompact Type

ICLR 2025
1
citation

Optimizing Video Object Detection via a Scale-Time Lattice

CVPR 2018
0
citations

Region Proposal by Guided Anchoring

CVPR 2019
0
citations

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

CVPR 2020
0
citations

Compatibility-Aware Heterogeneous Visual Search

CVPR 2021
0
citations

Positive-Congruent Training: Towards Regression-Free Model Updates

CVPR 2021
0
citations

Single-View 3D Object Reconstruction From Shape Priors in Memory

CVPR 2021
0
citations

CAFE: Learning To Condense Dataset by Aligning Features

CVPR 2022
0
citations

BiCro: Noisy Correspondence Rectification for Multi-Modality Data via Bi-Directional Cross-Modal Similarity Consistency

CVPR 2023
0
citations

Learning Imbalanced Data With Vision Transformers

CVPR 2023
0
citations

From Facial Parts Responses to Face Detection: A Deep Learning Approach

ICCV 2015
0
citations

FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos

ICCV 2019
0
citations

Improving Lens Flare Removal with General-Purpose Pipeline and Multiple Light Sources Recovery

ICCV 2023
0
citations

Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation

ICCV 2023
0
citations

PPR: Physically Plausible Reconstruction from Monocular Videos

ICCV 2023
0
citations

One Size Does NOT Fit All: Data-Adaptive Adversarial Training

ECCV 2022
0
citations

PartImageNet: A Large, High-Quality Dataset of Parts

ECCV 2022
0
citations

Towards Regression-Free Neural Networks for Diverse Compute Platforms

ECCV 2022
0
citations

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

CVPR 2025
0
citations

Video Summarization Using Denoising Diffusion Probabilistic Model

AAAI 2025
0
citations

RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting

AAAI 2025
0
citations

Revisiting Context Aggregation for Image Matting

ICML 2024
0
citations

Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary

ICML 2024
0
citations

DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection

CVPR 2015
0
citations

WIDER FACE: A Face Detection Benchmark

CVPR 2016
0
citations

Residual Attention Network for Image Classification

CVPR 2017
0
citations

Look at Boundary: A Boundary-Aware Face Alignment Algorithm

CVPR 2018
0
citations

Interaction Hard Thresholding: Consistent Sparse Quadratic Regression in Sub-quadratic Time and Space

NeurIPS 2019
0
citations

Does Preprocessing Help Training Over-parameterized Neural Networks?

NeurIPS 2021
0
citations

Toward Understanding Privileged Features Distillation in Learning-to-Rank

NeurIPS 2022
0
citations