Shuo Yang
38
Papers
113
Total Citations
Papers (38)
MENTOR: Multi-level Self-supervised Learning for Multimodal Recommendation
AAAI 2025
50
citations
WorldModelBench: Judging Video Generation Models As World Models
NeurIPS 2025
31
citations
Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$ Pruning
NeurIPS 2025
11
citations
HashAttention: Semantic Sparsity for Faster Inference
ICML 2025
11
citations
L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models
NeurIPS 2025
3
citations
CoT-lized Diffusion: Let's Reinforce T2I Generation Step-by-step
NeurIPS 2025
3
citations
BOOD: Boundary-based Out-Of-Distribution Data Generation
ICML 2025
2
citations
LLM-enhanced Action-aware Multi-modal Prompt Tuning for Image-Text Matching
ICCV 2025
1
citations
Neural networks on Symmetric Spaces of Noncompact Type
ICLR 2025
1
citations
Optimizing Video Object Detection via a Scale-Time Lattice
CVPR 2018
0
citations
Region Proposal by Guided Anchoring
CVPR 2019
0
citations
High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification
CVPR 2020
0
citations
Compatibility-Aware Heterogeneous Visual Search
CVPR 2021
0
citations
Positive-Congruent Training: Towards Regression-Free Model Updates
CVPR 2021
0
citations
Single-View 3D Object Reconstruction From Shape Priors in Memory
CVPR 2021
0
citations
CAFE: Learning To Condense Dataset by Aligning Features
CVPR 2022
0
citations
BiCro: Noisy Correspondence Rectification for Multi-Modality Data via Bi-Directional Cross-Modal Similarity Consistency
CVPR 2023
0
citations
Learning Imbalanced Data With Vision Transformers
CVPR 2023
0
citations
From Facial Parts Responses to Face Detection: A Deep Learning Approach
ICCV 2015
0
citations
FAB: A Robust Facial Landmark Detection Framework for Motion-Blurred Videos
ICCV 2019
0
citations
Improving Lens Flare Removal with General-Purpose Pipeline and Multiple Light Sources Recovery
ICCV 2023
0
citations
Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation
ICCV 2023
0
citations
PPR: Physically Plausible Reconstruction from Monocular Videos
ICCV 2023
0
citations
One Size Does NOT Fit All: Data-Adaptive Adversarial Training
ECCV 2022
0
citations
PartImageNet: A Large, High-Quality Dataset of Parts
ECCV 2022
0
citations
Towards Regression-Free Neural Networks for Diverse Compute Platforms
ECCV 2022
0
citations
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
CVPR 2025
0
citations
Video Summarization Using Denoising Diffusion Probabilistic Model
AAAI 2025
0
citations
RAZOR: Sharpening Knowledge by Cutting Bias with Unsupervised Text Rewriting
AAAI 2025
0
citations
Revisiting Context Aggregation for Image Matting
ICML 2024
0
citations
Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary
ICML 2024
0
citations
DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection
CVPR 2015
0
citations
WIDER FACE: A Face Detection Benchmark
CVPR 2016
0
citations
Residual Attention Network for Image Classification
CVPR 2017
0
citations
Look at Boundary: A Boundary-Aware Face Alignment Algorithm
CVPR 2018
0
citations
Interaction Hard Thresholding: Consistent Sparse Quadratic Regression in Sub-quadratic Time and Space
NeurIPS 2019
0
citations
Does Preprocessing Help Training Over-parameterized Neural Networks?
NeurIPS 2021
0
citations
Toward Understanding Privileged Features Distillation in Learning-to-Rank
NeurIPS 2022
0
citations