Yang Wu
24
Papers
20
Total Citations
Papers (24)
Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program
ICCV 2025
10
citations
Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning
NeurIPS 2025
6
citations
Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark
ICML 2025
4
citations
Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness
ICCV 2025
0
citations
Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching
ICCV 2025
0
citations
Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model
AAAI 2024
0
citations
HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming
ICML 2024
0
citations
Saturation-Preserving Specular Reflection Separation
CVPR 2015
0
citations
Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals
CVPR 2018, arXiv
0
citations
Dynamic Face Video Segmentation via Reinforcement Learning
CVPR 2020, arXiv
0
citations
UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection
CVPR 2022, arXiv
0
citations
Co-Salient Object Detection With Uncertainty-Aware Group Exchange-Masking
CVPR 2023
0
citations
Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction
CVPR 2023
0
citations
Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework
ICCV 2021, arXiv
0
citations
Uniformity in Heterogeneity: Diving Deep Into Count Interval Partition for Crowd Counting
ICCV 2021, arXiv
0
citations
Face Clustering via Graph Convolutional Networks with Confidence Edges
ICCV 2023
0
citations
Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video
ICCV 2023, arXiv
0
citations
ForkGAN: Seeing into the Rainy Night
ECCV 2020
0
citations
Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking
ECCV 2020
0
citations
WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion
CVPR 2025
0
citations
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
CVPR 2025
0
citations
Event-Equalized Dense Video Captioning
CVPR 2025
0
citations
Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs
NeurIPS 2023
0
citations
CL-NeRF: Continual Learning of Neural Radiance Fields for Evolving Scene Representation
NeurIPS 2023
0
citations