Yang Wu

Papers: 24
Total Citations: 20

Papers (24)

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

ICCV 2025
10
citations

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

NeurIPS 2025
6
citations

Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark

ICML 2025
4
citations

Mastering Collaborative Multi-modal Data Selection: A Focus on Informativeness, Uniqueness, and Representativeness

ICCV 2025
0
citations

Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching

ICCV 2025
0
citations

Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model

AAAI 2024
0
citations

HGCN2SP: Hierarchical Graph Convolutional Network for Two-Stage Stochastic Programming

ICML 2024
0
citations

Saturation-Preserving Specular Reflection Separation

CVPR 2015
0
citations

Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals

CVPR 2018 (arXiv)
0
citations

Dynamic Face Video Segmentation via Reinforcement Learning

CVPR 2020 (arXiv)
0
citations

UMT: Unified Multi-Modal Transformers for Joint Video Moment Retrieval and Highlight Detection

CVPR 2022 (arXiv)
0
citations

Co-Salient Object Detection With Uncertainty-Aware Group Exchange-Masking

CVPR 2023
0
citations

Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction

CVPR 2023
0
citations

Rethinking Counting and Localization in Crowds: A Purely Point-Based Framework

ICCV 2021 (arXiv)
0
citations

Uniformity in Heterogeneity: Diving Deep Into Count Interval Partition for Crowd Counting

ICCV 2021 (arXiv)
0
citations

Face Clustering via Graph Convolutional Networks with Confidence Edges

ICCV 2023
0
citations

Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video

ICCV 2023 (arXiv)
0
citations

ForkGAN: Seeing into the Rainy Night

ECCV 2020
0
citations

Chained-Tracker: Chaining Paired Attentive Regression Results for End-to-End Joint Multiple-Object Detection and Tracking

ECCV 2020
0
citations

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

CVPR 2025
0
citations

AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea

CVPR 2025
0
citations

Event-Equalized Dense Video Captioning

CVPR 2025
0
citations

Act As You Wish: Fine-Grained Control of Motion Diffusion Model with Hierarchical Semantic Graphs

NeurIPS 2023
0
citations

CL-NeRF: Continual Learning of Neural Radiance Fields for Evolving Scene Representation

NeurIPS 2023
0
citations