Ming Yang
33
Papers
373
Total Citations
Papers (33)
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
CVPR 2024
236
citations
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
ICLR 2025
59
citations
StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models
ECCV 2024
22
citations
Mimir: Improving Video Diffusion Models for Precise Text Understanding
CVPR 2025
16
citations
MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation
CVPR 2025
13
citations
Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis
CVPR 2024
12
citations
SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories
CVPR 2025
11
citations
EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching
ECCV 2024
4
citations
HomoMatcher: Achieving Dense Feature Matching with Semi-Dense Efficiency by Homography Estimation
AAAI 2025
0
citations
Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs
CVPR 2024
0
citations
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
ICML 2024
0
citations
Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms
ICML 2024
0
citations
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
ICML 2024
0
citations
Web-Scale Training for Face Identification
CVPR 2015
0
citations
Conditional Generative Adversarial Network for Structured Domain Adaptation
CVPR 2018
0
citations
Image Blind Denoising With Generative Adversarial Network Based Noise Modeling
CVPR 2018
0
citations
Bi-Directional Cascade Network for Perceptual Edge Detection
CVPR 2019
0
citations
Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians
CVPR 2020
0
citations
Track To Detect and Segment: An Online Multi-Object Tracker
CVPR 2021
0
citations
Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds
CVPR 2021
0
citations
SSAP: Single-Shot Instance Segmentation With Affinity Pyramid
ICCV 2019
0
citations
Discriminative Feature Transformation for Occluded Pedestrian Detection
ICCV 2019
0
citations
Stacked Homography Transformations for Multi-View Pedestrian Detection
ICCV 2021
0
citations
Towards Better Vision-Inspired Vision-Language Models
CVPR 2024
0
citations
Reversing Flow for Image Restoration
CVPR 2025
0
citations
SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling
CVPR 2025
0
citations
DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding
CVPR 2025
0
citations
CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance
ICCV 2025
0
citations
Engage for All: Making Ordinary Image Descriptions Appealing Again!
ICCV 2025
0
citations
Social Debiasing for Fair Multi-modal LLMs
ICCV 2025
0
citations
Unified Video Generation via Next-Set Prediction in Continuous Domain
ICCV 2025
0
citations
Orthogonal Non-negative Tensor Factorization based Multi-view Clustering
NeurIPS 2023
0
citations
Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric
NeurIPS 2023
0
citations