Ming Yang

33
Papers
373
Total Citations

Papers (33)

SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery

CVPR 2024
236
citations

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

ICLR 2025
59
citations

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

ECCV 2024
22
citations

Mimir: Improving Video Diffusion Models for Precise Text Understanding

CVPR 2025
16
citations

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

CVPR 2025
13
citations

Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis

CVPR 2024
12
citations

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

CVPR 2025
11
citations

EcoMatcher: Efficient Clustering Oriented Matcher for Detector-free Image Matching

ECCV 2024
4
citations

HomoMatcher: Achieving Dense Feature Matching with Semi-Dense Efficiency by Homography Estimation

AAAI 2025
0
citations

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs

CVPR 2024
0
citations

SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment

ICML 2024
0
citations

Stability and Generalization of Stochastic Compositional Gradient Descent Algorithms

ICML 2024
0
citations

DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

ICML 2024
0
citations

Web-Scale Training for Face Identification

CVPR 2015
0
citations

Conditional Generative Adversarial Network for Structured Domain Adaptation

CVPR 2018
0
citations

Image Blind Denoising With Generative Adversarial Network Based Noise Modeling

CVPR 2018
0
citations

Bi-Directional Cascade Network for Perceptual Edge Detection

CVPR 2019
0
citations

Temporal-Context Enhanced Detection of Heavily Occluded Pedestrians

CVPR 2020
0
citations

Track To Detect and Segment: An Online Multi-Object Tracker

CVPR 2021
0
citations

Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds

CVPR 2021
0
citations

SSAP: Single-Shot Instance Segmentation With Affinity Pyramid

ICCV 2019
0
citations

Discriminative Feature Transformation for Occluded Pedestrian Detection

ICCV 2019
0
citations

Stacked Homography Transformations for Multi-View Pedestrian Detection

ICCV 2021
0
citations

Towards Better Vision-Inspired Vision-Language Models

CVPR 2024
0
citations

Reversing Flow for Image Restoration

CVPR 2025
0
citations

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

CVPR 2025
0
citations

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

CVPR 2025
0
citations

CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance

ICCV 2025
0
citations

Engage for All: Making Ordinary Image Descriptions Appealing Again!

ICCV 2025
0
citations

Social Debiasing for Fair Multi-modal LLMs

ICCV 2025
0
citations

Unified Video Generation via Next-Set Prediction in Continuous Domain

ICCV 2025
0
citations

Orthogonal Non-negative Tensor Factorization based Multi-view Clustering

NeurIPS 2023
0
citations

Efficient Potential-based Exploration in Reinforcement Learning using Inverse Dynamic Bisimulation Metric

NeurIPS 2023
0
citations