Mingyu Ding
29
Papers
259
Total Citations
Papers (29)
Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
ECCV 2020
133
citations
SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
CVPR 2024
64
citations
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
ICLR 2024
54
citations
X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios
ICLR 2025
8
citations
RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
ICML 2024
0
citations
Face-Focused Cross-Stream Network for Deception Detection in Videos
CVPR 2019
0
citations
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
CVPR 2020 (arXiv)
0
citations
HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers
CVPR 2021
0
citations
L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing
CVPR 2021
0
citations
Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
CVPR 2023
0
citations
Visual Dependency Transformers: Dependency Tree Emerges From Reversed Attention
CVPR 2023 (arXiv)
0
citations
EC2: Emergent Communication for Embodied Control
CVPR 2023
0
citations
CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization
ICCV 2019
0
citations
TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
ICCV 2023
0
citations
Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
ECCV 2020
0
citations
Segmenting Transparent Objects in the Wild
ECCV 2020
0
citations
DaViT: Dual Attention Vision Transformers
ECCV 2022
0
citations
DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation
CVPR 2025
0
citations
CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians
CVPR 2025
0
citations
RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
CVPR 2025
0
citations
Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
ICCV 2025
0
citations
Domain-Invariant Projection Learning for Zero-Shot Recognition
NeurIPS 2018
0
citations
Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
NeurIPS 2021
0
citations
Compressed Video Contrastive Learning
NeurIPS 2021
0
citations
LGDN: Language-Guided Denoising Network for Video-Language Modeling
NeurIPS 2022
0
citations
Towards Free Data Selection with General-Purpose Models
NeurIPS 2023
0
citations
EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
NeurIPS 2023
0
citations
Doubly-Robust Self-Training
NeurIPS 2023
0
citations
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
NeurIPS 2023
0
citations