Yadong Mu
26
Papers
31
Total Citations
Papers (26)
PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers
NeurIPS 2025
28
citations
Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images
AAAI 2025
3
citations
Transferable Video Moment Localization by Moment-Guided Query Prompting
AAAI 2024
0
citations
Exploring Orthogonality in Open World Object Detection
CVPR 2024
0
citations
Ink Dot-Oriented Differentiable Optimization for Neural Image Halftoning
CVPR 2024
0
citations
Countering Personalized Text-to-Image Generation with Influence Watermarks
CVPR 2024
0
citations
Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem
ICML 2024
0
citations
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
ICML 2024
0
citations
Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization
CVPR 2019
0
citations
Weakly-Supervised Action Localization by Generative Attention Modeling
CVPR 2020
0
citations
Non-Local Neural Networks With Grouped Bilinear Attentional Transforms
CVPR 2020
0
citations
Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context
CVPR 2020
0
citations
Learning Temporal Co-Attention Models for Unsupervised Video Action Localization
CVPR 2020
0
citations
Visual-Semantic Matching by Exploring High-Order Attention and Distraction
CVPR 2020
0
citations
Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer
CVPR 2022
0
citations
Complex Video Action Reasoning via Learnable Markov Logic Network
CVPR 2022
0
citations
Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce
CVPR 2023
0
citations
Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition
CVPR 2023
0
citations
Regularizing Second-Order Influences for Continual Learning
CVPR 2023
0
citations
Video Action Segmentation via Contextually Refined Temporal Keypoints
ICCV 2023
0
citations
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
ICCV 2025
0
citations
Granularity-Adaptive Spatial Evidence Tokenization for Video Question Answering
AAAI 2025
0
citations
Fast Fourier Convolution
NeurIPS 2020
0
citations
Conditional Diffusion Process for Inverse Halftoning
NeurIPS 2022
0
citations
Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
NeurIPS 2022
0
citations
Rewiring Neurons in Non-Stationary Environments
NeurIPS 2023
0
citations