Yadong Mu

26 Papers
31 Total Citations

Papers (26)

PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

NeurIPS 2025
28
citations

Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images

AAAI 2025
3
citations

Transferable Video Moment Localization by Moment-Guided Query Prompting

AAAI 2024
0
citations

Exploring Orthogonality in Open World Object Detection

CVPR 2024
0
citations

Ink Dot-Oriented Differentiable Optimization for Neural Image Halftoning

CVPR 2024
0
citations

Countering Personalized Text-to-Image Generation with Influence Watermarks

CVPR 2024
0
citations

Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem

ICML 2024
0
citations

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

ICML 2024
0
citations

Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization

CVPR 2019
0
citations

Weakly-Supervised Action Localization by Generative Attention Modeling

CVPR 2020
0
citations

Non-Local Neural Networks With Grouped Bilinear Attentional Transforms

CVPR 2020
0
citations

Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context

CVPR 2020
0
citations

Learning Temporal Co-Attention Models for Unsupervised Video Action Localization

CVPR 2020
0
citations

Visual-Semantic Matching by Exploring High-Order Attention and Distraction

CVPR 2020
0
citations

Joint Video Summarization and Moment Localization by Cross-Task Sample Transfer

CVPR 2022
0
citations

Complex Video Action Reasoning via Learnable Markov Logic Network

CVPR 2022
0
citations

Learning Instance-Level Representation for Large-Scale Multi-Modal Pretraining in E-Commerce

CVPR 2023
0
citations

Neural Koopman Pooling: Control-Inspired Temporal Dynamics Encoding for Skeleton-Based Action Recognition

CVPR 2023
0
citations

Regularizing Second-Order Influences for Continual Learning

CVPR 2023
0
citations

Video Action Segmentation via Contextually Refined Temporal Keypoints

ICCV 2023
0
citations

NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation

ICCV 2025
0
citations

Granularity-Adaptive Spatial Evidence Tokenization for Video Question Answering

AAAI 2025
0
citations

Fast Fourier Convolution

NeurIPS 2020
0
citations

Conditional Diffusion Process for Inverse Halftoning

NeurIPS 2022
0
citations

Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding

NeurIPS 2022
0
citations

Rewiring Neurons in Non-Stationary Environments

NeurIPS 2023
0
citations