Shentong Mo
7
Papers
49
Total Citations
Papers (7)
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
CVPR 2024
31
citations
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation
ECCV 2024, arXiv
7
citations
Audio-visual Generalized Zero-shot Learning the Easy Way
ECCV 2024
7
citations
Scaling Diffusion Mamba with Bidirectional SSMs for Efficient 3D Shape Generation
AAAI 2025
3
citations
The Dynamic Duo of Collaborative Masking and Target for Advanced Masked Autoencoder Learning
AAAI 2025
1
citation
Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows
CVPR 2025
0
citations
GMAIL: Generative Modality Alignment for generated Image Learning
ICML 2025
0
citations