"multimodal conditioning" Papers
3 papers found
DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion
Maksim Siniukov, Di Chang, Minh Tran et al.
ICCV 2025 poster, arXiv:2504.04010
3 citations
Video-Guided Foley Sound Generation with Multimodal Controls
Ziyang Chen, Prem Seetharaman, Bryan Russell et al.
CVPR 2025 poster, arXiv:2411.17698
38 citations
VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation
Saksham Singh Kushwaha, Yapeng Tian
CVPR 2025 poster, arXiv:2412.10768
12 citations