Yapeng Tian

25 Papers
12 Total Citations

Papers (25)

VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation

CVPR 2025
12
citations

Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level

CVPR 2025
0
citations

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

ICCV 2025
0
citations

ZFusion: Efficient Deep Compositional Zero-shot Learning for Blind Image Super-Resolution with Generative Diffusion Prior

ICCV 2025
0
citations

T-VSL: Text-Guided Visual Sound Source Localization in Mixtures

CVPR 2024
0
citations

Residual Dense Network for Image Super-Resolution

CVPR 2018
0
citations

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

CVPR 2020
0
citations

TDAN: Temporally-Deformable Alignment Network for Video Super-Resolution

CVPR 2020
0
citations

Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks?

CVPR 2021
0
citations

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

CVPR 2021
0
citations

Transformer-Empowered Multi-Scale Contextual Matching and Aggregation for Multi-Contrast MRI Super-Resolution

CVPR 2022
0
citations

Learning To Answer Questions in Dynamic Audio-Visual Scenarios

CVPR 2022
0
citations

Structured Sparsity Learning for Efficient Video Super-Resolution

CVPR 2023
0
citations

Egocentric Audio-Visual Object Localization

CVPR 2023
0
citations

Audio-Visual Grouping Network for Sound Localization From Mixtures

CVPR 2023
0
citations

CFSNet: Toward a Controllable Feature Space for Image Restoration

ICCV 2019
0
citations

Video Matting via Consistency-Regularized Graph Neural Networks

ICCV 2021
0
citations

DiffIR: Efficient Diffusion Model for Image Restoration

ICCV 2023
0
citations

Class-Incremental Grouping Network for Continual Audio-Visual Learning

ICCV 2023
0
citations

Audio-Visual Class-Incremental Learning

ICCV 2023
0
citations

Unified Multisensory Perception: Weakly-Supervised Audio-Visual Video Parsing

ECCV 2020
0
citations

Learning Spatio-Temporal Downsampling for Effective Video Upscaling

ECCV 2022
0
citations

Multi-modal Grouping Network for Weakly-Supervised Audio-Visual Video Parsing

NeurIPS 2022
0
citations

Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning

NeurIPS 2023
0
citations

AV-NeRF: Learning Neural Fields for Real-World Audio-Visual Scene Synthesis

NeurIPS 2023
0
citations