Tsu-Jui Fu
12 Papers
20 Total Citations
Papers (12)
STIV: Scalable Text and Image Conditioned Video Generation
ICCV 2025
20
citations
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
ICCV 2025
0
citations
VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
AAAI 2024
0
citations
Dynamic Video Segmentation Network
CVPR 2018 (arXiv)
0
citations
M3L: Language-Based Video Editing via Multi-Modal Multi-Level Transformers
CVPR 2022 (arXiv)
0
citations
An Empirical Study of End-to-End Video-Language Transformers With Masked Visual Modeling
CVPR 2023 (arXiv)
0
citations
Tell Me What Happened: Unifying Text-Guided Video Completion via Multimodal Masked Video Generation
CVPR 2023 (arXiv)
0
citations
Counterfactual Vision-and-Language Navigation via Adversarial Path Sampler
ECCV 2020
0
citations
Language-Driven Artistic Style Transfer
ECCV 2022
0
citations
Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
NeurIPS 2018
0
citations
LayoutGPT: Compositional Visual Planning and Generation with Large Language Models
NeurIPS 2023
0
citations
PHOTOSWAP: Personalized Subject Swapping in Images
NeurIPS 2023
0
citations