Irfan Essa
15
Papers
827
Total Citations
Papers (15)
Language Model Beats Diffusion - Tokenizer is Key to Visual Generation
ICLR 2024
525
citations
Photorealistic Video Generation with Diffusion Models
ECCV 2024
264
citations
Calibrated Multi-Preference Optimization for Aligning Diffusion Models
CVPR 2025
24
citations
Limitations in Employing Natural Language Supervision for Sensor-Based Human Activity Recognition - And Ways to Overcome Them
AAAI 2025
9
citations
Cropper: Vision-Language Model for Image Cropping through In-Context Learning
CVPR 2025
5
citations
Visual Prompt Tuning for Generative Transfer Learning
CVPR 2023, arXiv
0
citations
MAGVIT: Masked Generative Video Transformer
CVPR 2023, arXiv
0
citations
MaskSketch: Unpaired Structure-Guided Masked Image Generation
CVPR 2023, arXiv
0
citations
Neural Design Network: Graphic Layout Generation with Constraints
ECCV 2020
0
citations
BLT: Bidirectional Layout Transformer for Controllable Layout Generation
ECCV 2022
0
citations
Improved Masked Image Generation with Token-Critic
ECCV 2022
0
citations
Prompt-Free Diffusion: Taking “Text” out of Text-to-Image Diffusion Models
CVPR 2024
0
citations
VideoPoet: A Large Language Model for Zero-Shot Video Generation
ICML 2024
0
citations
Embodied Question Answering in Photorealistic Environments With Point Cloud Perception
CVPR 2019
0
citations
Audio Visual Scene-Aware Dialog
CVPR 2019
0
citations