"text-to-image generation" Papers

54 papers found • Page 1 of 2

3DIS: Depth-Driven Decoupled Image Synthesis for Universal Multi-Instance Generation

Dewei Zhou, Ji Xie, Zongxin Yang et al.

ICLR 2025poster
1
citations

Ambient Diffusion Omni: Training Good Models with Bad Data

Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.

NeurIPS 2025spotlightarXiv:2506.10038
12
citations

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025posterarXiv:2411.19415
8
citations

Autoregressive Video Generation without Vector Quantization

Haoge Deng, Ting Pan, Haiwen Diao et al.

ICLR 2025oralarXiv:2412.14169
101
citations

BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Yuyang Peng, Shishi Xiao, Keming Wu et al.

CVPR 2025posterarXiv:2503.20672
10
citations

CPO: Condition Preference Optimization for Controllable Image Generation

Zonglin Lyu, Ming Li, Xinxin Liu et al.

NeurIPS 2025posterarXiv:2511.04753

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025posterarXiv:2507.04947
4
citations

Deeply Supervised Flow-Based Generative Models

Inkyu Shin, Chenglin Yang, Liang-Chieh Chen

ICCV 2025posterarXiv:2503.14494
8
citations

DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models

Longquan Dai, Wu Ming, Dejiao Xue et al.

NeurIPS 2025poster

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025poster
22
citations

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025posterarXiv:2507.15249
1
citations

Goku: Flow Based Video Generative Foundation Models

Shoufa Chen, Chongjian GE, Yuqi Zhang et al.

CVPR 2025highlightarXiv:2502.04896
53
citations

Language-Guided Image Tokenization for Generation

Kaiwen Zha, Lijun Yu, Alireza Fathi et al.

CVPR 2025posterarXiv:2412.05796
23
citations

Measuring And Improving Engagement of Text-to-Image Generation Models

Varun Khurana, Yaman Singla, Jayakumar Subramanian et al.

ICLR 2025poster
2
citations

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025highlightarXiv:2412.00782
4
citations

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Zhaorun Chen, Zichen Wen, Yichao Du et al.

NeurIPS 2025posterarXiv:2407.04842
57
citations

NL-Eye: Abductive NLI For Images

Mor Ventura, Michael Toker, Nitay Calderon et al.

ICLR 2025posterarXiv:2410.02613
3
citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025posterarXiv:2501.12381
3
citations

Scaling can lead to compositional generalization

Florian Redhardt, Yassir Akram, Simon Schug

NeurIPS 2025spotlightarXiv:2507.07207

Stretching Each Dollar: Diffusion Training from Scratch on a Micro-Budget

Vikash Sehwag, Xianghao Kong, Jingtao Li et al.

CVPR 2025posterarXiv:2407.15811
26
citations

Text-to-Image Rectified Flow as Plug-and-Play Priors

Xiaofeng Yang, Cheng Chen, xulei yang et al.

ICLR 2025posterarXiv:2406.03293
23
citations

Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation

Gianni Franchi, Nacim Belkhir, Dat NGUYEN et al.

CVPR 2025posterarXiv:2412.03178
3
citations

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025posterarXiv:2412.18928
7
citations

Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection

Lichen Bai, Shitong Shao, zikai zhou et al.

ICLR 2025posterarXiv:2412.10891
26
citations

Accelerating Parallel Sampling of Diffusion Models

Zhiwei Tang, Jiasheng Tang, Hao Luo et al.

ICML 2024poster

AltDiffusion: A Multilingual Text-to-Image Diffusion Model

Fulong Ye, Guang Liu, Xinya Wu et al.

AAAI 2024paperarXiv:2308.09991

An Image is Worth Multiple Words: Discovering Object Level Concepts using Multi-Concept Prompt Learning

Chen Jin, Ryutaro Tanno, Amrutha Saseendran et al.

ICML 2024poster

Bespoke Non-Stationary Solvers for Fast Sampling of Diffusion and Flow Models

Neta Shaul, Uriel Singer, Ricky T. Q. Chen et al.

ICML 2024poster

Chains of Diffusion Models

Yanheng Wei, Lianghua Huang, Zhi-Fan Wu et al.

ECCV 2024poster

Compositional Text-to-Image Generation with Dense Blob Representations

Weili Nie, Sifei Liu, Morteza Mardani et al.

ICML 2024poster

Data-free Distillation of Diffusion Models with Bootstrapping

Jiatao Gu, Chen Wang, Shuangfei Zhai et al.

ICML 2024poster

Diffusion Models for Monocular Depth Estimation: Overcoming Challenging Conditions

Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi

ECCV 2024posterarXiv:2407.16698
29
citations

Diffusion Rejection Sampling

Byeonghu Na, Yeongmin Kim, Minsang Park et al.

ICML 2024poster

E$^2$GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

Yifan Gong, Zheng Zhan, Qing Jin et al.

ICML 2024poster

Easing Concept Bleeding in Diffusion via Entity Localization and Anchoring

Jiewei Zhang, Song Guo, Peiran Dong et al.

ICML 2024poster

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024posterarXiv:2403.12963
51
citations

Grid-Attention: Enhancing Computational Efficiency of Large Vision Models without Fine-Tuning

Pengyu Li, Biao Wang, Tianchu Guo et al.

ECCV 2024poster

Latent Guard: a Safety Framework for Text-to-image Generation

Runtao Liu, Ashkan Khakzar, Jindong Gu et al.

ECCV 2024posterarXiv:2404.08031
53
citations

Learning Pseudo 3D Guidance for View-consistent Texturing with 2D Diffusion

Kehan Li, Yanbo Fan, Yang Wu et al.

ECCV 2024poster
4
citations

Learning Subject-Aware Cropping by Outpainting Professional Photos

James Hong, Lu Yuan, Michaël Gharbi et al.

AAAI 2024paperarXiv:2312.12080
9
citations

Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs

Ling Yang, Zhaochen Yu, Chenlin Meng et al.

ICML 2024poster

Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Jianhao Yuan, Francesco Pinto, Adam Davies et al.

ICML 2024poster

On Discrete Prompt Optimization for Diffusion Models

Ruochen Wang, Ting Liu, Cho-Jui Hsieh et al.

ICML 2024poster

Progressive Text-to-Image Diffusion with Soft Latent Direction

YuTeng Ye, Jiale Cai, Hang Zhou et al.

AAAI 2024paperarXiv:2309.09466
5
citations

Prompt-tuning Latent Diffusion Models for Inverse Problems

Hyungjin Chung, Jong Chul YE, Peyman Milanfar et al.

ICML 2024poster

Quality Diversity through Human Feedback: Towards Open-Ended Diversity-Driven Optimization

Li Ding, Jenny Zhang, Jeff Clune et al.

ICML 2024poster

Referee Can Play: An Alternative Approach to Conditional Generation via Model Inversion

Xuantong Liu, Tianyang Hu, Wenjia Wang et al.

ICML 2024poster

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Patrick Esser, Sumith Kulal, Andreas Blattmann et al.

ICML 2024poster

Semantic-Aware Human Object Interaction Image Generation

zhu xu, Qingchao Chen, Yuxin Peng et al.

ICML 2024poster

Skews in the Phenomenon Space Hinder Generalization in Text-to-Image Generation

Yingshan Chang, Yasi Zhang, Zhiyuan Fang et al.

ECCV 2024posterarXiv:2403.16394
10
citations
← PreviousNext →