2025 "text-to-image generation" Papers

62 papers found • Page 1 of 2

3DIS: Depth-Driven Decoupled Image Synthesis for Universal Multi-Instance Generation

Dewei Zhou, Ji Xie, Zongxin Yang et al.

ICLR 2025 • poster • 1 citation

Ambient Diffusion Omni: Training Good Models with Bad Data

Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.

NeurIPS 2025 • spotlight • arXiv:2506.10038 • 12 citations

AMO Sampler: Enhancing Text Rendering with Overshooting

Xixi Hu, Keyang Xu, Bo Liu et al.

CVPR 2025 • poster • arXiv:2411.19415 • 8 citations

Autoregressive Video Generation without Vector Quantization

Haoge Deng, Ting Pan, Haiwen Diao et al.

ICLR 2025 • oral • arXiv:2412.14169 • 101 citations

BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation

Yuyang Peng, Shishi Xiao, Keming Wu et al.

CVPR 2025 • poster • arXiv:2503.20672 • 10 citations

CAP: Evaluation of Persuasive and Creative Image Generation

Aysan Aghazadeh, Adriana Kovashka

ICCV 2025 • poster • arXiv:2412.10426 • 3 citations

CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models

Hyungjin Chung, Jeongsol Kim, Geon Yeong Park et al.

ICLR 2025 • poster • arXiv:2406.08070 • 77 citations

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Chengyou Jia, Changliang Xia, Zhuohang Dang et al.

CVPR 2025 • poster • arXiv:2411.17176 • 7 citations

Compositional Discrete Latent Code for High Fidelity, Productive Diffusion Models

Samuel Lavoie, Michael Noukhovitch, Aaron Courville

NeurIPS 2025 • poster • arXiv:2507.12318 • 1 citation

CompSlider: Compositional Slider for Disentangled Multiple-Attribute Image Generation

Zixin Zhu, Kevin Duarte, Mamshad Nayeem Rizve et al.

ICCV 2025 • poster • arXiv:2509.01028

CPO: Condition Preference Optimization for Controllable Image Generation

Zonglin Lyu, Ming Li, Xinxin Liu et al.

NeurIPS 2025 • poster • arXiv:2511.04753

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.

CVPR 2025 • poster • arXiv:2405.13637 • 21 citations

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025 • poster • arXiv:2507.04947 • 4 citations

Deeply Supervised Flow-Based Generative Models

Inkyu Shin, Chenglin Yang, Liang-Chieh Chen

ICCV 2025 • poster • arXiv:2503.14494 • 8 citations

Denoising Autoregressive Transformers for Scalable Text-to-Image Generation

Jiatao Gu, Yuyang Wang, Yizhe Zhang et al.

ICLR 2025 • poster • arXiv:2410.08159

Dense2MoE: Restructuring Diffusion Transformer to MoE for Efficient Text-to-Image Generation

Youwei Zheng, Yuxi Ren, Xin Xia et al.

ICCV 2025 • poster • arXiv:2510.09094 • 4 citations

DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models

Hyogon Ryu, NaHyeon Park, Hyunjung Shim

ICLR 2025 • poster • arXiv:2501.04304 • 7 citations

DISCO: DISCrete nOise for Conditional Control in Text-to-Image Diffusion Models

Longquan Dai, Wu Ming, Dejiao Xue et al.

NeurIPS 2025 • poster

DSPO: Direct Score Preference Optimization for Diffusion Model Alignment

Huaisheng Zhu, Teng Xiao, Vasant Honavar

ICLR 2025 • poster • 22 citations

DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

Xirui Hu, Jiahao Wang, Hao Chen et al.

ICCV 2025 • poster • arXiv:2503.06505 • 8 citations

Exploring Diffusion Transformer Designs via Grafting

Keshigeyan Chandrasegaran, Michael Poli, Dan Fu et al.

NeurIPS 2025 • oral • arXiv:2506.05340 • 4 citations

FairGen: Enhancing Fairness in Text-to-Image Diffusion Models via Self-Discovering Latent Directions

Yilei Jiang, Wei-Hong Li, Yiyuan Zhang et al.

ICCV 2025 • poster • arXiv:2412.18810 • 3 citations

Feedback Guidance of Diffusion Models

Felix Koulischer, Florian Handke, Johannes Deleu et al.

NeurIPS 2025 • poster • arXiv:2506.06085 • 4 citations

FineLIP: Extending CLIP’s Reach via Fine-Grained Alignment with Longer Text Inputs

Mothilal Asokan, Kebin Wu, Fatima Albreiki

CVPR 2025 • poster • arXiv:2504.01916 • 14 citations

Flowing from Words to Pixels: A Noise-Free Framework for Cross-Modality Evolution

Qihao Liu, Xi Yin, Alan L. Yuille et al.

CVPR 2025 • highlight • arXiv:2412.15213 • 10 citations

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025 • poster • arXiv:2507.15249 • 1 citation

Goku: Flow Based Video Generative Foundation Models

Shoufa Chen, Chongjian Ge, Yuqi Zhang et al.

CVPR 2025 • highlight • arXiv:2502.04896 • 53 citations

Halton Scheduler for Masked Generative Image Transformer

Victor Besnier, Mickael Chen, David Hurych et al.

ICLR 2025 • poster • arXiv:2503.17076 • 21 citations

ImgEdit: A Unified Image Editing Dataset and Benchmark

Yang Ye, Xianyi He, Zongjian Li et al.

NeurIPS 2025 • poster • arXiv:2505.20275 • 84 citations

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

Kaihang Pan, Yang Wu, Wendong Bu et al.

NeurIPS 2025 • poster • arXiv:2506.01480 • 6 citations

Language-Guided Image Tokenization for Generation

Kaiwen Zha, Lijun Yu, Alireza Fathi et al.

CVPR 2025 • poster • arXiv:2412.05796 • 23 citations

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025 • poster • arXiv:2411.15466 • 36 citations

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NeurIPS 2025 • poster • arXiv:2504.00010 • 6 citations

Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models

Lin Zhu, Xinbing Wang, Chenghu Zhou et al.

ICLR 2025 • poster • arXiv:2502.07466

LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation

Farzad Farhadzadeh, Debasmit Das, Shubhankar Borse et al.

ICLR 2025 • poster • arXiv:2501.16559 • 6 citations

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Lital Binyamin, Yoad Tewel, Hilit Segev et al.

CVPR 2025 • poster • arXiv:2406.10210 • 32 citations

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation

Mingcheng Li, Xiaolu Hou, Ziyang Liu et al.

CVPR 2025 • poster • arXiv:2505.02648 • 8 citations

Measuring And Improving Engagement of Text-to-Image Generation Models

Varun Khurana, Yaman Singla, Jayakumar Subramanian et al.

ICLR 2025 • poster • 2 citations

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 • highlight • arXiv:2412.00782 • 4 citations

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.

NeurIPS 2025 • poster • arXiv:2505.19602 • 6 citations

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Zhaorun Chen, Zichen Wen, Yichao Du et al.

NeurIPS 2025 • poster • arXiv:2407.04842 • 57 citations

NL-Eye: Abductive NLI For Images

Mor Ventura, Michael Toker, Nitay Calderon et al.

ICLR 2025 • poster • arXiv:2410.02613 • 3 citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025 • poster • arXiv:2501.12381 • 3 citations

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Kwanyoung Kim, Byeongsu Sim

ICCV 2025 • poster • arXiv:2503.07677 • 1 citation

Precise Parameter Localization for Textual Generation in Diffusion Models

Łukasz Staniszewski, Bartosz Cywiński, Franziska Boenisch et al.

ICLR 2025 • poster • arXiv:2502.09935 • 3 citations

Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback

Yi-Lun Wu, Bo-Kai Ruan, Chiang Tseng et al.

NeurIPS 2025 • poster • arXiv:2510.18353

RB-Modulation: Training-Free Stylization using Reference-Based Modulation

Litu Rout, Yujia Chen, Nataniel Ruiz et al.

ICLR 2025 • poster

RespoDiff: Dual-Module Bottleneck Transformation for Responsible & Faithful T2I Generation

Silpa Vadakkeeveetil Sreelatha, Sauradip Nag, Muhammad Awais et al.

NeurIPS 2025 • poster • arXiv:2509.15257

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation

Junsong Chen, Shuchen Xue, Yuyang Zhao et al.

ICCV 2025 • highlight • arXiv:2503.09641 • 37 citations

Scaling can lead to compositional generalization

Florian Redhardt, Yassir Akram, Simon Schug

NeurIPS 2025 • spotlight • arXiv:2507.07207