2025 "text-to-image generation" Papers

142 papers found • Page 2 of 3

FreeCus: Free Lunch Subject-driven Customization in Diffusion Transformers

Yanbing Zhang, Zhe Wang, Qin Zhou et al.

ICCV 2025posterarXiv:2507.15249
1
citations

Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

Jiang Qin, Alexandra Gomez-Villa, Senmao Li et al.

NEURIPS 2025posterarXiv:2503.14275
3
citations

From Cradle to Cane: A Two-Pass Framework for High-Fidelity Lifespan Face Aging

Tao Liu, Dafeng Zhang, Gengchen Li et al.

NEURIPS 2025posterarXiv:2506.20977

Goku: Flow Based Video Generative Foundation Models

Shoufa Chen, Chongjian GE, Yuqi Zhang et al.

CVPR 2025highlightarXiv:2502.04896
53
citations

Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models

Die Chen, Zhiwen Li, Mingyuan Fan et al.

ICLR 2025posterarXiv:2408.01014
6
citations

Guided Score identity Distillation for Data-Free One-Step Text-to-Image Generation

Mingyuan Zhou, Zhendong Wang, Huangjie Zheng et al.

ICLR 2025posterarXiv:2406.01561
4
citations

Halton Scheduler for Masked Generative Image Transformer

Victor Besnier, Mickael Chen, David Hurych et al.

ICLR 2025posterarXiv:2503.17076
21
citations

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Jiazi Bu, Pengyang Ling, Yujie Zhou et al.

NEURIPS 2025posterarXiv:2504.06232
7
citations

ImageGen-CoT: Enhancing Text-to-Image In-context Learning with Chain-of-Thought Reasoning

Jiaqi Liao, Zhengyuan Yang, Linjie Li et al.

ICCV 2025posterarXiv:2503.19312
21
citations

ImgEdit: A Unified Image Editing Dataset and Benchmark

Yang Ye, Xianyi He, Zongjian Li et al.

NEURIPS 2025posterarXiv:2505.20275
84
citations

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Melissa Hall, Michal Drozdzal, Oscar Mañas et al.

ICLR 2025posterarXiv:2403.17804

Infinity∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis

Jian Han, Jinlai Liu, Yi Jiang et al.

CVPR 2025posterarXiv:2412.04431
189
citations

Information Theoretic Text-to-Image Alignment

Chao Wang, Giulio Franzese, alessandro finamore et al.

ICLR 2025posterarXiv:2405.20759
3
citations

Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning

Sherry X. Chen, Misha Sra, Pradeep Sen

CVPR 2025posterarXiv:2503.18406
4
citations

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

Kaihang Pan, Yang Wu, Wendong Bu et al.

NEURIPS 2025posterarXiv:2506.01480
6
citations

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP

Junsung Park, Jungbeom Lee, Jongyoon Song et al.

ICCV 2025posterarXiv:2501.10913
14
citations

Language-Guided Image Tokenization for Generation

Kaiwen Zha, Lijun Yu, Alireza Fathi et al.

CVPR 2025posterarXiv:2412.05796
23
citations

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025posterarXiv:2411.15466
36
citations

LaTexBlend: Scaling Multi-concept Customized Generation with Latent Textual Blending

Jian Jin, Zhenbo Yu, Yang Shen et al.

CVPR 2025highlightarXiv:2503.06956
6
citations

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NEURIPS 2025posterarXiv:2504.00010
6
citations

Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Yihong Luo, Tianyang Hu, Jiacheng Sun et al.

ICCV 2025posterarXiv:2503.06674
11
citations

Less is More: Masking Elements in Image Condition Features Avoids Content Leakages in Style Transfer Diffusion Models

Lin Zhu, Xinbing Wang, Chenghu Zhou et al.

ICLR 2025posterarXiv:2502.07466

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation

Mushui Liu, Yuhang Ma, Zhen Yang et al.

AAAI 2025paperarXiv:2407.00737
33
citations

LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs

Jiarui Wang, Huiyu Duan, Yu Zhao et al.

ICCV 2025highlightarXiv:2504.08358
15
citations

LoRA-X: Bridging Foundation Models with Training-Free Cross-Model Adaptation

Farzad Farhadzadeh, Debasmit Das, Shubhankar Borse et al.

ICLR 2025posterarXiv:2501.16559
6
citations

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Qi Qin, Le Zhuo, Yi Xin et al.

ICCV 2025posterarXiv:2503.21758
55
citations

Make It Count: Text-to-Image Generation with an Accurate Number of Objects

Lital Binyamin, Yoad Tewel, Hilit Segev et al.

CVPR 2025posterarXiv:2406.10210
32
citations

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation

Mingcheng Li, Xiaolu Hou, Ziyang Liu et al.

CVPR 2025posterarXiv:2505.02648
8
citations

Measuring And Improving Engagement of Text-to-Image Generation Models

Varun Khurana, Yaman Singla, Jayakumar Subramanian et al.

ICLR 2025poster
2
citations

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025highlightarXiv:2412.00782
4
citations

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression

Kunjun Li, Zigeng Chen, Cheng-Yen Yang et al.

NEURIPS 2025posterarXiv:2505.19602
6
citations

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Zhaorun Chen, Zichen Wen, Yichao Du et al.

NEURIPS 2025posterarXiv:2407.04842
57
citations

Multi-Group Proportional Representations for Text-to-Image Models

Sangwon Jung, Alex Oesterling, Claudio Mayrink Verdun et al.

CVPR 2025posterarXiv:2505.24023
2
citations

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025posterarXiv:2507.21391
6
citations

Multi-party Collaborative Attention Control for Image Customization

Han Yang, Chuanguang Yang, Qiuli Wang et al.

CVPR 2025posterarXiv:2505.01428
5
citations

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025posterarXiv:2503.10696
17
citations

NL-Eye: Abductive NLI For Images

Mor Ventura, Michael Toker, Nitay Calderon et al.

ICLR 2025posterarXiv:2410.02613
3
citations

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

Yunhong Min, Daehyeon Choi, Kyeongmin Yeo et al.

NEURIPS 2025posterarXiv:2503.22194
1
citations

Parallel Sequence Modeling via Generalized Spatial Propagation Network

Hongjun Wang, Wonmin Byeon, Jiarui Xu et al.

CVPR 2025posterarXiv:2501.12381
3
citations

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang et al.

ICCV 2025posterarXiv:2509.16968

Personalized Preference Fine-tuning of Diffusion Models

Meihua Dang, Anikait Singh, Linqi Zhou et al.

CVPR 2025posterarXiv:2501.06655
14
citations

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity

Kwanyoung Kim, Byeongsu Sim

ICCV 2025posterarXiv:2503.07677
1
citations

PoGDiff: Product-of-Gaussians Diffusion Models for Imbalanced Text-to-Image Generation

Ziyan Wang, Sizhe Wei, Xiaoming Huo et al.

NEURIPS 2025posterarXiv:2502.08106
1
citations

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Yuan Wang, Ouxiang Li, Tingting Mu et al.

CVPR 2025posterarXiv:2412.06143
17
citations

Precise Parameter Localization for Textual Generation in Diffusion Models

Łukasz Staniszewski, Bartosz Cywiński, Franziska Boenisch et al.

ICLR 2025posterarXiv:2502.09935
3
citations

Random Conditioning with Distillation for Data-Efficient Diffusion Model Compression

Dohyun Kim, Sehwan Park, GeonHee Han et al.

CVPR 2025posterarXiv:2504.02011
1
citations

Ranking-based Preference Optimization for Diffusion Models from Implicit User Feedback

Yi-Lun Wu, Bo-Kai Ruan, Chiang Tseng et al.

NEURIPS 2025posterarXiv:2510.18353

RB-Modulation: Training-Free Stylization using Reference-Based Modulation

Litu Rout, Yujia Chen, Nataniel Ruiz et al.

ICLR 2025poster

Rectified CFG++ for Flow Based Models

Shreshth Saini, Shashank Gupta, Alan Bovik

NEURIPS 2025poster

REDUCIO! Generating 1K Video within 16 Seconds using Extremely Compressed Motion Latents

Rui Tian, Qi Dai, Jianmin Bao et al.

ICCV 2025posterarXiv:2411.13552
6
citations