🧬Generative Models

Image Synthesis

General image generation and synthesis techniques

100 papers; 3,115 total citations


Feb '24 to Jan '26: 451 papers

Top Conferences

CVPR: 45, ECCV: 16, ICLR: 13, AAAI: 12, ICCV: 8, NeurIPS: 6

Top Papers

#1

OmniGen: Unified Image Generation

Shitao Xiao, Yueze Wang, Junjie Zhou et al.

Emu Edit: Precise Image Editing via Recognition and Generation Tasks

Shelly Sheynin, Adam Polyak, Uriel Singer et al.

GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis

Shunyuan Zheng, Boyao Zhou, Ruizhi Shao et al.

Grounded Text-to-Image Synthesis with Attention Refocusing

Quynh Phung, Songwei Ge, Jia-Bin Huang

MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

Dewei Zhou, You Li, Fan Ma et al.

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Shaowei Liu, Zhongzheng Ren, Saurabh Gupta et al.

ECCV 2024, arXiv:2409.18964

image-to-video generation, rigid-body physics, physics-grounded generation, image-space dynamics, +4

104 citations

#7

Generative Image Dynamics

Zhengqi Li, Richard Tucker, Noah Snavely et al.

Infinigen Indoors: Photorealistic Indoor Scenes using Procedural Generation

Alexander Raistrick, Lingjie Mei, Karhan Kayan et al.

Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering

Kim Youwang, Tae-Hyun Oh, Gerard Pons-Moll

MaskBit: Embedding-free Image Generation via Bit Tokens

Mark Weber, Lijun Yu, Qihang Yu et al.

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Evonne Ng, Javier Romero, Timur Bagautdinov et al.

MV-Adapter: Multi-View Consistent Image Generation Made Easy

Zehuan Huang, Yuan-Chen Guo, Haoran Wang et al.

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

Xinjie Zhang, Xingtong Ge, Tongda Xu et al.

Coarse-to-Fine Latent Diffusion for Pose-Guided Person Image Synthesis

Yanzuo Lu, Manlin Zhang, Jinhua Ma et al.

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Xi Chen, Sida Peng, Dongchen Yang et al.

ECCV 2024, arXiv:2404.11593

inverse rendering, material recovery, diffusion priors, unknown illumination, +4

54 citations

#16

Lumina-Image 2.0: A Unified and Efficient Image Generative Framework

Qi Qin, Le Zhuo, Yi Xin et al.

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

Linjiang Huang, Rongyao Fang, Aiping Zhang et al.

ECCV 2024, arXiv:2403.12963

high-resolution image synthesis, diffusion models, frequency domain analysis, training-free generation, +4

51 citations

#18

GAIA: Zero-shot Talking Avatar Generation

Tianyu He, Junliang Guo, Runyi Yu et al.

Image Conductor: Precision Control for Interactive Video Synthesis

Yaowei Li, Xintao Wang, Zhaoyang Zhang et al.

SceneTex: High-Quality Texture Synthesis for Indoor Scenes via Diffusion Priors

Dave Zhenyu Chen, Haoxuan Li, Hsin-Ying Lee et al.

Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation

Clément Chadebec, Onur Tasar, Eyal Benaroche et al.

ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation

Moayed Haji Ali, Guha Balakrishnan, Vicente Ordonez

SimAC: A Simple Anti-Customization Method for Protecting Face Privacy against Text-to-Image Synthesis of Diffusion Models

Feifei Wang, Zhentao Tan, Tianyi Wei et al.

FreeVS: Generative View Synthesis on Free Driving Trajectory

Qitai Wang, Lue Fan, Yuqi Wang et al.

High-fidelity Person-centric Subject-to-Image Synthesis

Yibin Wang, Weizhong Zhang, Jianwei Zheng et al.

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis

Zanlin Ni, Yulin Wang, Renping Zhou et al.

DiffuseHigh: Training-Free Progressive High-Resolution Image Synthesis Through Structure Guidance

Younghyun Kim, Geunmin Hwang, Junyu Zhang et al.

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-Form Layout-to-Image Generation

Chengyou Jia, Minnan Luo, Zhuohang Dang et al.

DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis

Yuming Gu, Hongyi Xu, You Xie et al.

T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation

Lijun Li, Zhelun Shi, Xuhao Hu et al.

MagicQuill: An Intelligent Interactive Image Editing System

Zichen Liu, Yue Yu, Hao Ouyang et al.

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Bowen Chen, Brynn Zhao, Haomiao Sun et al.

Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

Bohan Li, Xiao Xu, Xinghao Wang et al.

AAAI 2024, arXiv:2302.02070

image augmentation, diffusion models, semantic consistency, image classification, +2

24 citations

#34

PLACE: Adaptive Layout-Semantic Fusion for Semantic Image Synthesis

Zhengyao Lv, Yuxiang Wei, Wangmeng Zuo et al.

ControlNet-XS: Rethinking the Control of Text-to-Image Diffusion Models as Feedback-Control Systems

Denis Zavadski, Johann-Friedrich Feiden, Carsten Rother

Material Anything: Generating Materials for Any 3D Object via Diffusion

Xin Huang, Tengfei Wang, Ziwei Liu et al.

StegoGAN: Leveraging Steganography for Non-Bijective Image-to-Image Translation

Sidi Wu, Yizi Chen, Loic Landrieu et al.

IMPUS: Image Morphing with Perceptually-Uniform Sampling Using Diffusion Models

Zhaoyuan Yang, Zhengyang Yu, Zhiwei Xu et al.

GI-GS: Global Illumination Decomposition on Gaussian Splatting for Inverse Rendering

Hongze Chen, Zehong Lin, Jun Zhang

ICLR 2025, arXiv:2410.02619

inverse rendering, 3d gaussian splatting, global illumination, deferred shading, +4

21 citations

#40

EditShield: Protecting Unauthorized Image Editing by Instruction-guided Diffusion Models

Ruoxi Chen, Haibo Jin, Yixin Liu et al.

ECCV 2024, arXiv:2311.12066

instruction-guided diffusion models, unauthorized image manipulation, image editing protection, latent representation perturbation, +3

20 citations

#41

ART: Anonymous Region Transformer for Variable Multi-Layer Transparent Image Generation

Yifan Pu, Yiming Zhao, Zhicong Tang et al.

Generative Image Layer Decomposition with Visual Effects

Jinrui Yang, Qing Liu, Yijun Li et al.

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion

Chaoyang Wang, Peiye Zhuang, Tuan Duc Ngo et al.

SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis

Teng Hu, Ran Yi, Baihong Qian et al.

You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs

Yihong Luo, Xiaolong Chen, Xinghua Qu et al.

Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping

Zijie Pan, Jiachen Lu, Xiatian Zhu et al.

Condition-Aware Neural Network for Controlled Image Generation

Han Cai, Muyang Li, Qinsheng Zhang et al.

DreamOmni: Unified Image Generation and Editing

Bin Xia, Yuechen Zhang, Jingyao Li et al.

Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation

Xiyi Chen, Marko Mihajlovic, Shaofei Wang et al.

ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis

Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang et al.

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

One-Shot Structure-Aware Stylized Image Synthesis

Hansam Cho, Jonghyun Lee, Seunggyu Chang et al.

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering

Zhihao Huang, Xi Qiu, Yukuo Ma et al.

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Nupur Kumari, Xi Yin, Jun-Yan Zhu et al.

Low-Light Image Enhancement via Generative Perceptual Priors

Han Zhou, Wei Dong, Xiaohong Liu et al.

HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

Guian Fang, Wenbiao Yan, Yuanfan Guo et al.

ECCV 2024, arXiv:2407.06937

text-to-image diffusion, human anomaly generation, anatomical anomaly detection, pose-reversible guidance, +3

14 citations

#57

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.

CVPR 2025, arXiv:2503.18286

synthetic image detection, generative ai detection, semantic feature enhancement, artifact feature analysis, +4

14 citations

#58

InstructGIE: Towards Generalizable Image Editing

Zichong Meng, Changdi Yang, Jun Liu et al.

ECCV 2024, arXiv:2403.05018

image editing, denoising diffusion models, in-context learning, language instruction, +4

13 citations

#59

Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Guiyu Zhang, Huan-ang Gao, Zijian Jiang et al.

MagicEraser: Erasing Any Objects via Semantics-Aware Control

Fan Li, Zixiao Zhang, Yi Huang et al.

Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis

Yu Yuan, Xijun Wang, Yichen Sheng et al.

CVPR 2025, arXiv:2412.02168

text-to-image synthesis, camera control, scene consistency, dimensionality lifting, +3

13 citations

#62

Editable Image Elements for Controllable Synthesis

Jiteng Mu, Michael Gharbi, Richard Zhang et al.

Image Generation Diversity Issues and How to Tame Them

Mischa Dombrowski, Weitong Zhang, Hadrien Reynaud et al.

∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions

Minh Quan Le, Alexandros Graikos, Srikar Yellapragada et al.

FreeCompose: Generic Zero-Shot Image Composition with Diffusion Prior

Zhekai Chen, Wen Wang, Zhen Yang et al.

Yuan: Yielding Unblemished Aesthetics Through a Unified Network for Visual Imperfections Removal in Generated Images

Zhenyu Yu, Chee Seng Chan

Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing

Ruiyi Wang, Yushuo Zheng, Zicheng Zhang et al.

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Image Editing

Jing Gu, Nanxuan Zhao, Wei Xiong et al.

Hidden in the Noise: Two-Stage Robust Watermarking for Images

Kasra Arabi, Benjamin Feuer, R. Teal Witter et al.

SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning

Zhewei Dai, Shilei Zeng, Haotian Liu et al.

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

Jing He, Haodong Li, huyongzhe et al.

Training-free Composite Scene Generation for Layout-to-Image Synthesis

Jiaqi Liu, Tao Huang, Chang Xu

InsightEdit: Towards Better Instruction Following for Image Editing

Yingjing Xu, Jie Kong, Jiazhi Wang et al.

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

Qihan Huang, Weilong Dai, Jinlong Liu et al.

CVPR 2025, arXiv:2412.03177

personalized image generation, direct preference optimization, patch-level optimization, finetuning-free generation, +3

10 citations

#75

LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

Hantao Zhang, Yuhe Liu, Jiancheng Yang et al.

DeepCalliFont: Few-Shot Chinese Calligraphy Font Synthesis by Integrating Dual-Modality Generative Models

Yitian Liu, Zhouhui Lian

AAAI 2024, arXiv:2312.10314

few-shot font generation, chinese calligraphy synthesis, dual-modality generative models, glyph image synthesis, +4

9 citations

#77

GAS: Generative Avatar Synthesis from a Single Image

Yixing Lu, Junting Dong, YoungJoong Kwon et al.

LITA-GS: Illumination-Agnostic Novel View Synthesis via Reference-Free 3D Gaussian Splatting and Physical Priors

Han Zhou, Wei Dong, Jun Chen

Layered Image Vectorization via Semantic Simplification

Zhenyu Wang, Jianxi Huang, Zhida Sun et al.

ScribbleLight: Single Image Indoor Relighting with Scribbles

Jun Myeong Choi, Annie N. Wang, Pieter Peers et al.

Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering

Yibo Zhang, Lihong Wang, Changqing Zou et al.

ICLR 2025, arXiv:2405.15305

differentiable rendering, 3d parametric curves, view-consistent 3d sketch, rational bézier curves, +4

9 citations

#82

Learning Subject-Aware Cropping by Outpainting Professional Photos

James Hong, Lu Yuan, Michaël Gharbi et al.

AAAI 2024, arXiv:2312.12080

subject-aware image cropping, weakly-supervised learning, diffusion models, image outpainting, +2

9 citations

#83

GenDeg: Diffusion-based Degradation Synthesis for Generalizable All-In-One Image Restoration

Sudarshan Rajagopalan, Nithin Gopalakrishnan Nair, Jay Paranjape et al.

ProCreate, Don't Reproduce! Propulsive Energy Diffusion for Creative Generation

Jack Lu, Ryan Teehan, Mengye Ren

Diffusion-based Synthetic Data Generation for Visible-Infrared Person Re-Identification

Wenbo Dai, Lijing Lu, Zhihang Li

PolyOculus: Simultaneous Multi-view Image-based Novel View Synthesis

Jason Yu, Tristan Aumentado-Armstrong, Fereshteh Forghani et al.

From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes

Long Ma, Zhiyuan Yan, Jin Xu et al.

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

You Li, Fan Ma, Yi Yang

Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation

Akshay Krishnan, Xinchen Yan, Vincent Casser et al.

SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent

Yandan Yang, Baoxiong Jia, Shujie Zhang et al.

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

Automatic Controllable Colorization via Imagination

Xiaoyan Cong, Yue Wu, Qifeng Chen et al.

Boost Your Human Image Generation Model via Direct Preference Optimization

Sanghyeon Na, Yonggyu Kim, Hyunjoon Lee

CVPR 2025, arXiv:2405.20216

human image generation, direct preference optimization, text-to-image synthesis, personalized image generation, +3

8 citations

#94

Spectrum Translation for Refinement of Image Generation (STIG) Based on Contrastive Learning and Spectral Filter Profile

Seokjun Lee, Seung-Won Jung, Hyunseok Seo

AAAI 2024, arXiv:2403.05093

spectrum translation, frequency domain discrepancy, generative adversarial networks, diffusion models, +4

8 citations

#95

Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation

Xie Tianyidan, Rui Ma, Qian Wang et al.

HQGS: High-Quality Novel View Synthesis with Gaussian Splatting in Degraded Scenes

Xin Lin, Shi Luo, Xiaojun Shan et al.

Turbo2K: Towards Ultra-Efficient and High-Quality 2K Video Synthesis

Jingjing Ren, Wenbo Li, Zhongdao Wang et al.

UltraHR-100K: Enhancing UHR Image Synthesis with A Large-Scale High-Quality Dataset

Chen Zhao, En Ci, Yunzhe Xu et al.

StreamGS: Online Generalizable Gaussian Splatting Reconstruction for Unposed Image Streams

Yang Li, Jinglu Wang, Lei Chu et al.

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Yuze He, Yanning Zhou, Wang Zhao et al.

CVPR 2025

7 citations
