Generative Modeling

CVPR 2024, arXiv:2312.13286

#2

A Generalist Agent

Jackie Kay, Sergio Gómez Colmenarejo, Mahyar Bordbar et al.

Generative Multimodal Models are In-Context Learners

Quan Sun, Yufeng Cui, Xiaosong Zhang et al.

422

ICLR 2024, arXiv:2310.01801

#4

Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

Suyu Ge, Yunan Zhang, Liyuan Liu et al.

372

ICLR 2025, arXiv:2402.09906

#5

Generative Verifiers: Reward Modeling as Next-Token Prediction

Lunjun Zhang, Arian Hosseini, Hritik Bansal et al.

Generative Representational Instruction Tuning

Niklas Muennighoff, Hongjin SU, Liang Wang et al.

instruction tuning, text embedding, generative tasks, retrieval-augmented generation, +3 more

212

CVPR 2024, arXiv:2311.16099

#7

GART: Gaussian Articulated Template Models

Jiahui Lei, Yufu Wang, Georgios Pavlakos et al.

129

ICLR 2024, arXiv:2310.01361

#8

GenSim: Generating Robotic Simulation Tasks via Large Language Models

Lirui Wang, Yiyang Ling, Zhecheng Yuan et al.

120

CVPR 2024, arXiv:2309.07906

#9

Generative Image Dynamics

Zhengqi Li, Richard Tucker, Noah Snavely et al.

93

CVPR 2024, arXiv:2401.00374

#10

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Yuang Peng, Yuxin Cui, Haomiao Tang et al.

EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling

Haiyang Liu, Zihao Zhu, Giorgio Becherini et al.

78

ICLR 2025, arXiv:2409.16211

#12

Improving Diffusion Models for Authentic Virtual Try-on in the Wild

Choi Yisol, Sangkyung Kwak, Kyungmin Lee et al.

MaskBit: Embedding-free Image Generation via Bit Tokens

Mark Weber, Lijun Yu, Qihang Yu et al.

72

ICML 2025, arXiv:2410.05363

#14

Towards World Simulator: Crafting Physical Commonsense-Based Benchmark for Video Generation

Fanqing Meng, Jiaqi Liao, Xinyu Tan et al.

72

CVPR 2024, arXiv:2405.12979

#15

OmniGlue: Generalizable Feature Matching with Foundation Model Guidance

Hanwen Jiang, Arjun Karpur, Bingyi Cao et al.

68

ECCV 2024, arXiv:2312.02116

#16

GIVT: Generative Infinite-Vocabulary Transformers

Michael Tschannen, Cian Eastwood, Fabian Mentzer

generative transformers, vector sequence generation, multivariate gaussian mixture models, latent diffusion models, +4 more

63

ICLR 2025, arXiv:2406.09961

#17

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

Cheng Yang, Chufan Shi, Yaxin Liu et al.

chart-to-code generation, multimodal reasoning evaluation, visual understanding, code generation, +4 more

63

ECCV 2024, arXiv:2404.11593

#18

Proteina: Scaling Flow-based Protein Structure Generative Models

Tomas Geffner, Kieran Didi, Zuobai Zhang et al.

IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Xi Chen, Sida Peng, Dongchen Yang et al.

inverse rendering, material recovery, diffusion priors, unknown illumination, +4 more

52

ICLR 2024, arXiv:2309.16779

#20

AvatarGPT: All-in-One Framework for Motion Understanding Planning Generation and Beyond

Zixiang Zhou, Yu Wan, Baoyuan Wang

Intriguing Properties of Generative Classifiers

Priyank Jaini, Kevin Clark, Robert Geirhos

51

ICLR 2025, arXiv:2403.14404

#22

Physics-Informed Diffusion Models

Jan-Hendrik Bastek, WaiChing Sun, Dennis Kochmann

diffusion models, physics-informed learning, partial differential equations, generative modeling, +4 more

50

ICLR 2024, arXiv:2306.01776

#23

From Zero to Turbulence: Generative Modeling for 3D Flow Simulation

Marten Lienen, David Lüdke, Jan Hansen-Palmus et al.

49

ICLR 2024, arXiv:2310.02710

#24

DGInStyle: Domain-Generalizable Semantic Segmentation with Image Diffusion Models and Stylized Semantic Control

Yuru Jia, Lukas Hoyer, Shengyu Huang et al.

Local Search GFlowNets

Minsu Kim, Yun Taeyoung, Emmanuel Bengio et al.

48

ICLR 2025, arXiv:2410.21357

#26

Energy-Based Diffusion Language Models for Text Generation

Minkai Xu, Tomas Geffner, Karsten Kreis et al.

48

ICCV 2025, arXiv:2412.18607

#27

GAIA: Zero-shot Talking Avatar Generation

Tianyu He, Junliang Guo, Runyi Yu et al.

DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers

Yuntao Chen, Yuqi Wang, Zhaoxiang Zhang

44

CVPR 2024, arXiv:2306.09337

#29

Generative Proxemics: A Prior for 3D Social Interaction from Images

Lea Müller, Vickie Ye, Georgios Pavlakos et al.

41

ICLR 2025, arXiv:2410.20587

#30

Generator Matching: Generative modeling with arbitrary Markov processes

Peter Holderrieth, Marton Havasi, Jason Yim et al.

generative modeling, markov processes, diffusion models, flow matching, +4 more

40

CVPR 2024, arXiv:2403.13304

#31

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

Yibo Wang, Ruiyuan Gao, Kai Chen et al.

39

ICLR 2025, arXiv:2410.00752

#32

TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark

Kush Jain, Gabriel Synnaeve, Baptiste Roziere

39

CVPR 2024, arXiv:2403.04272

#33

UGG: Unified Generative Grasping

Jiaxin Lu, Hao Kang, Haoxiang Li et al.

Active Generalized Category Discovery

Shijie Ma, Fei Zhu, Zhun Zhong et al.

34

AAAI 2024, arXiv:2402.07225

#35

Rethinking Generalizable Face Anti-spoofing via Hierarchical Prototype-guided Distribution Refinement in Hyperbolic Space

Chengyang Hu, Ke-Yue Zhang, Taiping Yao et al.

Rethinking Graph Masked Autoencoders through Alignment and Uniformity

Liang Wang, Xiang Tao, Qiang Liu et al.

graph masked autoencoders, graph contrastive learning, self-supervised learning, alignment and uniformity, +3 more

32

ICLR 2024, arXiv:2309.15564

#37

Jointly Training Large Autoregressive Multimodal Models

Emanuele Aiello, Lili Yu, Yixin Nie et al.

32

ECCV 2024, arXiv:2404.00995

#38

PosterLlama: Bridging Design Ability of Langauge Model to Content-Aware Layout Generation

Jaejung Seol, Seojun Kim, Jaejun Yoo

layout generation, content-aware design, html code generation, language model knowledge, +4 more

30

ECCV 2024, arXiv:2407.10387

#39

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity

Santiago Pascual, Chunghsin YEH, Ioannis Tsiamas et al.

video-to-audio generation, audio-visual synchronization, generative audio codec, masked generative model, +2 more

30

ICLR 2025, arXiv:2410.03355

#40

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding

Doohyuk Jang, Sihwan Park, June Yong Yang et al.

29

AAAI 2024, arXiv:2312.12236

#41

OmniPhysGS: 3D Constitutive Gaussians for General Physics-Based Dynamics Generation

Yuchen Lin, Chenguo Lin, Jianjin Xu et al.

Generalization Analysis of Machine Learning Algorithms via the Worst-Case Data-Generating Probability Measure

Xinying Zou, Samir Perlaza, Inaki Esnaola et al.

worst-case probability measure, generalization gap analysis, gibbs probability measure, expected loss sensitivity, +4 more

26

CVPR 2024, arXiv:2312.01409

#43

Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models

Shengqu Cai, Duygu Ceylan, Matheus Gadelha et al.

26

ICLR 2024, arXiv:2312.11529

#44

Efficient and Scalable Graph Generation through Iterative Local Expansion

Andreas Bergmeister, Karolis Martinkus, Nathanaël Perraudin et al.

26

CVPR 2024, arXiv:2401.08739

#45

EgoGen: An Egocentric Synthetic Data Generator

Gen Li, Kaifeng Zhao, Siwei Zhang et al.

ICLR 2024, arXiv:2405.09901

#46

Whole-Song Hierarchical Generation of Symbolic Music Using Cascaded Diffusion Models

Ziyu Wang, Lejun Min, Gus Xia

CVPR 2024, arXiv:2312.07063

#47

Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation

Xianghui Xie, Bharat Lal Bhatnagar, Jan Lenssen et al.

AAAI 2024, arXiv:2302.02070

#48

Semantic-Guided Generative Image Augmentation Method with Diffusion Models for Image Classification

Bohan Li, Xiao Xu, Xinghao Wang et al.

image augmentation, diffusion models, semantic consistency, image classification, +2 more

AAAI 2024, arXiv:2303.11048

#49

JetFormer: An autoregressive generative model of raw images and text

Michael Tschannen, André Susano Pinto, Alexander Kolesnikov

SGFormer: Semantic Graph Transformer for Point Cloud-Based 3D Scene Graph Generation

Changsheng Lv, Mengshi Qi, Xia Li et al.

3d scene graph generation, point cloud parsing, semantic graph transformer, global information passing, +4 more

23

CVPR 2024, arXiv:2404.03242

#51

Would Deep Generative Models Amplify Bias in Future Models?

Tianwei Chen, Yusuke Hirota, Mayu Otani et al.

23

dyadic interaction modeling, social behavior generation, 3d facial motion, contrastive learning, +4 more

#52

DIM: Dyadic Interaction Modeling for Social Behavior Generation

Minh Tran, Di Chang, Maksim Siniukov et al.

ECCV 2024

22

AAAI 2024, arXiv:2312.15665

#53

A Multi-Modal Contrastive Diffusion Model for Therapeutic Peptide Generation

Yongkang Wang, Xuan Liu, Feng Huang et al.

therapeutic peptide generation, multi-modal fusion, contrastive learning, diffusion models, +3 more

22

AAAI 2025, arXiv:2407.02252

#54

GlyphDraw2: Automatic Generation of Complex Glyph Posters with Diffusion Models and Large Language Models

Jian Ma, Yonglin Deng, Chen Chen et al.

22

ECCV 2024, arXiv:2403.17933

#55

SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic

Kashyap Chitta, Daniel Dauner, Andreas Geiger

driving environment synthesis, vehicle motion planning, generative simulation, lane graph representation, +4 more

21

ICLR 2025, arXiv:2406.18966

#56

DataGen: Unified Synthetic Dataset Generation via Large Language Models

Yue Huang, Siyuan Wu, Chujie Gao et al.

synthetic data generation, large language models, retrieval-augmented generation, label verification, +4 more

21

ICLR 2024, arXiv:2307.13883

#57

Generative Video Propagation

Shaoteng Liu, Tianyu Wang, Jui-Hsien Wang et al.

Towards Open Domain Text-Driven Synthesis of Multi-Person Motions

Shan Mengyi, Lu Dong, Yutao Han et al.

ExeDec: Execution Decomposition for Compositional Generalization in Neural Program Synthesis

Kensen Shi, Joey Hong, Yinlin Deng et al.

20

ICLR 2024, arXiv:2311.00136

#60

On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity

Quentin Bertrand, Anne Gagneux, Mathurin Massias et al.

Distinguished In Uniform: Self-Attention Vs. Virtual Nodes

Eran Rosenbluth, Jan Tönshoff, Martin Ritzert et al.

To Grok or not to Grok: Disentangling Generalization and Memorization on Corrupted Algorithmic Datasets

Darshil Doshi, Aritra Das, Tianyu He et al.

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.

Neuroformer: Multimodal and Multitask Generative Pretraining for Brain Data

Antonis Antoniades, Yiyi Yu, Joe Canzano et al.

19

AAAI 2024, arXiv:2401.06521

#65

Exploring Diverse Representations for Open Set Recognition

Yu Wang, Junxian Mu, Pengfei Zhu et al.

open set recognition, attention diversity regularization, multi-expert fusion, discriminative models, +4 more

18

ICLR 2025, arXiv:2405.01155

#66

GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation

Hongyin Zhang, Pengxiang Ding, Shangke Lyu et al.

SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints

Miruna Cretu, Charles Harris, Ilia Igashov et al.

generative models, computer-aided drug design, synthetic accessibility, chemical reaction space, +4 more

17

ICLR 2025, arXiv:2410.23054

#68

Controlling Language and Diffusion Models by Transporting Activations

Pau Rodriguez, Arno Blaas, Michal Klein et al.

17

CVPR 2024, arXiv:2403.05239

#69

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation

Junyan Wang, Zhenhong Sun, Stewart Tan et al.

17

ICCV 2025, arXiv:2503.10696

#70

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

16

ICLR 2025, arXiv:2410.04814

#71

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

Victor Dheur, Matteo Fontana, Yorick Estievenart et al.

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

dynamical systems reconstruction, hierarchical modeling, time series analysis, multi-domain learning, +4 more

16

NeurIPS 2025, arXiv:2505.12335

#73

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

ai-generated image detection, generative model robustness, deepfake detection, image forensics, +3 more

15

NeurIPS 2025, arXiv:2502.15676

#74

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.

Multimarginal Generative Modeling with Stochastic Interpolants

Michael Albergo, Nicholas Boffi, Michael Lindsey et al.

AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.

15

ICLR 2024, arXiv:2210.01603

#77

Neural-Symbolic Recursive Machine for Systematic Generalization

Qing Li, Yixin Zhu, Yitao Liang et al.

autoregressive 3d generation, part-based generation, 3d part discovery, compositional 3d reconstruction, +3 more

#78

AutoPartGen: Autoregressive 3D Part Generation and Discovery

Minghao Chen, Jianyuan Wang, Roman Shapovalov et al.

NeurIPS 2025

ICLR 2025, arXiv:2410.04542

#79

Generative Flows on Synthetic Pathway for Drug Design

Seonghwan Seo, Minsu Kim, Tony Shen et al.

generative flow networks, drug design, molecular building blocks, chemical reaction templates, +4 more

AAAI 2025, arXiv:2412.20916

#80

Low-Light Image Enhancement via Generative Perceptual Priors

Han Zhou, Wei Dong, Xiaohong Liu et al.

ICLR 2025, arXiv:2412.03881

#81

Weak-to-Strong Generalization Through the Data-Centric Lens

Changho Shin, John Cooper, Frederic Sala

weak-to-strong generalization, overlap density, data-centric mechanism, superalignment, +2 more

AAAI 2024, arXiv:2305.15769

#82

MERGE: Fast Private Text Generation

Zi Liang, Pinghui Wang, Ruofei Zhang et al.

private inference, transformer-based models, natural language generation, cloud model deployment, +4 more

ECCV 2024, arXiv:2406.04551

#83

Improving Geo-diversity of Generated Images with Contextualized Vendi Score Guidance

Reyhane Askari Hemmat, Melissa Hall, Alicia Yi Sun et al.

text-to-image generation, latent diffusion models, geo-diversity bias, inference time intervention, +4 more

ICLR 2025, arXiv:2410.01720

#84

A Periodic Bayesian Flow for Material Generation

Hanlin Wu, Yuxuan Song, Jingjing Gong et al.

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Zeyu Gan, Yong Liu

synthetic data generation, large language models, post-training optimization, generalization capability, +3 more

ICML 2025, arXiv:2504.15266

#86

MaxFusion: Plug&Play Multi-Modal Generation in Text-to-Image Diffusion Models

Nithin Gopalakrishnan Nair, Jeya Maria Jose Valanarasu, Vishal Patel

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.

ICLR 2025, arXiv:2406.14302

#88

Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

Patrik Reizinger, Siyuan Guo, Ferenc Huszar et al.

ICLR 2025, arXiv:2410.11236

#89

Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals

Nate Gillman, Charles Herrmann, Michael Freeman et al.

Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Guiyu Zhang, Huan-ang Gao, Zijian Jiang et al.

deepfake detection, generalizable detection, real appearance modeling, face disturbance, +2 more

#91

Real Appearance Modeling for More General Deepfake Detection

Jiahe Tian, Yu Cai, Xi Wang et al.

ECCV 2024

ECCV 2024, arXiv:2407.14709

#92

∞-Brush: Controllable Large Image Synthesis with Diffusion Models in Infinite Dimensions

Minh Quan Le, Alexandros Graikos, Srikar Yellapragada et al.

diffusion models, infinite dimensions, large image synthesis, cross-attention neural operator, +4 more

ICML 2025, arXiv:2503.01103

#93

Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution

Fengyuan Liu, Haochen Luo, Yiming Li et al.

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Kaiwen Zheng, Yongxin Chen, Huayu Chen et al.

ICLR 2025, arXiv:2512.25034

#95

Generative Classifiers Avoid Shortcut Solutions

Alexander Li, Ananya Kumar, Deepak Pathak

ICML 2025, arXiv:2412.14689

#96

Unbounded: A Generative Infinite Game of Character Life Simulation

Jialu Li, Yuanzhen Li, Neal Wadhwa et al.

How to Synthesize Text Data without Model Collapse?

Xuekai Zhu, Daixuan Cheng, Hengli Li et al.

ICCV 2025, arXiv:2503.10406

#98

RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models

Yijing Lin, Mengqi Huang, Shuhan Zhuang et al.

temporal in-context learning, conditional frame prediction, unified visual generation, video models, +4 more

AAAI 2024, arXiv:2401.02602

#99

Neural Causal Abstractions

Kevin Xia, Elias Bareinboim

causal abstractions theory, causal inference tasks, neural causal models, representation learning, +4 more

ICLR 2025, arXiv:2503.00045

#100

Glad: A Streaming Scene Generator for Autonomous Driving

Bin Xie, Yingfei Liu, Tiancai Wang et al.

11