"synthetic data generation" Papers

39 papers found

Filters:synthetic data generation Clear all

Conference

AAAI 2025 (3,028)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NeurIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,140)oral (1,594)spotlight (1,421)highlight (975)

Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks

Hailong Guo, Bohan Zeng, Yiren Song et al.

ICCV 2025posterarXiv:2501.15891

citations

Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation

Tobias Leemann, Periklis Petridis, Giuseppe Vietri et al.

ICLR 2025posterarXiv:2410.03461

citations

ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions

Yue Huang, Zhengzhe Jiang, Xiaonan Luo et al.

NeurIPS 2025poster

Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling

Nannan Li, Kevin Shih, Bryan A. Plummer

CVPR 2025posterarXiv:2501.04666

citations

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jiayi Guo, Zhao Junhao, Chaoqun Du et al.

CVPR 2025posterarXiv:2406.04295

GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs

Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev

ICLR 2025poster

citations

GRIP: A Graph-Based Reasoning Instruction Producer

Jiankang Wang, Jianjun Xu, Xiaorui Wang et al.

NeurIPS 2025posterarXiv:2412.08864

citations

LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization

Zhenpeng Huang, Jiaqi Li, zihan jia et al.

NeurIPS 2025poster

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025posterarXiv:2503.16096

citations

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025posterarXiv:2410.08196

citations

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Ankit Dhiman, Manan Shah, R. Venkatesh Babu

CVPR 2025posterarXiv:2504.15397

citations

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Jaehun Jung, Seungju Han, Ximing Lu et al.

NeurIPS 2025spotlightarXiv:2505.20161

citations

Rethinking the Role of Verbatim Memorization in LLM Privacy

Tom Sander, Bargav Jayaraman, Mark Ibrahim et al.

NeurIPS 2025poster

RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation

Tianyi Yan, Wencheng Han, xia zhou et al.

NeurIPS 2025posterarXiv:2509.16500

Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation

Linda He, Jue Wang, Maurice Weber et al.

ICLR 2025posterarXiv:2504.12637

citations

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Bin Wu, Wuxuan Shi, Jinqiao Wang et al.

CVPR 2025posterarXiv:2503.04229

citations

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

Yibo Wang, Hai-Long Sun, Guangda Huzhang et al.

NeurIPS 2025posterarXiv:2601.08198

citations

V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Hanyue Lou, Jinxiu Liang, Minggui Teng et al.

NeurIPS 2025oralarXiv:2505.16797

citations

Valid Inference with Imperfect Synthetic Data

Yewon Byun, Shantanu Gupta, Zachary Lipton et al.

NeurIPS 2025posterarXiv:2508.06635

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Zi Liang, Qingqing Ye, Xuan Liu et al.

NeurIPS 2025spotlight

3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views

Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng et al.

ECCV 2024posterarXiv:2212.02997

citations

CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources

Sikha Pentyala, Mayana Pereira, Martine De Cock

ICML 2024poster

ConSequence: Synthesizing Logically Constrained Sequences for Electronic Health Record Generation

Brandon Theodorou, Shrusti Jain, Cao Xiao et al.

AAAI 2024paperarXiv:2312.05964

citations

Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes

Nabeel Seedat, Nicolas Huynh, Boris van Breugel et al.

ICML 2024poster

CuTS: Customizable Tabular Synthetic Data Generation

Mark Vero, Mislav Balunovic, Martin Vechev

ICML 2024poster

Data-to-Model Distillation: Data-Efficient Learning Framework

Ahmad Sajedi, Samir Khaki, Lucy Z. Liu et al.

ECCV 2024posterarXiv:2411.12841

citations

Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model

Junghun Cha, Ali Haider, Seoyun Yang et al.

AAAI 2024paperarXiv:2402.05350

Differentially Private Sum-Product Networks

Xenia Heilmann, Mattia Cerrato, Ernst Althaus

ICML 2024poster

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

Chulin Xie, Zinan Lin, Arturs Backurs et al.

ICML 2024spotlight

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation

Xiaobin Hu, Xu Peng, Donghao Luo et al.

ECCV 2024posterarXiv:2403.06168

citations

FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Zhenyu Li, Sunqi Fan, Yu Gu et al.

AAAI 2024paperarXiv:2308.12060

122

citations

GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

Alexander Havrilla, Sharath Chandra Raparthy, Christoforos Nalmpantis et al.

ICML 2024poster

Image Captioning with Multi-Context Synthetic Data

AAAI 2024paperarXiv:2305.18072

Position: Will we run out of data? Limits of LLM scaling based on human-generated data

Pablo Villalobos, Anson Ho, Jaime Sevilla et al.

ICML 2024poster

PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs

Charlie Hou, Akshat Shrivastava, Hongyuan Zhan et al.

ICML 2024poster

Sharpness-Aware Data Generation for Zero-shot Quantization

Hoang Dung, Cuong Pham, Trung Le et al.

ICML 2024poster

Speech Self-Supervised Learning Using Diffusion Model Synthetic Data

Heting Gao, Kaizhi Qian, Junrui Ni et al.

ICML 2024poster

UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues

Vandad Davoodnia, Saeed Ghorbani, Marc-André Carbonneau et al.

ECCV 2024posterarXiv:2404.14634

citations

What is Dataset Distillation Learning?

William Yang, Ye Zhu, Zhiwei Deng et al.

ICML 2024poster