"synthetic data generation" Papers

39 papers found

Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks

Hailong Guo, Bohan Zeng, Yiren Song et al.

ICCV 2025posterarXiv:2501.15891
30
citations

Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation

Tobias Leemann, Periklis Petridis, Giuseppe Vietri et al.

ICLR 2025posterarXiv:2410.03461
3
citations

ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions

Yue Huang, Zhengzhe Jiang, Xiaonan Luo et al.

NeurIPS 2025poster

Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling

Nannan Li, Kevin Shih, Bryan A. Plummer

CVPR 2025posterarXiv:2501.04666
1
citations

Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment

Jiayi Guo, Zhao Junhao, Chaoqun Du et al.

CVPR 2025posterarXiv:2406.04295

GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs

Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev

ICLR 2025poster
5
citations

GRIP: A Graph-Based Reasoning Instruction Producer

Jiankang Wang, Jianjun Xu, Xiaorui Wang et al.

NeurIPS 2025posterarXiv:2412.08864
2
citations

LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization

Zhenpeng Huang, Jiaqi Li, zihan jia et al.

NeurIPS 2025poster

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025posterarXiv:2503.16096
4
citations

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025posterarXiv:2410.08196
27
citations

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Ankit Dhiman, Manan Shah, R. Venkatesh Babu

CVPR 2025posterarXiv:2504.15397
1
citations

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Jaehun Jung, Seungju Han, Ximing Lu et al.

NeurIPS 2025spotlightarXiv:2505.20161
15
citations

Rethinking the Role of Verbatim Memorization in LLM Privacy

Tom Sander, Bargav Jayaraman, Mark Ibrahim et al.

NeurIPS 2025poster

RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation

Tianyi Yan, Wencheng Han, xia zhou et al.

NeurIPS 2025posterarXiv:2509.16500

Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation

Linda He, Jue Wang, Maurice Weber et al.

ICLR 2025posterarXiv:2504.12637
2
citations

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Bin Wu, Wuxuan Shi, Jinqiao Wang et al.

CVPR 2025posterarXiv:2503.04229
13
citations

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

Yibo Wang, Hai-Long Sun, Guangda Huzhang et al.

NeurIPS 2025posterarXiv:2601.08198
4
citations

V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation

Hanyue Lou, Jinxiu Liang, Minggui Teng et al.

NeurIPS 2025oralarXiv:2505.16797
2
citations

Valid Inference with Imperfect Synthetic Data

Yewon Byun, Shantanu Gupta, Zachary Lipton et al.

NeurIPS 2025posterarXiv:2508.06635

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Zi Liang, Qingqing Ye, Xuan Liu et al.

NeurIPS 2025spotlight

3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views

Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng et al.

ECCV 2024posterarXiv:2212.02997
13
citations

CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources

Sikha Pentyala, Mayana Pereira, Martine De Cock

ICML 2024poster

ConSequence: Synthesizing Logically Constrained Sequences for Electronic Health Record Generation

Brandon Theodorou, Shrusti Jain, Cao Xiao et al.

AAAI 2024paperarXiv:2312.05964
2
citations

Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes

Nabeel Seedat, Nicolas Huynh, Boris van Breugel et al.

ICML 2024poster

CuTS: Customizable Tabular Synthetic Data Generation

Mark Vero, Mislav Balunovic, Martin Vechev

ICML 2024poster

Data-to-Model Distillation: Data-Efficient Learning Framework

Ahmad Sajedi, Samir Khaki, Lucy Z. Liu et al.

ECCV 2024posterarXiv:2411.12841
3
citations

Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model

Junghun Cha, Ali Haider, Seoyun Yang et al.

AAAI 2024paperarXiv:2402.05350

Differentially Private Sum-Product Networks

Xenia Heilmann, Mattia Cerrato, Ernst Althaus

ICML 2024poster

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

Chulin Xie, Zinan Lin, Arturs Backurs et al.

ICML 2024spotlight

DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation

Xiaobin Hu, Xu Peng, Donghao Luo et al.

ECCV 2024posterarXiv:2403.06168
13
citations

FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering

Zhenyu Li, Sunqi Fan, Yu Gu et al.

AAAI 2024paperarXiv:2308.12060
122
citations

GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements

Alexander Havrilla, Sharath Chandra Raparthy, Christoforos Nalmpantis et al.

ICML 2024poster

Image Captioning with Multi-Context Synthetic Data

AAAI 2024paperarXiv:2305.18072

Position: Will we run out of data? Limits of LLM scaling based on human-generated data

Pablo Villalobos, Anson Ho, Jaime Sevilla et al.

ICML 2024poster

PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs

Charlie Hou, Akshat Shrivastava, Hongyuan Zhan et al.

ICML 2024poster

Sharpness-Aware Data Generation for Zero-shot Quantization

Hoang Dung, Cuong Pham, Trung Le et al.

ICML 2024poster

Speech Self-Supervised Learning Using Diffusion Model Synthetic Data

Heting Gao, Kaizhi Qian, Junrui Ni et al.

ICML 2024poster

UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues

Vandad Davoodnia, Saeed Ghorbani, Marc-André Carbonneau et al.

ECCV 2024posterarXiv:2404.14634
3
citations

What is Dataset Distillation Learning?

William Yang, Ye Zhu, Zhiwei Deng et al.

ICML 2024poster