"synthetic data generation" Papers
104 papers found
Rethinking the Role of Verbatim Memorization in LLM Privacy
Tom Sander, Bargav Jayaraman, Mark Ibrahim et al.
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan, Wencheng Han, Xia Zhou et al.
RoboTron-Sim: Improving Real-World Driving via Simulated Hard-Case
Baihui Xiao, Chengjian Feng, Zhijian Huang et al.
ROSE: Remove Objects with Side Effects in Videos
Chenxuan Miao, Yutong Feng, Jianshu Zeng et al.
RUAGO: Effective and Practical Retain-Free Unlearning via Adversarial Attack and OOD Generator
SangYong Lee, Sangjun Chung, Simon Woo
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
SimpleStrat: Diversifying Language Model Generation with Stratification
Justin Wong, Yury Orlovskiy, Alexander Shypula et al.
SK-VQA: Synthetic Knowledge Generation at Scale for Training Context-Augmented Multimodal LLMs
Xin Su, Man Luo, Kris Pan et al.
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li, Wendi Sang, Chang Zeng et al.
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Bootstrapping
Pu Yang, Yunzhen Feng, Ziyuan Chen et al.
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning
Peixian Ma, Xialie Zhuang, Chengjin Xu et al.
SynFER: Towards Boosting Facial Expression Recognition with Synthetic Data
Xilin He, Cheng Luo, Xiaole Xian et al.
Synthetic Data is an Elegant GIFT for Continual Vision-Language Models
Bin Wu, Wuxuan Shi, Jinqiao Wang et al.
Synthetic Series-Symbol Data Generation for Time Series Foundation Models
Wenxuan Wang, Kai Wu, Yujian Li et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Task-Specific Zero-shot Quantization-Aware Training for Object Detection
Changhao Li, Xinrui Chen, Ji Wang et al.
ToolACE: Winning the Points of LLM Function Calling
Weiwen Liu, Xu Huang, Xingshan Zeng et al.
Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective
Zeyu Gan, Yong Liu
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
Yibo Wang, Hai-Long Sun, Guangda Huzhang et al.
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
Hanyue Lou, Jinxiu Liang, Minggui Teng et al.
Valid Inference with Imperfect Synthetic Data
Yewon Byun, Shantanu Gupta, Zachary Lipton et al.
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang, Qingqing Ye, Xuan Liu et al.
VT-FSL: Bridging Vision and Text with LLMs for Few-Shot Learning
Wenhao Li, Qiangchang Wang, Xianjing Meng et al.
Zero-Shot Monocular Scene Flow Estimation in the Wild
Yiqing Liang, Abhishek Badki, Hang Su et al.
3DGazeNet: Generalizing Gaze Estimation with Weak Supervision from Synthetic Views
Evangelos Ververas, Polydefkis Gkagkos, Jiankang Deng et al.
AlignDiff: Aligning Diffusion Models for General Few-Shot Segmentation
Ri-Zhao Qiu, Yu-Xiong Wang, Kris Hauser
Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection?
Rosario Leonardi, Antonino Furnari, Francesco Ragusa et al.
CaPS: Collaborative and Private Synthetic Data Generation from Distributed Sources
Sikha Pentyala, Mayana Pereira, Martine De Cock
ConSequence: Synthesizing Logically Constrained Sequences for Electronic Health Record Generation
Brandon Theodorou, Shrusti Jain, Cao Xiao et al.
Curated LLM: Synergy of LLMs and Data Curation for tabular augmentation in low-data regimes
Nabeel Seedat, Nicolas Huynh, Boris van Breugel et al.
CuTS: Customizable Tabular Synthetic Data Generation
Mark Vero, Mislav Balunovic, Martin Vechev
Data-to-Model Distillation: Data-Efficient Learning Framework
Ahmad Sajedi, Samir Khaki, Lucy Z. Liu et al.
Descanning: From Scanned to the Original Images with a Color Correction Diffusion Model
Junghun Cha, Ali Haider, Seoyun Yang et al.
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
Yibo Wang, Ruiyuan Gao, Kai Chen et al.
Differentially Private Sum-Product Networks
Xenia Heilmann, Mattia Cerrato, Ernst Althaus
Differentially Private Synthetic Data via Foundation Model APIs 2: Text
Chulin Xie, Zinan Lin, Arturs Backurs et al.
DiffuMatting: Synthesizing Arbitrary Objects with Matting-level Annotation
Xiaobin Hu, Xu Peng, Donghao Luo et al.
Distill Gold from Massive Ores: Bi-level Data Pruning towards Efficient Dataset Distillation
Yue Xu, Yong-Lu Li, Kaitong Cui et al.
DreamStruct: Understanding Slides and User Interfaces via Synthetic Data Generation
Yi-Hao Peng, Faria Huq, Yue Jiang et al.
EgoGen: An Egocentric Synthetic Data Generator
Gen Li, Kaifeng Zhao, Siwei Zhang et al.
FlexKBQA: A Flexible LLM-Powered Framework for Few-Shot Knowledge Base Question Answering
Zhenyu Li, Sunqi Fan, Yu Gu et al.
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alexander Havrilla, Sharath Chandra Raparthy, Christoforos Nalmpantis et al.
Human Pose Recognition via Occlusion-Preserving Abstract Images
Saad Manzur, Wayne B Hayes
Image Captioning with Multi-Context Synthetic Data
Feipeng Ma, Y. Zhou, Fengyun Rao et al.
PEGASUS: Personalized Generative 3D Avatars with Composable Attributes
Hyunsoo Cha, Byungjun Kim, Hanbyul Joo
Position: Will we run out of data? Limits of LLM scaling based on human-generated data
Pablo Villalobos, Anson Ho, Jaime Sevilla et al.
PrE-Text: Training Language Models on Private Federated Data in the Age of LLMs
Charlie Hou, Akshat Shrivastava, Hongyuan Zhan et al.
Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.
Sharpness-Aware Data Generation for Zero-shot Quantization
Hoang Dung, Cuong Pham, Trung Le et al.