2025 "synthetic data generation" Papers
26 papers found
Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks
Hailong Guo, Bohan Zeng, Yiren Song et al.
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation
Tobias Leemann, Periklis Petridis, Giuseppe Vietri et al.
ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions
Yue Huang, Zhengzhe Jiang, Xiaonan Luo et al.
DataGen: Unified Synthetic Dataset Generation via Large Language Models
Yue Huang, Siyuan Wu, Chujie Gao et al.
ECHOPulse: ECG Controlled Echocardio-gram Video Generation
Yiwei Li, Sekeun Kim, Zihao Wu et al.
Enhancing Dataset Distillation via Non-Critical Region Refinement
Minh-Tuan Tran, Trung Le, Xuan-May Le et al.
Enhancing Virtual Try-On with Synthetic Pairs and Error-Aware Noise Scheduling
Nannan Li, Kevin Shih, Bryan A. Plummer
Everything to the Synthetic: Diffusion-driven Test-time Adaptation via Synthetic-Domain Alignment
Jiayi Guo, Zhao Junhao, Chaoqun Du et al.
Federated Class-Incremental Learning: A Hybrid Approach Using Latent Exemplars and Data-Free Techniques to Address Local and Global Forgetting
Milad Khademi Nori, IL-MIN KIM, Guanghui Wang
GLoRa: A Benchmark to Evaluate the Ability to Learn Long-Range Dependencies in Graphs
Dongzhuoran Zhou, Evgeny Kharlamov, Egor Kostylev
GRIP: A Graph-Based Reasoning Instruction Producer
Jiankang Wang, Jianjun Xu, Xiaorui Wang et al.
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization
Zhenpeng Huang, Jiaqi Li, zihan jia et al.
MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures
Lucas Morin, Valery Weber, Ahmed Nassar et al.
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Zimu Lu, Aojun Zhou, Ke Wang et al.
MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World
Ankit Dhiman, Manan Shah, R. Venkatesh Babu
PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding
Jang Hyun Cho, Andrea Madotto, Effrosyni Mavroudi et al.
Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning
Jaehun Jung, Seungju Han, Ximing Lu et al.
Rethinking the Role of Verbatim Memorization in LLM Privacy
Tom Sander, Bargav Jayaraman, Mark Ibrahim et al.
RLGF: Reinforcement Learning with Geometric Feedback for Autonomous Driving Video Generation
Tianyi Yan, Wencheng Han, xia zhou et al.
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
Synthetic Data is an Elegant GIFT for Continual Vision-Language Models
Bin Wu, Wuxuan Shi, Jinqiao Wang et al.
Synthetic Series-Symbol Data Generation for Time Series Foundation Models
Wenxuan Wang, Kai Wu, yujian li et al.
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
Yibo Wang, Hai-Long Sun, Guangda Huzhang et al.
V2V: Scaling Event-Based Vision through Efficient Video-to-Voxel Simulation
Hanyue Lou, Jinxiu Liang, Minggui Teng et al.
Valid Inference with Imperfect Synthetic Data
Yewon Byun, Shantanu Gupta, Zachary Lipton et al.
Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data
Zi Liang, Qingqing Ye, Xuan Liu et al.