2025 "training efficiency" Papers

19 papers found

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025 (poster) · arXiv:2410.11215 · 15 citations

Angles Don’t Lie: Unlocking Training‑Efficient RL Through the Model’s Own Signals

Qinsi Wang, Jinghan Ke, Hancheng Ye et al.

NEURIPS 2025 (spotlight)

Bifrost-1: Bridging Multimodal LLMs and Diffusion Models with Patch-level CLIP Latents

Han Lin, Jaemin Cho, Amir Zadeh et al.

NEURIPS 2025 (poster) · arXiv:2508.05954 · 6 citations

Cut Your Losses in Large-Vocabulary Language Models

Erik Wijmans, Brody Huval, Alexander Hertzberg et al.

ICLR 2025 (poster) · arXiv:2411.09009 · 19 citations

DataRater: Meta-Learned Dataset Curation

Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.

NEURIPS 2025 (poster) · arXiv:2505.17895 · 7 citations

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Taishi Nakamura, Takuya Akiba, Kazuki Fujii et al.

ICLR 2025 (poster) · arXiv:2502.19261 · 8 citations

Efficient Representativeness-Aware Coreset Selection

Zihao Cheng, Binrui Wu, Zhiwei Li et al.

NEURIPS 2025 (poster)

Faster and Better 3D Splatting via Group Training

Chengbo Wang, Guozheng Ma, Yizhen Lao et al.

ICCV 2025 (poster) · arXiv:2412.07608 · 3 citations

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Yiqin Yang, Quanwei Wang, Chenghao Li et al.

ICLR 2025 (poster) · arXiv:2502.18955

Improved Noise Schedule for Diffusion Training

Tiankai Hang, Shuyang Gu, Jianmin Bao et al.

ICCV 2025 (poster) · arXiv:2407.03297

Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better

Enshu Liu, Junyi Zhu, Zinan Lin et al.

ICLR 2025 (poster) · arXiv:2404.02241 · 6 citations

Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

Fu-Yun Wang, Ling Yang, Zhaoyang Huang et al.

ICLR 2025 (poster) · arXiv:2410.07303 · 47 citations

Reinforcement Learning-Guided Data Selection via Redundancy Assessment

Suorong Yang, Peijia Li, Furao Shen et al.

ICCV 2025 (poster) · arXiv:2506.21037 · 1 citation

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.

ICLR 2025 (poster) · arXiv:2410.06940 · 308 citations

Representation Entanglement for Generation: Training Diffusion Transformers Is Much Easier Than You Think

Ge Wu, Shen Zhang, Ruijing Shi et al.

NEURIPS 2025 (oral) · arXiv:2507.01467 · 27 citations

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang et al.

NEURIPS 2025 (poster) · arXiv:2503.15450 · 2 citations

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

ICLR 2025 (poster) · arXiv:2502.15938 · 22 citations

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025 (poster) · arXiv:2501.04765 · 10 citations

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025 (poster) · arXiv:2506.01317 · 6 citations