"off-policy training" Papers
3 papers found
Transition Path Sampling with Improved Off-Policy Training of Diffusion Path Samplers
Kiyoung Seong, Seonghyun Park, Seonghwan Kim et al.
ICLR 2025posterarXiv:2405.19961
9
citations
DistiLLM: Towards Streamlined Distillation for Large Language Models
Jongwoo Ko, Sungnyun Kim, Tianyi Chen et al.
ICML 2024poster
Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing
Jinmin He, Kai Li, Yifan Zang et al.
AAAI 2024paperarXiv:2312.14472
10
citations