"sample efficiency" Papers
42 papers found
Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering
Klaus-Rudolf Kladny, Bernhard Schölkopf, Michael Muehlebach
Direct Alignment with Heterogeneous Preferences
Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.
Learning (Approximately) Equivariant Networks via Constrained Optimization
Andrei Manolache, Luiz Chamon, Mathias Niepert
Mind the GAP: Glimpse-based Active Perception improves generalization and sample efficiency of visual reasoning
Oleh Kolner, Thomas Ortner, Stanisław Woźniak et al.
ObscuraCoder: Powering Efficient Code LM Pre-Training Via Obfuscation Grounding
Indraneil Paul, Haoyi Yang, Goran Glavaš et al.
Off-policy Reinforcement Learning with Model-based Exploration Augmentation
Likun Wang, Xiangteng Zhang, Yinuo Wang et al.
PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment
Daiwei Chen, Yi Chen, Aniket Rege et al.
Sample-Efficient Multi-Round Generative Data Augmentation for Long-Tail Instance Segmentation
Byunghyun Kim, Minyoung Bae, Jae-Gil Lee
ShiQ: Bringing back Bellman to LLMs
Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.
Succeed or Learn Slowly: Sample Efficient Off-Policy Reinforcement Learning for Mobile App Control
Georgios Papoudakis, Thomas Coste, Jianye Hao et al.
Time Reversal Symmetry for Efficient Robotic Manipulations in Deep Reinforcement Learning
Yunpeng Jiang, Jianshu Hu, Paul Weng et al.
Advancing DRL Agents in Commercial Fighting Games: Training, Integration, and Agent-Human Alignment
Chen Zhang, Qiang HE, Yuan Zhou et al.
A Unified Linear Programming Framework for Offline Reward Learning from Human Demonstrations and Feedback
Kihyun Kim, Jiawei Zhang, Asuman Ozdaglar et al.
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.
Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning
Zizhao Wang, Caroline Wang, Xuesu Xiao et al.
Contextual Pre-planning on Reward Machine Abstractions for Enhanced Transfer in Deep Reinforcement Learning
Guy Azran, Mohamad H Danesh, Stefano Albrecht et al.
Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation
Michelle Pan, Mariah Schrum, Vivek Myers et al.
Dr. Strategy: Model-Based Generalist Agents with Strategic Dreaming
Hany Hamed, Subin Kim, Dongyeong Kim et al.
EfficientZero V2: Mastering Discrete and Continuous Control with Limited Data
Shengjie Wang, Shaohuai Liu, Weirui Ye et al.
Episodic Return Decomposition by Difference of Implicitly Assigned Sub-trajectory Reward
Haoxin Lin, Hongqiu Wu, Jiaji Zhang et al.
Feasible Reachable Policy Iteration
Shentao Qin, Yujie Yang, Yao Mu et al.
Hieros: Hierarchical Imagination on Structured State Space Sequence World Models
Paul Mattes, Rainer Schlosser, Ralf Herbrich
How Does Goal Relabeling Improve Sample Efficiency?
Sirui Zheng, Chenjia Bai, Zhuoran Yang et al.
Hybrid Inverse Reinforcement Learning
Juntao Ren, Gokul Swamy, Steven Wu et al.
Learning to Play Atari in a World of Tokens
Pranav Agarwal, Sheldon Andrews, Samira Ebrahimi Kahou
Leaving the Nest: Going beyond Local Loss Functions for Predict-Then-Optimize
Sanket Shah, Bryan Wilder, Andrew Perrault et al.
LLM-Empowered State Representation for Reinforcement Learning
Boyuan Wang, Yun Qu, Yuhang Jiang et al.
Model-based Reinforcement Learning for Parameterized Action Spaces
Renhao Zhang, Haotian Fu, Yilin Miao et al.
Offline-Boosted Actor-Critic: Adaptively Blending Optimal Historical Behaviors in Deep Off-Policy RL
Yu Luo, Tianying Ji, Fuchun Sun et al.
Overestimation, Overfitting, and Plasticity in Actor-Critic: the Bitter Lesson of Reinforcement Learning
Michal Nauman, Michał Bortkiewicz, Piotr Milos et al.
Quality-Diversity with Limited Resources
Ren-Jian Wang, Ke Xue, Cong Guan et al.
Reflective Policy Optimization
Yaozhong Gan, yan renye, zhe wu et al.
Reinforcement Learning within Tree Search for Fast Macro Placement
Zijie Geng, Jie Wang, Ziyan Liu et al.
Reward Shaping for Reinforcement Learning with An Assistant Reward Agent
Haozhe Ma, Kuankuan Sima, Thanh Vinh Vo et al.
Rich-Observation Reinforcement Learning with Continuous Latent Dynamics
Yuda Song, Lili Wu, Dylan Foster et al.
Sample-Efficient Multiagent Reinforcement Learning with Reset Replay
Yaodong Yang, Guangyong Chen, Jianye Hao et al.
SAPG: Split and Aggregate Policy Gradients
Jayesh Singla, Ananye Agarwal, Deepak Pathak
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Tianying Ji, Yu Luo, Fuchun Sun et al.
SiT: Symmetry-invariant Transformers for Generalisation in Reinforcement Learning
Matthias Weissenbacher, Rishabh Agarwal, Yoshinobu Kawahara
Symmetric Replay Training: Enhancing Sample Efficiency in Deep Reinforcement Learning for Combinatorial Optimization
Hyeonah Kim, Minsu Kim, Sungsoo Ahn et al.
Uncertainty-Aware Reward-Free Exploration with General Function Approximation
Junkai Zhang, Weitong Zhang, Dongruo Zhou et al.
Value-Evolutionary-Based Reinforcement Learning
Pengyi Li, Jianye Hao, Hongyao Tang et al.