Alekh Agarwal
33
Papers
467
Total Citations
Papers (33)
Off-policy evaluation for slate recommendation
NeurIPS 2017 (arXiv)
244
citations
Efficient Second Order Online Learning by Sketching
NeurIPS 2016 (arXiv)
100
citations
Theoretical guarantees on the best-of-n alignment policy
ICML 2025
89
citations
Contextual semibandits via supervised learning oracles
NeurIPS 2016 (arXiv)
22
citations
PAC Reinforcement Learning with Rich Observations
NeurIPS 2016 (arXiv)
9
citations
Design Considerations in Offline Preference-based RL
ICML 2025
3
citations
The Non-linear $F$-Design and Applications to Interactive Learning
ICML 2024
0
citations
Efficient and Parsimonious Agnostic Active Learning
NeurIPS 2015
0
citations
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
ICML 2024
0
citations
Fast Convergence of Regularized Learning in Games
NeurIPS 2015
0
citations
More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning
ICML 2024
0
citations
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
NeurIPS 2023
0
citations
A Lower Bound for the Optimization of Finite Sums
ICML 2015
0
citations
Learning to Search Better than Your Teacher
ICML 2015
0
citations
Contextual Decision Processes with low Bellman rank are PAC-Learnable
ICML 2017
0
citations
Active Learning for Cost-Sensitive Classification
ICML 2017
0
citations
Optimal and Adaptive Off-policy Evaluation in Contextual Bandits
ICML 2017
0
citations
A Reductions Approach to Fair Classification
ICML 2018
0
citations
Practical Contextual Bandits with Regression Oracles
ICML 2018 (arXiv)
0
citations
Hierarchical Imitation and Reinforcement Learning
ICML 2018
0
citations
Fair Regression: Quantitative Definitions and Reduction-Based Algorithms
ICML 2019
0
citations
Provably efficient RL with Rich Observations via Latent State Decoding
ICML 2019
0
citations
Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback
ICML 2019
0
citations
On Oracle-Efficient PAC RL with Rich Observations
NeurIPS 2018
0
citations
Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
NeurIPS 2019
0
citations
Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration
NeurIPS 2020
0
citations
Policy Improvement via Imitation of Multiple Oracles
NeurIPS 2020
0
citations
Safe Reinforcement Learning via Curriculum Induction
NeurIPS 2020
0
citations
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
NeurIPS 2020
0
citations
FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
NeurIPS 2020
0
citations
Bellman-consistent Pessimism for Offline Reinforcement Learning
NeurIPS 2021
0
citations
On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL
NeurIPS 2022
0
citations
Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity
NeurIPS 2022
0
citations