Alekh Agarwal

33
Papers
467
Total Citations

Papers (33)

Off-policy evaluation for slate recommendation

NeurIPS 2017 (arXiv)
244
citations

Efficient Second Order Online Learning by Sketching

NeurIPS 2016 (arXiv)
100
citations

Theoretical guarantees on the best-of-n alignment policy

ICML 2025
89
citations

Contextual semibandits via supervised learning oracles

NeurIPS 2016 (arXiv)
22
citations

PAC Reinforcement Learning with Rich Observations

NeurIPS 2016 (arXiv)
9
citations

Design Considerations in Offline Preference-based RL

ICML 2025
3
citations

The Non-linear $F$-Design and Applications to Interactive Learning

ICML 2024
0
citations

Efficient and Parsimonious Agnostic Active Learning

NeurIPS 2015
0
citations

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

ICML 2024
0
citations

Fast Convergence of Regularized Learning in Games

NeurIPS 2015
0
citations

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

ICML 2024
0
citations

Ordering-based Conditions for Global Convergence of Policy Gradient Methods

NeurIPS 2023
0
citations

A Lower Bound for the Optimization of Finite Sums

ICML 2015
0
citations

Learning to Search Better than Your Teacher

ICML 2015
0
citations

Contextual Decision Processes with low Bellman rank are PAC-Learnable

ICML 2017
0
citations

Active Learning for Cost-Sensitive Classification

ICML 2017
0
citations

Optimal and Adaptive Off-policy Evaluation in Contextual Bandits

ICML 2017
0
citations

A Reductions Approach to Fair Classification

ICML 2018
0
citations

Practical Contextual Bandits with Regression Oracles

ICML 2018 (arXiv)
0
citations

Hierarchical Imitation and Reinforcement Learning

ICML 2018
0
citations

Fair Regression: Quantitative Definitions and Reduction-Based Algorithms

ICML 2019
0
citations

Provably efficient RL with Rich Observations via Latent State Decoding

ICML 2019
0
citations

Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

ICML 2019
0
citations

On Oracle-Efficient PAC RL with Rich Observations

NeurIPS 2018
0
citations

Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting

NeurIPS 2019
0
citations

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

NeurIPS 2020
0
citations

Policy Improvement via Imitation of Multiple Oracles

NeurIPS 2020
0
citations

Safe Reinforcement Learning via Curriculum Induction

NeurIPS 2020
0
citations

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

NeurIPS 2020
0
citations

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

NeurIPS 2020
0
citations

Bellman-consistent Pessimism for Offline Reinforcement Learning

NeurIPS 2021
0
citations

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

NeurIPS 2022
0
citations

Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity

NeurIPS 2022
0
citations