Mohammad Ghavamzadeh
26
Papers
264
Total Citations
Papers (26)
Safe Policy Improvement by Minimizing Robust Baseline Regret
NeurIPS 2016 (arXiv)
140
citations
Conservative Contextual Linear Bandits
NeurIPS 2017 (arXiv)
105
citations
Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models
NeurIPS 2025
19
citations
Bayesian Regret Minimization in Offline Bandits
ICML 2024
0
citations
Policy Gradient for Coherent Risk Measures
NeurIPS 2015
0
citations
Adaptive Sampling for Minimax Fair Classification
NeurIPS 2021
0
citations
Private and Communication-Efficient Algorithms for Entropy Estimation
NeurIPS 2022
0
citations
Robust Reinforcement Learning using Offline Data
NeurIPS 2022
0
citations
Efficient Risk-Averse Reinforcement Learning
NeurIPS 2022
0
citations
Operator Splitting Value Iteration
NeurIPS 2022
0
citations
Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management
NeurIPS 2023
0
citations
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
NeurIPS 2023
0
citations
On Dynamic Programming Decompositions of Static Risk Measures in Markov Decision Processes
NeurIPS 2023
0
citations
DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models
NeurIPS 2023
0
citations
High Confidence Policy Improvement
ICML 2015
0
citations
Active Learning for Accurate Estimation of Linear Models
ICML 2017
0
citations
Bottleneck Conditional Density Estimation
ICML 2017
0
citations
Model-Independent Online Learning for Influence Maximization
ICML 2017
0
citations
Online Learning to Rank in Stochastic Click Models
ICML 2017
0
citations
Path Consistency Learning in Tsallis Entropy Regularized MDPs
ICML 2018
0
citations
More Robust Doubly Robust Off-policy Evaluation
ICML 2018
0
citations
Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits
ICML 2019
0
citations
A Block Coordinate Ascent Algorithm for Mean-Variance Optimization
NeurIPS 2018
0
citations
A Lyapunov-based Approach to Safe Reinforcement Learning
NeurIPS 2018
0
citations
Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies
NeurIPS 2019
0
citations
Online Planning with Lookahead Policies
NeurIPS 2020
0
citations