Dale Schuurmans
Papers: 38
Total citations: 847
Papers (38)
Bridging the Gap Between Value and Policy Based Reinforcement Learning
NeurIPS 2017, arXiv
514
citations
Reward Augmented Maximum Likelihood for Neural Structured Prediction
NeurIPS 2016, arXiv
264
citations
Deep Learning Games
NeurIPS 2016
45
citations
Multi-view Matrix Factorization for Linear Dynamical System Estimation
NeurIPS 2017
10
citations
Plastic Learning with Deep Fourier Features
ICLR 2025
9
citations
Improving Large Language Model Planning with Action Sequence Similarity
ICLR 2025, arXiv
5
citations
Position: Video as the New Language for Real-World Decision Making
ICML 2024
0
citations
Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation
ICML 2024
0
citations
Semi-Supervised Zero-Shot Classification With Label Representation Learning
ICCV 2015
0
citations
Embedding Inference for Structured Multilabel Prediction
NeurIPS 2015
0
citations
Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning
ICML 2024
0
citations
Escaping the Gravitational Pull of Softmax
NeurIPS 2020
0
citations
Understanding the Effect of Stochasticity in Policy Optimization
NeurIPS 2021
0
citations
Combiner: Full Attention Transformer with Sparse Computation Cost
NeurIPS 2021
0
citations
On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games
NeurIPS 2022
0
citations
The Role of Baselines in Policy Gradient Optimization
NeurIPS 2022
0
citations
Optimal Scaling for Locally Balanced Proposals in Discrete Spaces
NeurIPS 2022
0
citations
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
NeurIPS 2022
0
citations
Chain of Thought Imitation with Procedure Cloning
NeurIPS 2022
0
citations
A Simple Decentralized Cross-Entropy Method
NeurIPS 2022
0
citations
Learning Universal Policies via Text-Guided Video Generation
NeurIPS 2023
0
citations
Ordering-based Conditions for Global Convergence of Policy Gradient Methods
NeurIPS 2023
0
citations
Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
NeurIPS 2023
0
citations
DISCS: A Benchmark for Discrete Sampling
NeurIPS 2023
0
citations
Smoothed Action Value Functions for Learning Gaussian Policies
ICML 2018
0
citations
Learning to Generalize from Sparse and Underspecified Rewards
ICML 2019
0
citations
Understanding the Impact of Entropy on Policy Optimization
ICML 2019
0
citations
The Value Function Polytope in Reinforcement Learning
ICML 2019
0
citations
Non-delusional Q-learning and value-iteration
NeurIPS 2018
0
citations
A Geometric Perspective on Optimal Representations for Reinforcement Learning
NeurIPS 2019
0
citations
Exponential Family Estimation via Adversarial Dynamics Embedding
NeurIPS 2019
0
citations
Maximum Entropy Monte-Carlo Planning
NeurIPS 2019
0
citations
Surrogate Objectives for Batch Policy Optimization in One-step Decision Making
NeurIPS 2019
0
citations
Invertible Convolutional Flow
NeurIPS 2019
0
citations
Off-Policy Evaluation via the Regularized Lagrangian
NeurIPS 2020
0
citations
CoinDICE: Off-Policy Confidence Interval Estimation
NeurIPS 2020
0
citations
Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration
NeurIPS 2020
0
citations
A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs
NeurIPS 2020
0
citations