Dale Schuurmans

38
Papers
847
Total Citations

Papers (38)

Bridging the Gap Between Value and Policy Based Reinforcement Learning

NeurIPS 2017, arXiv
514
citations

Reward Augmented Maximum Likelihood for Neural Structured Prediction

NeurIPS 2016, arXiv
264
citations

Deep Learning Games

NeurIPS 2016
45
citations

Multi-view Matrix Factorization for Linear Dynamical System Estimation

NeurIPS 2017
10
citations

Plastic Learning with Deep Fourier Features

ICLR 2025
9
citations

Improving Large Language Model Planning with Action Sequence Similarity

ICLR 2025, arXiv
5
citations

Position: Video as the New Language for Real-World Decision Making

ICML 2024
0
citations

Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation

ICML 2024
0
citations

Semi-Supervised Zero-Shot Classification With Label Representation Learning

ICCV 2015
0
citations

Embedding Inference for Structured Multilabel Prediction

NeurIPS 2015
0
citations

Provable Representation with Efficient Planning for Partially Observable Reinforcement Learning

ICML 2024
0
citations

Escaping the Gravitational Pull of Softmax

NeurIPS 2020
0
citations

Understanding the Effect of Stochasticity in Policy Optimization

NeurIPS 2021
0
citations

Combiner: Full Attention Transformer with Sparse Computation Cost

NeurIPS 2021
0
citations

On the Global Convergence Rates of Decentralized Softmax Gradient Play in Markov Potential Games

NeurIPS 2022
0
citations

The Role of Baselines in Policy Gradient Optimization

NeurIPS 2022
0
citations

Optimal Scaling for Locally Balanced Proposals in Discrete Spaces

NeurIPS 2022
0
citations

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

NeurIPS 2022
0
citations

Chain of Thought Imitation with Procedure Cloning

NeurIPS 2022
0
citations

A Simple Decentralized Cross-Entropy Method

NeurIPS 2022
0
citations

Learning Universal Policies via Text-Guided Video Generation

NeurIPS 2023
0
citations

Ordering-based Conditions for Global Convergence of Policy Gradient Methods

NeurIPS 2023
0
citations

Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off

NeurIPS 2023
0
citations

DISCS: A Benchmark for Discrete Sampling

NeurIPS 2023
0
citations

Smoothed Action Value Functions for Learning Gaussian Policies

ICML 2018
0
citations

Learning to Generalize from Sparse and Underspecified Rewards

ICML 2019
0
citations

Understanding the Impact of Entropy on Policy Optimization

ICML 2019
0
citations

The Value Function Polytope in Reinforcement Learning

ICML 2019
0
citations

Non-delusional Q-learning and value-iteration

NeurIPS 2018
0
citations

A Geometric Perspective on Optimal Representations for Reinforcement Learning

NeurIPS 2019
0
citations

Exponential Family Estimation via Adversarial Dynamics Embedding

NeurIPS 2019
0
citations

Maximum Entropy Monte-Carlo Planning

NeurIPS 2019
0
citations

Surrogate Objectives for Batch Policy Optimization in One-step Decision Making

NeurIPS 2019
0
citations

Invertible Convolutional Flow

NeurIPS 2019
0
citations

Off-Policy Evaluation via the Regularized Lagrangian

NeurIPS 2020
0
citations

CoinDICE: Off-Policy Confidence Interval Estimation

NeurIPS 2020
0
citations

Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration

NeurIPS 2020
0
citations

A Maximum-Entropy Approach to Off-Policy Evaluation in Average-Reward MDPs

NeurIPS 2020
0
citations