Wen Sun

36
Papers
175
Total Citations

Papers (36)

Predictive-State Decoders: Encoding the Future into Recurrent Networks

NeurIPS 2017, arXiv
43
citations

Provable Offline Preference-Based Reinforcement Learning

ICLR 2024
39
citations

Making RL with Preference-based Feedback Efficient via Randomization

ICLR 2024
37
citations

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

ICLR 2025
14
citations

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

ICLR 2025
10
citations

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

NeurIPS 2025
10
citations

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

ICLR 2024
8
citations

Value-Guided Search for Efficient Chain-of-Thought Reasoning

NeurIPS 2025
7
citations

On Speeding Up Language Model Evaluation

ICLR 2025
4
citations

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

ICLR 2025
3
citations

Learning To Detect Mobile Objects From LiDAR Scans Without Labels

CVPR 2022, arXiv
0
citations

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

ICML 2024
0
citations

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems

NeurIPS 2022
0
citations

The Benefits of Being Distributional: Small-Loss Bounds for Reinforcement Learning

NeurIPS 2023
0
citations

Contextual Bandits and Imitation Learning with Preference-Based Active Queries

NeurIPS 2023
0
citations

Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage

NeurIPS 2023
0
citations

Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery

NeurIPS 2023
0
citations

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

NeurIPS 2023
0
citations

Selective Sampling and Imitation Learning via Online Regression

NeurIPS 2023
0
citations

Learning to Filter with Predictive State Inference Machines

ICML 2016
0
citations

Safety-Aware Algorithms for Adversarial Contextual Bandit

ICML 2017
0
citations

Deeply AggreVaTeD: Differentiable Imitation Learning for Sequential Prediction

ICML 2017
0
citations

Recurrent Predictive State Policy Networks

ICML 2018
0
citations

Contextual Memory Trees

ICML 2019
0
citations

Provably Efficient Imitation Learning from Observation Alone

ICML 2019
0
citations

Dual Policy Iteration

NeurIPS 2018
0
citations

Optimal Sketching for Kronecker Product Regression and Low Rank Approximation

NeurIPS 2019
0
citations

Policy Poisoning in Batch Reinforcement Learning and Control

NeurIPS 2019
0
citations

Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates

NeurIPS 2020
0
citations

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

NeurIPS 2020
0
citations

Learning the Linear Quadratic Regulator from Nonlinear Observations

NeurIPS 2020
0
citations

Information Theoretic Regret Bounds for Online Nonlinear Control

NeurIPS 2020
0
citations

Constrained Episodic Reinforcement Learning in Concave-Convex and Knapsack Settings

NeurIPS 2020
0
citations

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

NeurIPS 2020
0
citations

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

NeurIPS 2021
0
citations

MobILE: Model-Based Imitation Learning From Observation Alone

NeurIPS 2021
0
citations