"regret bounds" Papers
15 papers found
Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
Hao Liang, Zhiquan Luo
NeurIPS 2025 (poster), arXiv:2210.14051
18 citations
Contextual Thompson Sampling via Generation of Missing Data
Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.
NeurIPS 2025 (poster), arXiv:2502.07064
2 citations
Generator-Mediated Bandits: Thompson Sampling for GenAI-Powered Adaptive Interventions
Marc Brooks, Gabriel Durham, Kihyuk Hong et al.
NeurIPS 2025 (poster), arXiv:2505.16311
Improved Regret Bounds for Linear Bandits with Heavy-Tailed Rewards
Artin Tajdini, Jonathan Scarlett, Kevin Jamieson
NeurIPS 2025 (poster), arXiv:2506.04775
2 citations
Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data
Qijia He, Minghan Wang, Xutong Liu et al.
NeurIPS 2025 (poster)
No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.
NeurIPS 2025 (oral), arXiv:2510.20725
Prediction with expert advice under additive noise
Alankrita Bhatt, Victoria Kostina
NeurIPS 2025 (poster)
Regret Bounds for Adversarial Contextual Bandits with General Function Approximation and Delayed Feedback
Orin Levy, Liad Erez, Alon Peled-Cohen et al.
NeurIPS 2025 (spotlight), arXiv:2510.09127
Robust Satisficing Gaussian Process Bandits Under Adversarial Attacks
Artun Saday, Yaşar Cahit Yıldırım, Cem Tekin
NeurIPS 2025 (poster), arXiv:2506.01625
Statistical Parity with Exponential Weights
Stephen Pasteris, Chris Hicks, Vasilios Mavroudis
NeurIPS 2025 (poster)
Uniform Wrappers: Bridging Concave to Quadratizable Functions in Online Optimization
Mohammad Pedramfar, Christopher Quinn, Vaneet Aggarwal
NeurIPS 2025 (poster)
$\mathtt{VITS}$: Variational Inference Thompson Sampling for contextual bandits
Pierre Clavier, Tom Huix, Alain Oliviero Durmus
ICML 2024 (poster)
Combinatorial Multivariant Multi-Armed Bandits with Applications to Episodic Reinforcement Learning and Beyond
Xutong Liu, Siwei Wang, Jinhang Zuo et al.
ICML 2024 (poster)
Leveraging (Biased) Information: Multi-armed Bandits with Offline Data
Wang Chi Cheung, Lixing Lyu
ICML 2024 (spotlight)
Reinforcement Learning and Regret Bounds for Admission Control
Lucas Weber, Ana Busic, Jiamin Zhu
ICML 2024 (poster)