Poster "contextual bandits" Papers

23 papers found

An Improved Algorithm for Adversarial Linear Contextual Bandits via Reduction

Tim van Erven, Jack Mayo, Julia Olkhovskaya et al.

NeurIPS 2025 poster, arXiv:2508.11931

Contextual Thompson Sampling via Generation of Missing Data

Kelly W Zhang, Tianhui Cai, Hongseok Namkoong et al.

NeurIPS 2025 poster, arXiv:2502.07064
2 citations

Cross-Domain Off-Policy Evaluation and Learning for Contextual Bandits

Yuta Natsubori, Masataka Ushiku, Yuta Saito

ICLR 2025 poster

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NeurIPS 2025 poster, arXiv:2507.15290
1 citation

MultiScale Contextual Bandits for Long Term Objectives

Richa Rastogi, Yuta Saito, Thorsten Joachims

NeurIPS 2025 poster, arXiv:2503.17674

Second Order Bounds for Contextual Bandits with Function Approximation

Aldo Pacchiano

ICLR 2025 poster, arXiv:2409.16197
7 citations

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Heyang Zhao, Chenlu Ye, Quanquan Gu et al.

NeurIPS 2025 poster, arXiv:2411.04625
14 citations

Statistical Parity with Exponential Weights

Stephen Pasteris, Chris Hicks, Vasilios Mavroudis

NeurIPS 2025 poster

True Impact of Cascade Length in Contextual Cascading Bandits

Hyun-jun Choi, Joongkyu Lee, Min-hwan Oh

NeurIPS 2025 poster

Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits

Xuheng Li, Quanquan Gu

NeurIPS 2025 poster, arXiv:2511.02123

$\mathtt{VITS}$ : Variational Inference Thompson Sampling for contextual bandits

Pierre Clavier, Tom Huix, Alain Oliviero Durmus

ICML 2024 poster

A Contextual Combinatorial Bandit Approach to Negotiation

Yexin Li, Zhancun Mu, Siyuan Qi

ICML 2024 poster, arXiv:2407.00567

Adaptively Learning to Select-Rank in Online Platforms

Jingyuan Wang, Perry Dong, Ying Jin et al.

ICML 2024 poster, arXiv:2406.05017

Borda Regret Minimization for Generalized Linear Dueling Bandits

Yue Wu, Tao Jin, Qiwei Di et al.

ICML 2024 poster, arXiv:2303.08816

Efficient Contextual Bandits with Uninformed Feedback Graphs

Mengxiao Zhang, Yuheng Zhang, Haipeng Luo et al.

ICML 2024 poster, arXiv:2402.08127

Fast and Sample Efficient Multi-Task Representation Learning in Stochastic Contextual Bandits

Jiabin Lin, Shana Moothedath, Namrata Vaswani

ICML 2024 poster, arXiv:2410.02068

High-dimensional Linear Bandits with Knapsacks

Wanteng Ma, Dong Xia, Jiashuo Jiang

ICML 2024 poster, arXiv:2311.01327

Incentivized Learning in Principal-Agent Bandit Games

Antoine Scheid, Daniil Tiapkin, Etienne Boursier et al.

ICML 2024 poster, arXiv:2403.03811

In-Context Reinforcement Learning for Variable Action Spaces

Viacheslav Sinii, Alexander Nikulin, Vladislav Kurenkov et al.

ICML 2024 poster, arXiv:2312.13327

Low-Rank Bandits via Tight Two-to-Infinity Singular Subspace Recovery

Yassir Jedra, William Réveillard, Stefan Stojanovic et al.

ICML 2024 poster, arXiv:2402.15739

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

Kaiwen Wang, Owen Oertell, Alekh Agarwal et al.

ICML 2024 poster, arXiv:2402.07198

Randomized Confidence Bounds for Stochastic Partial Monitoring

Maxime Heuillet, Ola Ahmad, Audrey Durand

ICML 2024 poster, arXiv:2402.05002

The Non-linear $F$-Design and Applications to Interactive Learning

Alekh Agarwal, Jian Qian, Alexander Rakhlin et al.

ICML 2024 poster