Oral "reinforcement learning" Papers

19 papers found

Conformal Prediction Beyond the Horizon: Distribution-Free Inference for Policy Evaluation

Feichen Gan, Lu Youcun, Yingying Zhang et al.

NEURIPS 2025, oral, arXiv:2510.26026

Dynamic Contrastive Skill Learning with State-Transition Based Skill Clustering and Dynamic Length Adjustment

Jinwoo Choi, Seung-Woo Seo

ICLR 2025, oral, arXiv:2504.14805
2 citations

EvoLM: In Search of Lost Language Model Training Dynamics

Zhenting Qi, Fan Nie, Alexandre Alahi et al.

NEURIPS 2025, oral, arXiv:2506.16029
5 citations

Heterogeneous Graph Transformers for Simultaneous Mobile Multi-Robot Task Allocation and Scheduling under Temporal Constraints

Batuhan Altundas, Shengkang Chen, Shivika Singh et al.

NEURIPS 2025, oral

Learning to Reuse Policies in State Evolvable Environments

Ziqian Zhang, Bohan Yang, Lihe Li et al.

ICML 2025, oral

Meta-learning how to Share Credit among Macro-Actions

Ionel-Alexandru Hosu, Traian Rebedea, Razvan Pascanu

NEURIPS 2025, oral, arXiv:2506.13690

No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes

Jasmine Bayrooti, Sattar Vakili, Amanda Prorok et al.

NEURIPS 2025, oral, arXiv:2510.20725

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Hao Zhong, Muzhi Zhu, Zongze Du et al.

NEURIPS 2025, oral, arXiv:2505.20256
14 citations

Periodic Skill Discovery

Jonghae Park, Daesol Cho, Jusuk Lee et al.

NEURIPS 2025, oral, arXiv:2511.03187

Semantic Temporal Abstraction via Vision-Language Model Guidance for Efficient Reinforcement Learning

Tian-Shuo Liu, Xu-Hui Liu, Ruifeng Chen et al.

ICLR 2025, oral

Temporal Difference Learning: Why It Can Be Fast and How It Will Be Faster

Patrick Schnell, Luca Guastoni, Nils Thuerey

ICLR 2025, oral

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NEURIPS 2025, oral, arXiv:2401.07844
13 citations

Time-R1: Post-Training Large Vision Language Model for Temporal Video Grounding

Ye Wang, Ziheng Wang, Boshen Xu et al.

NEURIPS 2025, oral, arXiv:2503.13377
49 citations

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

Ziang Yan, Yinan He, Xinhao Li et al.

NEURIPS 2025, oral, arXiv:2509.21100
16 citations

An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks

Zhifa Ke, Zaiwen Wen, Junyu Zhang

ICML 2024, oral, arXiv:2405.04017
1 citation

Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making

Vivek Myers, Chongyi Zheng, Anca Dragan et al.

ICML 2024, oral, arXiv:2406.17098
33 citations

Reinforcement Learning from Reachability Specifications: PAC Guarantees with Expected Conditional Distance

Jakub Svoboda, Suguman Bansal, Krishnendu Chatterjee

ICML 2024, oral

Value-Evolutionary-Based Reinforcement Learning

Pengyi Li, Jianye Hao, Hongyao Tang et al.

ICML 2024, oral

When Do Skills Help Reinforcement Learning? A Theoretical Analysis of Temporal Abstractions

Zhening Li, Gabriel Poesia, Armando Solar-Lezama

ICML 2024, oral, arXiv:2406.07897
1 citation