Controlling Large Language Models with Latent Actions

ICML 2025 · #746 of 3340 papers · 7 authors · 0 citations

Abstract

Adapting Large Language Models (LLMs) to downstream tasks using Reinforcement Learning (RL) has proven to be an effective approach. However, LLMs do not inherently define the structure of an agent for RL training, particularly in terms of specifying the action space. This paper studies learning a compact latent action space to enhance the controllability and exploration of RL for LLMs. Inspired by reinforcement learning from observations, we propose Controlling Large Language Models with Latent Actions (CoLA), a framework that integrates a latent action space into pre-trained LLMs. CoLA employs an inverse dynamics model to extract latent actions conditioned on future tokens, ensuring that the next token prediction is partially influenced by these actions. Simultaneously, CoLA fine-tunes the pre-trained LLM to function as a language world model, capable of incorporating latent actions as inputs. Additionally, CoLA trains a policy model to generate actions within this language world model. The policy model can be trained via behavior cloning to mimic a standard language model or through RL to maximize task-specific rewards. In this work, we apply CoLA to the Llama-3.1-8B model. Our experiments demonstrate that, compared to RL with token-level actions, CoLA's latent actions enable greater semantic diversity. For enhancing downstream tasks, we show that CoLA with RL achieves a score of 42.4 on the math500 benchmark, surpassing the baseline score of 38.2, and reaches 68.2 when augmented with a Monte Carlo Tree Search variant. Furthermore, CoLA with RL consistently improves performance on agent-based tasks without degrading the pre-trained LLM's capabilities, unlike the baseline. Finally, CoLA reduces computation time by half in tasks involving enhanced thinking prompts for LLMs via RL. These results highlight CoLA's potential to advance RL-based adaptation of LLMs for downstream applications. The CoLA model is available at https://huggingface.co/LAMDA-RL/Llama-3.1-CoLA-10B.
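For illustration, below is a minimal PyTorch sketch of the three components the abstract names: an inverse dynamics model that infers a latent action from future tokens, a language world model that predicts the next token conditioned on that action, and a policy that proposes actions. All module names, the toy dimensions, and the additive conditioning scheme are assumptions made for exposition, not the paper's actual implementation (the released Llama-3.1-CoLA-10B model differs).

```python
import torch
import torch.nn as nn

HIDDEN, ACTION, VOCAB = 64, 16, 1000  # toy sizes, not the paper's


class InverseDynamicsModel(nn.Module):
    """q(a_t | future tokens): extracts a latent action from future context."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.to_action = nn.Linear(HIDDEN, ACTION)

    def forward(self, future_hidden):
        # future_hidden: (batch, horizon, HIDDEN) embeddings of future tokens
        _, h = self.encoder(future_hidden)
        return self.to_action(h[-1])                      # (batch, ACTION)


class LanguageWorldModel(nn.Module):
    """p(x_{t+1} | context, a_t): next-token prediction given a latent action."""
    def __init__(self):
        super().__init__()
        self.action_proj = nn.Linear(ACTION, HIDDEN)
        self.lm_head = nn.Linear(HIDDEN, VOCAB)           # stand-in for the LLM backbone

    def forward(self, context_hidden, action):
        # Additive conditioning is one simple choice; the paper's mechanism may differ.
        conditioned = context_hidden[:, -1] + self.action_proj(action)
        return self.lm_head(conditioned)                  # (batch, VOCAB) logits


class LatentPolicy(nn.Module):
    """pi(a_t | context): trained by behavior cloning or RL to choose latent actions."""
    def __init__(self):
        super().__init__()
        self.head = nn.Linear(HIDDEN, ACTION)

    def forward(self, context_hidden):
        return self.head(context_hidden[:, -1])           # act from last context state


# One training step of the behavior-cloning variant on toy tensors.
batch, ctx_len, horizon = 4, 8, 8
context = torch.randn(batch, ctx_len, HIDDEN)
future = torch.randn(batch, horizon, HIDDEN)
next_token = torch.randint(0, VOCAB, (batch,))

idm, wm, policy = InverseDynamicsModel(), LanguageWorldModel(), LatentPolicy()
a_post = idm(future)                                      # action inferred from the future
logits = wm(context, a_post)                              # world model consumes the action
lm_loss = nn.functional.cross_entropy(logits, next_token)
bc_loss = nn.functional.mse_loss(policy(context), a_post.detach())
(lm_loss + bc_loss).backward()
```

In this sketch the behavior-cloning loss regresses the policy onto the action recovered by the inverse dynamics model; per the abstract, that objective can be swapped for RL on task-specific rewards once the world model is trained.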
