2025 "activation steering" Papers
2 papers found
Controlling Language and Diffusion Models by Transporting Activations
Pau Rodriguez, Arno Blaas, Michal Klein et al.
ICLR 2025, poster, arXiv:2410.23054
18 citations
LayerNavigator: Finding Promising Intervention Layers for Efficient Activation Steering in Large Language Models
Hao Sun, Huailiang Peng, Qiong Dai et al.
NeurIPS 2025, oral