"steering vectors" Papers
2 papers found
DISCO: Disentangled Communication Steering for Large Language Models
Max Torop, Aria Masoomi, Masih Eskandar et al.
NeurIPS 2025posterarXiv:2509.16820
LayerNavigator: Finding Promising Intervention Layers for Efficient Activation Steering in Large Language Models
Hao Sun, Huailiang Peng, Qiong Dai et al.
NeurIPS 2025oral