Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning

Citations: 1 (#1428 of 5858 papers in NeurIPS 2025)
Authors: 6
Data Points: 4

Abstract

Large language models (LLMs) have achieved state-of-the-art performance on a variety of tasks, but their internal mechanisms remain largely opaque. Understanding these mechanisms is crucial for improving their reasoning abilities. Drawing inspiration from the interplay between neural processes and human cognition, we propose a novel interpretability framework to systematically analyze the roles and behaviors of attention heads, which are key components of LLMs. We introduce CogQA, a dataset that decomposes complex questions into step-by-step subquestions with a chain-of-thought design, each associated with a specific cognitive function such as retrieval or logical reasoning. By applying a multi-class probing method, we identify the attention heads responsible for these functions. Our analysis across multiple LLM families reveals that attention heads exhibit functional specialization; we characterize these specialized heads as cognitive heads. Cognitive heads exhibit several key properties: they are universally sparse, vary in number and distribution across cognitive functions, and display interactive and hierarchical structures. We further show that cognitive heads play a vital role in reasoning tasks: removing them degrades performance, while augmenting them improves reasoning accuracy. These insights offer a deeper understanding of LLM reasoning and carry important implications for model design, training, and fine-tuning strategies.
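
To make the multi-class probing step concrete, the following minimal Python sketch fits one linear probe per attention head on cached head activations and flags heads whose probes predict a sub-question's cognitive function well above chance. All names, shapes, and thresholds (n_layers, n_heads, head_dim, the synthetic activations and labels, the 0.1 margin) are illustrative assumptions, not the paper's actual data or implementation.

    # Hypothetical sketch: multi-class probing over per-head activations.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Assumed cached activations: activations[i, layer, head, :] is the head's
    # output vector for example i; labels[i] is the sub-question's cognitive
    # function (e.g. 0 = retrieval, 1 = logical reasoning, ...). Synthetic here.
    n_examples, n_layers, n_heads, head_dim, n_classes = 512, 12, 12, 64, 4
    activations = rng.normal(size=(n_examples, n_layers, n_heads, head_dim))
    labels = rng.integers(0, n_classes, size=n_examples)

    # Fit one multinomial linear probe per head and record validation accuracy.
    scores = np.zeros((n_layers, n_heads))
    for layer in range(n_layers):
        for head in range(n_heads):
            X = activations[:, layer, head, :]
            X_tr, X_va, y_tr, y_va = train_test_split(
                X, labels, test_size=0.25, random_state=0, stratify=labels
            )
            probe = LogisticRegression(max_iter=1000)
            probe.fit(X_tr, y_tr)
            scores[layer, head] = probe.score(X_va, y_va)

    # Heads whose probes clearly beat chance are candidate "cognitive heads".
    chance = 1.0 / n_classes
    candidates = np.argwhere(scores > chance + 0.1)  # illustrative margin
    print(f"{len(candidates)} candidate cognitive heads of {n_layers * n_heads}")

A head-ablation check in the same spirit would zero out (or scale up) the outputs of the flagged heads during generation and compare reasoning accuracy against the unmodified model, which is how the removal and augmentation effects described in the abstract could be measured.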

Citation History

Jan 26, 2026: 0
Jan 27, 2026: 0
Jan 27, 2026: 0
Feb 1, 2026: 1 (+1)