NEURIPS 2025 "language model analysis" Papers
2 papers found
EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification
Lin Zhang, Wenshuo Dong, Zhuoran Zhang et al.
NEURIPS 2025 poster arXiv:2502.06852
9 citations
Order-Level Attention Similarity Across Language Models: A Latent Commonality
Jinglin Liang, Jin Zhong, Shuangping Huang et al.
NEURIPS 2025 poster arXiv:2511.05064