"neural network interpretability" Papers
10 papers found
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.
NeurIPS 2025, poster, arXiv:2506.03093
10 citations
Inner Information Analysis Algorithm for Deep Neural Network based on Community
Guipeng Lan, Shuai Xiao, Meng Xi et al.
ICLR 2025, poster
2 citations
Interpreting Emergent Features in Deep Learning-based Side-channel Analysis
Sengim Karayalcin, Marina Krček, Stjepan Picek
NeurIPS 2025, poster, arXiv:2502.00384
The Computational Complexity of Circuit Discovery for Inner Interpretability
Federico Adolfi, Martina G. Vilas, Todd Wareham
ICLR 2025, poster, arXiv:2410.08025
11 citations
VITAL: More Understandable Feature Visualization through Distribution Alignment and Relevant Information Flow
Ada Görgün, Bernt Schiele, Jonas Fischer
ICCV 2025, poster, arXiv:2503.22399
1 citation
Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
Haoyu Deng, Zijing Xu, Yule Duan et al.
ICML 2024, poster, arXiv:2405.07919
From Neurons to Neutrons: A Case Study in Interpretability
Ouail Kitouni, Niklas Nolte, Víctor Samuel Pérez-Díaz et al.
ICML 2024, poster
Grokking Group Multiplication with Cosets
Dashiell Stander, Qinan Yu, Honglu Fan et al.
ICML 2024, poster, arXiv:2312.06581
Layerwise Change of Knowledge in Neural Networks
Xu Cheng, Lei Cheng, Zhaoran Peng et al.
ICML 2024, poster
Permutation-Based Hypothesis Testing for Neural Networks
Francesca Mandel, Ian Barnett
AAAI 2024, paper, arXiv:2301.11354
4 citations