NeurIPS Poster "sparse autoencoders" Papers
4 papers found
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit
Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.
NeurIPS 2025, poster, arXiv:2506.03093
10 citations
Large Language Models Think Too Fast To Explore Effectively
Lan Pan, Hanbo Xie, Robert Wilson
NeurIPS 2025, poster, arXiv:2501.18009
6 citations
Representation Consistency for Accurate and Coherent LLM Answer Aggregation
Junqi Jiang, Tom Bewley, Salim I. Amoukou et al.
NeurIPS 2025, poster, arXiv:2506.21590
2 citations
Revising and Falsifying Sparse Autoencoder Feature Explanations
George Ma, Samuel Pfrommer, Somayeh Sojoudi
NeurIPS 2025, poster