NeurIPS 2025 "sparse autoencoders" Papers
3 papers found
Among Us: A Sandbox for Measuring and Detecting Agentic Deception
Satvik Golechha, Adrià Garriga-Alonso
NeurIPS 2025, spotlight, arXiv:2504.04072
7 citations
Revising and Falsifying Sparse Autoencoder Feature Explanations
George Ma, Samuel Pfrommer, Somayeh Sojoudi
NeurIPS 2025, poster
SparseMVC: Probing Cross-view Sparsity Variations for Multi-view Clustering
Ruimeng Liu, Xin Zou, Chang Tang et al.
NeurIPS 2025, spotlight