"causal intervention" Papers
2 papers found
Concept-Guided Interpretability via Neural Chunking
Shuchen Wu, Stephan Alaniz, Shyamgopal Karthik et al.
NeurIPS 2025 | poster | arXiv:2505.11576
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models
Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan et al.
ICLR 2025 | poster | arXiv:2411.14257
77 citations