"model faithfulness" Papers
3 papers found
Faithful Model Explanations through Energy-Constrained Conformal Counterfactuals
Patrick Altmeyer, Mojtaba Farmanbar, Arie van Deursen et al.
AAAI 2024 | paper | arXiv:2312.10648
Interpretability Illusions in the Generalization of Simplified Models
Dan Friedman, Andrew Lampinen, Lucas Dixon et al.
ICML 2024 | poster | arXiv:2312.03656
Saliency strikes back: How filtering out high frequencies improves white-box explanations
Sabine Muzellec, Thomas Fel, Victor Boutin et al.
ICML 2024 | poster | arXiv:2307.09591