ICLR Poster "model interpretability" Papers
6 papers found
AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution
Fengyuan Liu, Nikhil Kandpal, Colin Raffel
ICLR 2025posterarXiv:2411.15102
12
citations
Concept Bottleneck Language Models For Protein Design
Aya Ismail, Tuomas Oikarinen, Amy Wang et al.
ICLR 2025posterarXiv:2411.06090
13
citations
Data-centric Prediction Explanation via Kernelized Stein Discrepancy
Mahtab Sarvmaili, Hassan Sajjad, Ga Wu
ICLR 2025posterarXiv:2403.15576
2
citations
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang, Yifei Liu, Yingdong Shi et al.
ICLR 2025posterarXiv:2503.09046
4
citations
From Search to Sampling: Generative Models for Robust Algorithmic Recourse
Prateek Garg, Lokesh Nagalapatti, Sunita Sarawagi
ICLR 2025posterarXiv:2505.07351
2
citations
Start Smart: Leveraging Gradients For Enhancing Mask-based XAI Methods
Buelent Uendes, Shujian Yu, Mark Hoogendoorn
ICLR 2025poster