2025 "model-agnostic methods" Papers
2 papers found
KAIROS: Scalable Model-Agnostic Data Valuation
Jiongli Zhu, Parjanya Prashant, Alex Cloninger et al.
NeurIPS 2025 poster, arXiv:2506.23799
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation
Xinpeng Wang, Chengzhi (Martin) Hu, Paul Röttger et al.
ICLR 2025 poster, arXiv:2410.03415
24 citations