Poster Papers by Alex Chouldechova
2 papers found
Comparison requires valid measurement: Rethinking attack success rate comparisons in AI red teaming
Alex Chouldechova, A. Feder Cooper, Solon Barocas et al.
NeurIPS 2025 poster
1 citation
Validating LLM-as-a-Judge Systems under Rating Indeterminacy
Luke Guerdan, Solon Barocas, Kenneth Holstein et al.
NeurIPS 2025 poster, arXiv:2503.05965
6 citations