NEURIPS 2025 "llm evaluation" Papers
3 papers found
Probing Hidden Knowledge Holes in Unlearned LLMs
Myeongseob Ko, Hoang Anh Just, Charles Fleming et al.
NEURIPS 2025 poster
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
Xeron Du, Yifan Yao, Kaijing Ma et al.
NEURIPS 2025 poster arXiv:2502.14739
118 citations
Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties
Jiyoung Lee, Seungho Kim, Jieun Han et al.
NEURIPS 2025 poster arXiv:2505.20875
2 citations