ICLR 2025 "llm evaluation" Papers
2 papers found
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Lianghui Zhu, Xinggang Wang, Xinlong Wang
ICLR 2025, poster, arXiv:2310.17631
258 citations
Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation
Jasper Dekoninck, Maximilian Baader, Martin Vechev
ICLR 2025, poster, arXiv:2409.00696
3 citations