"language model evaluation" Papers
5 papers found
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Jiayi Ye, Yanbo Wang, Yue Huang et al.
ICLR 2025 poster, arXiv:2410.02736
207 citations
Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?
Andreas Opedal, Alessandro Stolfo, Haruki Shirakami et al.
ICML 2024 poster
LatestEval: Addressing Data Contamination in Language Model Evaluation through Dynamic and Time-Sensitive Test Construction
Yucheng Li, Frank Guerin, Chenghua Lin
AAAI 2024 paper, arXiv:2312.12343
53 citations
Open-Domain Text Evaluation via Contrastive Distribution Methods
Sidi Lu, Hongyi Liu, Asli Celikyilmaz et al.
ICML 2024 poster
Task Contamination: Language Models May Not Be Few-Shot Anymore
Changmao Li, Jeffrey Flanigan
AAAI 2024 paper, arXiv:2312.16337
130 citations