Poster "model evaluation" Papers
6 papers found
Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
Yuhui Zhang, Yuchang Su, Yiming Liu et al.
CVPR 2025, poster, arXiv:2501.03225
21 citations
Fine-tuning can Help Detect Pretraining Data from Large Language Models
Hengxiang Zhang, Songxin Zhang, Bingyi Jing et al.
ICLR 2025, poster, arXiv:2410.10880
4 citations
Do Large Language Models Perform the Way People Expect? Measuring the Human Generalization Function
Keyon Vafa, Ashesh Rambachan, Sendhil Mullainathan
ICML 2024, poster
Feedback Loops With Language Models Drive In-Context Reward Hacking
Alexander Pan, Erik Jones, Meena Jagadeesan et al.
ICML 2024, poster
Interplay of ROC and Precision-Recall AUCs: Theoretical Limits and Practical Implications in Binary Classification
Martin Mihelich, François Castagnos, Charles Dognin
ICML 2024, poster
Rethinking Generative Large Language Model Evaluation for Semantic Comprehension
Fangyun Wei, Xi Chen, Lin Luo
ICML 2024, poster