Poster "automatic evaluation" Papers
3 papers found
Beyond correlation: The impact of human uncertainty in measuring the effectiveness of automatic evaluation and LLM-as-a-judge
Aparna Elangovan, Lei Xu, Jongwoo Ko et al.
ICLR 2025, poster, arXiv:2410.03775
22 citations
Multimodal LLMs as Customized Reward Models for Text-to-Image Generation
Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.
ICCV 2025, poster, arXiv:2507.21391
7 citations
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
Xueyu Hu, Ziyu Zhao, Shuang Wei et al.
ICML 2024, poster, arXiv:2401.05507