ICLR Poster "evaluation metrics" Papers
2 papers found
CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models
Song Wang, Peng Wang, Tong Zhou et al.
ICLR 2025 | poster | arXiv:2407.02408
13 citations
MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks
Jiacheng Chen, Tianhao Liang, Sherman Siu et al.
ICLR 2025 | poster | arXiv:2410.10563
28 citations