"model capability assessment" Papers
3 papers found
BenchmarkCards: Standardized Documentation for Large Language Model Benchmarks
Anna Sokol, Elizabeth Daly, Michael Hind et al.
NeurIPS 2025 | poster | arXiv:2410.12974
2 citations
MIEB: Massive Image Embedding Benchmark
Chenghao Xiao, Isaac Chung, Imene Kerboua et al.
ICCV 2025 | poster | arXiv:2504.10471
6 citations
SynTSBench: Rethinking Temporal Pattern Learning in Deep Learning Models for Time Series
Qitai Tan, Yiyun Chen, Mo Li et al.
NeurIPS 2025 | oral | arXiv:2510.20273