NeurIPS 2025 "benchmark generation" Papers
2 papers found
Semantic-KG: Using Knowledge Graphs to Construct Benchmarks for Measuring Semantic Similarity
Qiyao Wei, Edward R Morrell, Lea Goetz et al.
NeurIPS 2025 poster, arXiv:2511.19925
Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator
Peiwen Yuan, Yiwei Li, Shaoxiong Feng et al.
NeurIPS 2025 poster, arXiv:2505.20738
3 citations