"out-of-distribution evaluation" Papers
3 papers found
Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization
Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
CVPR 2025 | poster | arXiv:2503.03519
2 citations
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
Kaijing Ma, Xeron Du, Yunran Wang et al.
ICLR 2025 | poster | arXiv:2410.06526
53 citations
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
Shulin Huang, Linyi Yang, Yan Song et al.
NeurIPS 2025 | poster | arXiv:2502.16268
14 citations