NEURIPS 2025 "zero-shot evaluation" Papers
2 papers found
Disentanglement Beyond Static vs. Dynamic: A Benchmark and Evaluation Framework for Multi-Factor Sequential Representations
Tal Barami, Nimrod Berman, Ilan Naiman et al.
NEURIPS 2025, poster, arXiv:2510.17313
2 citations
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
Chandler Smith, Marwa Abdulhai, Manfred Díaz et al.
NEURIPS 2025, oral, arXiv:2512.03318
4 citations