Papers by Jebish Purbey
2 papers found
Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia
Chandler Smith, Marwa Abdulhai, Manfred Díaz et al.
NeurIPS 2025, oral, arXiv:2512.03318
INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge
Angelika Romanou, Negar Foroutan, Anna Sotnikova et al.
ICLR 2025, poster