AAAI 2024 "benchmark evaluation" Papers
3 papers found
Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark
Fangjun Li, David C. Hogg, Anthony G. Cohn
AAAI 2024paperarXiv:2401.03991
51
citations
Benchmarking Large Language Models in Retrieval-Augmented Generation
Jiawei Chen, Hongyu Lin, Xianpei Han et al.
AAAI 2024paperarXiv:2309.01431
458
citations
RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting
Lei Shu, Liangchen Luo, Jayakumar Hoskere et al.
AAAI 2024paperarXiv:2305.15685
76
citations