Papers by Yilun Zhou
2 papers found
BingoGuard: LLM Content Moderation Tools with Risk Levels
Fan Yin, Philippe Laban, Xiangyu Peng et al.
ICLR 2025 poster
14 citations
Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators
Yilun Zhou, Austin Xu, PeiFeng Wang et al.
ICML 2025 poster, arXiv:2504.15253