Papers by Noah Zheutlin
2 papers found
ITBench: Evaluating AI Agents across Diverse Real-World IT Automation Tasks
Saurabh Jha, Rohan Arora, Yuji Watanabe et al.
ICML 2025 (oral), arXiv:2502.05352
STRATUS: A Multi-agent System for Autonomous Reliability Engineering of Modern Clouds
Yinfang Chen, Jiaqi Pan, Jackson Clark et al.
NeurIPS 2025 (poster)
2 citations