Abstract
Vision Language Model (VLM) agents are stateful, autonomous entities that perceive and interact with their environments through vision and language. Multi-agent systems comprise specialized agents that collaborate to solve a complex task. A core security property is robustness: the system maintains its integrity under adversarial attacks. Multi-agent systems lack robustness, because a successful exploit against one agent can spread and infect other agents, undermining the integrity of the entire system. We propose Cowpox, a defense that provably enhances the robustness of a multi-agent system through a distributed mechanism that improves the recovery rate of agents and thereby limits the expected number of infections passed on to other agents. The core idea is to generate and distribute a special cure sample that immunizes an agent against the attack before exposure. We demonstrate the effectiveness of Cowpox empirically and provide theoretical robustness guarantees.
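The abstract frames exploit propagation in epidemiological terms (infection, recovery rate, immunization with a cure sample). As a rough intuition only, the toy simulation below is our own simplified SIR-style sketch, not the Cowpox mechanism or the paper's analysis; all parameter names and values are illustrative assumptions. It shows how raising the recovery rate or pre-immunizing agents reduces the expected number of compromised agents.

```python
import random

def simulate_spread(n_agents=50, contacts=3, p_transmit=0.4, p_recover=0.2,
                    immunized_frac=0.0, steps=200, trials=500, seed=0):
    """Toy epidemic-style simulation of an exploit spreading between agents.

    Illustrative assumptions (not from the paper): each compromised agent
    contacts `contacts` random peers per step and transmits the exploit with
    probability `p_transmit`; compromised agents recover with probability
    `p_recover`; immunized ("cured") agents cannot be infected.
    Returns the mean number of agents ever compromised.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Pre-immunize a random subset of agents (agent 0 stays susceptible).
        immune = set(rng.sample(range(1, n_agents),
                                int(immunized_frac * n_agents)))
        infected, ever = {0}, {0}  # agent 0 is initially compromised
        for _ in range(steps):
            if not infected:
                break
            new_infected = set()
            for agent in infected:
                for peer in rng.sample(range(n_agents), contacts):
                    if (peer not in immune and peer not in ever
                            and rng.random() < p_transmit):
                        new_infected.add(peer)
                if rng.random() < p_recover:
                    continue  # agent recovers and stops spreading
                new_infected.add(agent)
            infected = new_infected
            ever |= new_infected
        total += len(ever)
    return total / trials

# Both pre-distributed immunization and a higher recovery rate shrink the
# expected number of compromised agents relative to the baseline.
print(simulate_spread(immunized_frac=0.0))   # baseline spread
print(simulate_spread(immunized_frac=0.6))   # majority immunized
print(simulate_spread(p_recover=0.8))        # faster recovery
```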