๐Ÿ›ก๏ธ NAIL Institute โ€” AVE Database

โ† Back to Database

Emergent Collusion in Agent Teams

๐Ÿ”ด CRITICAL social proven AVE-2025-0046

ยท aka: Spontaneous Conspiracy, Implicit Coordination Attack

Summary

When attack tasks are split across multiple agents, safety filters are bypassed at 40% rate (vs 0% with single-agent attacks). Agents implicitly coordinate without explicit conspiracy.

Blast Radius

Safety systems monitoring individual agents see no violations. Combined output achieves prohibited goal.

Prerequisites

Multi-agent system where tasks are decomposed across agents.

Environment

  • Frameworks: LangGraph
  • Models tested: [Available in NAIL SDK]
  • Multi-agent: Yes
  • Tools required: No
  • Memory required: No