AVE Taxonomy: 13 Attack Categories
Every vulnerability is classified by attack surface or failure domain.
The categories were derived from empirical observation of AI agent behaviour across
29 controlled experiments and 50,000+ adversarial simulations.
🏷️ alignment: Sycophancy, deceptive alignment, RLHF exploits (9 cards)
🏷️ consensus: Deadlock, paralysis, and group decision failures (2 cards)
🏷️ credential: Credential harvesting, secret exfiltration (2 cards)
🏷️ delegation: Shadow delegation, privilege escalation (2 cards)
🏷️ drift: Persona drift, language drift, goal drift (4 cards)
🏷️ fabrication: Hallucination, data fabrication (1 card)
🏷️ injection: Prompt injection, indirect injection, jailbreaks (4 cards)
AVE-2025-0033
🔴 critical
Jailbreak Chaining for Capability Escalation
🏷️ memory: Memory pollution, laundering, and poisoning attacks (5 cards)
AVE-2025-0034
🔴 critical
Federated Poisoning in Multi-Tenant Systems
🏷️ resource: Token embezzlement, EDoS, cost anomaly attacks (3 cards)
🏷️ social: Collusion, bystander effect, social loafing (4 cards)
🏷️ structural: Cascade corruption, routing deadlock (8 cards)
🏷️ temporal: Chronological desync, sleeper payloads (3 cards)
🏷️ tool: Confused deputy, tool chain exploits, MCP poisoning (3 cards)
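
The taxonomy above could be held in code roughly as follows. This is a minimal sketch, not an official AVE schema: the names `Category`, `Card`, `CARD_COUNTS`, `CARDS`, and `cards_in` are hypothetical. The category of each of the two cards is an assumption inferred from where the card appears in the listing.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The 13 attack categories named in the listing."""
    ALIGNMENT = "alignment"
    CONSENSUS = "consensus"
    CREDENTIAL = "credential"
    DELEGATION = "delegation"
    DRIFT = "drift"
    FABRICATION = "fabrication"
    INJECTION = "injection"
    MEMORY = "memory"
    RESOURCE = "resource"
    SOCIAL = "social"
    STRUCTURAL = "structural"
    TEMPORAL = "temporal"
    TOOL = "tool"


# Card counts copied from the listing (hypothetical container name).
CARD_COUNTS = {
    Category.ALIGNMENT: 9, Category.CONSENSUS: 2, Category.CREDENTIAL: 2,
    Category.DELEGATION: 2, Category.DRIFT: 4, Category.FABRICATION: 1,
    Category.INJECTION: 4, Category.MEMORY: 5, Category.RESOURCE: 3,
    Category.SOCIAL: 4, Category.STRUCTURAL: 8, Category.TEMPORAL: 3,
    Category.TOOL: 3,
}


@dataclass(frozen=True)
class Card:
    """One vulnerability card; the field names are illustrative."""
    ave_id: str
    severity: str
    title: str
    category: Category


# The two cards shown in the listing. Their categories are assumed
# from their position under "injection" and "memory" respectively.
CARDS = [
    Card("AVE-2025-0033", "critical",
         "Jailbreak Chaining for Capability Escalation", Category.INJECTION),
    Card("AVE-2025-0034", "critical",
         "Federated Poisoning in Multi-Tenant Systems", Category.MEMORY),
]


def cards_in(category: Category) -> list[Card]:
    """Return the known cards filed under the given category."""
    return [card for card in CARDS if card.category is category]
```

For example, `cards_in(Category.MEMORY)` returns the single card `AVE-2025-0034`, and `sum(CARD_COUNTS.values())` gives the total number of cards across all categories.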