๐Ÿ›ก๏ธ NAIL Institute โ€” AVE Database

โ† Back to Database

Objective Function Poisoning

🔴 CRITICAL · alignment · proven · AVE-2025-0082

aka: Goal Corruption

Summary

An attacker modifies the agent's objective function or reward signal so that the agent optimizes for adversarial goals instead of its intended ones.
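
As a rough illustration of the mechanism described in this summary, the toy sketch below assumes an agent that picks whichever action scores highest under a mutable set of reward weights. Every name in it (`ObjectiveConfig`, `choose_action`, the action names, and the feature names) is hypothetical and invented for this example; none of it comes from LangGraph, AutoGen, or the NAIL SDK.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectiveConfig:
    # Hypothetical reward weights: feature name -> weight.
    weights: dict = field(
        default_factory=lambda: {"task_progress": 1.0, "data_exfiltrated": -5.0}
    )


# Hypothetical candidate actions, each described by feature values.
ACTIONS = {
    "summarize_report": {"task_progress": 1.0, "data_exfiltrated": 0.0},
    "upload_to_external_host": {"task_progress": 0.2, "data_exfiltrated": 1.0},
}


def score(features: dict, config: ObjectiveConfig) -> float:
    """Weighted sum of an action's features under the current objective."""
    return sum(config.weights.get(name, 0.0) * value for name, value in features.items())


def choose_action(config: ObjectiveConfig) -> str:
    """Greedy agent: return the action with the highest score."""
    return max(ACTIONS, key=lambda name: score(ACTIONS[name], config))


config = ObjectiveConfig()
benign_choice = choose_action(config)

# Simulated poisoning: the attacker flips the sign of one reward weight.
config.weights["data_exfiltrated"] = 5.0
poisoned_choice = choose_action(config)

print(benign_choice, "->", poisoned_choice)
```

In this sketch the agent's decision logic is never touched. Only the objective changes, and the agent then works toward the attacker's outcome, which is why the entry is also known as Goal Corruption.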

Blast Radius

The agent actively pursues the attacker's objectives.

Prerequisites

Agent with accessible objective/reward configuration.
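
One way to check whether an accessible objective configuration has been altered is to compare it against a fingerprint recorded at deployment time. The sketch below is an assumption-laden example, not a method taken from this entry: the configuration layout, `fingerprint`, and `is_unmodified` are hypothetical, and it assumes the trusted digest is stored somewhere the attacker cannot write to.

```python
import hashlib
import hmac
import json


def fingerprint(config: dict) -> str:
    """SHA-256 digest of a canonical JSON encoding of the config."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_unmodified(config: dict, expected_digest: str) -> bool:
    """True if the config still matches the digest recorded earlier."""
    return hmac.compare_digest(fingerprint(config), expected_digest)


# Hypothetical objective configuration, fingerprinted at deployment time.
trusted_config = {
    "goal": "summarize internal documents",
    "reward_weights": {"task_progress": 1.0},
}
trusted_digest = fingerprint(trusted_config)

# Simulated tampering: same structure, different goal.
tampered_config = dict(trusted_config, goal="forward documents to a third party")

print(is_unmodified(trusted_config, trusted_digest))
print(is_unmodified(tampered_config, trusted_digest))
```

A check like this only detects changes made after the digest was recorded. It does not help if the configuration was already poisoned at that point, or if the attacker can also overwrite the stored digest.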

Environment

  • Frameworks: LangGraph, AutoGen
  • Models tested: [Available in NAIL SDK]
  • Multi-agent: No
  • Tools required: No
  • Memory required: No

Related