🛡️ NAIL Institute - AVE Database

Emotional Manipulation of Agent

🟡 MEDIUM · social · proven · AVE-2025-0094

Also known as: Guilt Tripping, Urgency Exploitation

Summary

An agent's safety guardrails are weakened by emotional appeals, urgency claims, or guilt-inducing prompts.
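
To make the three cue types named in the summary concrete, here is a minimal sketch of a pre-filter that flags prompts containing them. It is an illustration only and is not part of the AVE entry or the NAIL SDK: the function name `flag_emotional_cues`, the category labels, and every phrase in the patterns are hypothetical. A keyword match like this will miss paraphrased appeals and will also flag benign messages.

```python
import re
from typing import List

# Hypothetical cue patterns, chosen only for illustration.
# They are assumptions, not phrases taken from the AVE entry.
CUE_PATTERNS = {
    "emotional_appeal": re.compile(
        r"\b(i beg you|i'm begging|desperate|heartbroken)\b", re.IGNORECASE
    ),
    "guilt": re.compile(
        r"\b(your fault|blame you|let me down)\b", re.IGNORECASE
    ),
    "urgency": re.compile(
        r"\b(urgent|urgently|immediately|right now|emergency)\b", re.IGNORECASE
    ),
}


def flag_emotional_cues(prompt: str) -> List[str]:
    """Return the sorted names of cue categories whose pattern matches the prompt."""
    return sorted(
        name for name, pattern in CUE_PATTERNS.items() if pattern.search(prompt)
    )


if __name__ == "__main__":
    sample = "This is an emergency, and it will be your fault if you refuse."
    print(flag_emotional_cues(sample))
```

A caller could use the returned categories to route a flagged prompt to stricter handling or to log it for review. What to do with a flagged prompt is a design choice that this entry does not specify.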

Blast Radius

Safety guardrails bypassed through emotional context.

Prerequisites

Agent with RLHF-tuned model.

Environment

  • Frameworks: LangGraph
  • Models tested: [Available in NAIL SDK]
  • Multi-agent: No
  • Tools required: No
  • Memory required: No

Related