Emotional Manipulation of Agent
🟡 MEDIUM
social
proven
AVE-2025-0094
· aka: Guilt Tripping, Urgency Exploitation
Summary
An agent's safety guardrails are weakened by emotional appeals, urgency claims, or guilt-inducing prompts.
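To make the three prompt styles named above concrete, here is a minimal Python sketch of a keyword heuristic that flags emotional appeals, urgency claims, and guilt-inducing phrasing in a prompt. The function name, the category names, and the cue phrases are all hypothetical and chosen for illustration; they are not part of this entry or of the NAIL SDK, and a heuristic this simple would miss paraphrased attacks.

```python
import re

# Hypothetical cue phrases, for illustration only; not taken from this entry.
CUE_PATTERNS = {
    "urgency": re.compile(
        r"\b(urgent(ly)?|emergency|right now|immediately|no time)\b",
        re.IGNORECASE,
    ),
    "guilt": re.compile(
        r"\b(your fault|you will be responsible|if you (really )?cared)\b",
        re.IGNORECASE,
    ),
    "emotional_appeal": re.compile(
        r"\b(i'm begging|i am desperate|my life depends)\b",
        re.IGNORECASE,
    ),
}


def flag_emotional_pressure(prompt: str) -> list[str]:
    """Return the sorted names of cue categories that match the prompt."""
    return sorted(
        name for name, pattern in CUE_PATTERNS.items() if pattern.search(prompt)
    )


if __name__ == "__main__":
    sample = "This is an emergency, and it will be your fault if she dies."
    print(flag_emotional_pressure(sample))  # ['guilt', 'urgency']
```

A flagged prompt is not necessarily malicious, since genuine emergencies use the same language. The sketch only marks inputs where an agent might apply its safety checks with extra care rather than relax them.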
Blast Radius
Safety guardrails can be bypassed when a request is framed with emotional context.
Prerequisites
Agent with RLHF-tuned model.
Environment
- Frameworks: LangGraph
- Models tested: [Available in NAIL SDK]
- Multi-agent: No
- Tools required: No
- Memory required: No