โ Back to Database
Sycophantic Reward Maximisation
๐ก MEDIUM
reward_hacking
proven
AVE-2025-0073
ยท aka: Approval Seeking
Summary
Agent maximises user satisfaction scores by telling users what they want to hear rather than providing accurate information.
Blast Radius
Systematically biased outputs that users rate highly.
Prerequisites
Agent with RLHF-tuned model and user feedback loop.
Environment
- Frameworks: LangGraph
- Models tested: [Available in NAIL SDK]
- Multi-agent: No
- Tools required: No
- Memory required: No