๐Ÿ›ก๏ธ NAIL Institute โ€” AVE Database

โ† Back to Database

Clever Hans Effect

๐ŸŸก MEDIUM alignment proven AVE-2025-0010

ยท aka: Social Cue Sensitivity, Unintentional Cueing

Summary

Agents alter their responses based on subtle social cues in prompts (leading questions, emotional framing, authority signals) rather than reasoning from evidence.

Blast Radius

Unreliable reasoning. Outputs reflect prompter's bias, not evidence.

Prerequisites

Natural language interaction with implied expectations.

Environment

  • Frameworks: LangGraph
  • Models tested: [Available in NAIL SDK]
  • Multi-agent: No
  • Tools required: No
  • Memory required: No