ยท aka: Metaphorical Encoding, Safety Filter Bypass via Semantics
Adversarial inputs encode harmful instructions inside semantically benign language (gardening metaphors for SQL injection). Keyword-based safety filters see nothing.
Complete safety filter bypass. Agent executes encoded attack.
Agent with tool execution and keyword-based safety filter.