Any time someone talks about “guardrails” on an AI prompt, it usually involves additional instructions in the prompt to “behave yourself”.

Atomic Object is a software consultancy specializing in custom software development and product design. Readers can learn about software development methodologies, user-centered design principles, and agile practices. With case studies, blog posts, and technical insights, Atomic Object provides  guidance and expertise for building innovative and user-friendly software products.

Atomic Spin

AI guardrails implemented as natural language instructions in prompts are fundamentally unreliable because LLMs have no equivalent to SQL's parameterized queries. Unlike SQL injection, which can be decisively fixed with parameterized queries, prompt injection is an intrinsic weakness of LLMs — all inputs and instructions share the same pipeline. Guardrails that rely on blacklisting bad behaviors or instructing the model to 'behave' are insufficient, as attacks can use hidden Unicode characters, HTML comments, or simple override instructions invisible to human reviewers but processed by the model.

Your AI “Guardrails” Are Just Suggestions