Guardrails
Input and output filters that prevent unsafe, off-topic, or out-of-policy model behavior.
Last updated: April 26, 2026
Definition
Guardrails are the safety net around your LLM. Input guardrails catch prompt injection, off-topic queries, and PII leakage before a request reaches the model; output guardrails check generated text for harmful content, hallucinations, and policy violations before it reaches the user. Implementations range from simple regex filters to dedicated services such as AWS Bedrock Guardrails, which adds content filters, denied topics, prompt attack detection, and PII redaction. At a minimum, a production system should have input validation and output sanitization.
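A minimal sketch of the two-sided pattern, assuming regex-based checks only (the pattern lists and injection markers below are illustrative placeholders; production systems typically layer in classifier models or a managed service):

```python
import re

# Illustrative PII patterns -- a real deployment would use a much
# larger set, or a dedicated PII-detection service.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN format
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

# Naive injection markers; real prompt-attack detection is classifier-based.
INJECTION_MARKERS = ["ignore previous instructions", "reveal your system prompt"]

def check_input(prompt: str) -> tuple[bool, str]:
    """Input guardrail: return (allowed, reason) before the model is called."""
    lowered = prompt.lower()
    for marker in INJECTION_MARKERS:
        if marker in lowered:
            return False, f"possible prompt injection: {marker!r}"
    for pat in PII_PATTERNS:
        if pat.search(prompt):
            return False, "PII detected in input"
    return True, "ok"

def sanitize_output(text: str) -> str:
    """Output guardrail: redact PII the model may have echoed back."""
    for pat in PII_PATTERNS:
        text = pat.sub("[REDACTED]", text)
    return text

if __name__ == "__main__":
    print(check_input("My SSN is 123-45-6789"))            # blocked on input
    print(sanitize_output("Contact alice@example.com now")) # redacted on output
```

The key design point is symmetry: the input check runs before any model call (cheap rejection), while the output check runs on every generation, since even a clean prompt can produce an out-of-policy response.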
When To Use
Required for any production agent. Skipping guardrails is among the most common causes of public AI incidents.
Building with Guardrails?
I've shipped this pattern in real production systems. If you want a second pair of eyes on your architecture, that's what I do.