Jahanzaib
Safety & Guardrails

Least Privilege (Agent)

Security principle that an agent should have only the minimum permissions, tools, and data access needed for its current task.

Last updated: April 26, 2026

Definition

Least privilege is one of the oldest principles in security and one of the most under-applied in agent design. The principle: an agent should not have any tool, credential, or data access beyond what its current task requires. A customer support agent should not be able to write to billing tables. A research agent should not hold credentials to send email. Restricting the action space is the highest-leverage safety control because what an agent cannot do, it cannot do wrong. Many production agent failures (unauthorized refunds, accidental data deletion, leaked credentials) are least-privilege violations: the agent had a tool it should not have had, and a model error or an attack used it.
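As a rough illustration of the principle, here is a minimal deny-by-default sketch in Python. The tool functions, the AgentProfile class, and the call_tool dispatcher are all hypothetical names invented for this example and do not come from any particular agent framework. The idea is only that every tool call passes through one check against an explicit allowlist.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class ToolNotPermitted(Exception):
    """Raised when an agent calls a tool outside its allowlist."""


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical profile: an agent name plus the tools it may call."""
    name: str
    allowed_tools: FrozenSet[str]


# Hypothetical tool implementations (stand-ins for real integrations).
def lookup_order(order_id: str) -> str:
    return f"order {order_id}: shipped"


def update_billing(account_id: str, amount: float) -> str:
    return f"billed {account_id} {amount}"


def send_email(to: str, body: str) -> str:
    return f"sent to {to}"


TOOLS: Dict[str, Callable[..., str]] = {
    "lookup_order": lookup_order,
    "update_billing": update_billing,
    "send_email": send_email,
}

# The support agent can read orders but cannot touch billing or email.
SUPPORT_AGENT = AgentProfile("support", frozenset({"lookup_order"}))


def call_tool(profile: AgentProfile, tool_name: str, **kwargs) -> str:
    """Deny by default: anything not explicitly allowed is refused."""
    if tool_name not in profile.allowed_tools:
        raise ToolNotPermitted(f"{profile.name} may not call {tool_name}")
    return TOOLS[tool_name](**kwargs)
```

With this shape, a model error or injected instruction that asks the support agent to call update_billing fails at the dispatcher instead of reaching the billing system.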

Two production patterns help. First, scoped credentials: each tool gets its own narrowly scoped API key (for example, read-only database queries use one key, while writes use a separate key gated behind human approval). Second, dynamic tool exposure: only register the tools relevant to the current request class. A customer support agent handling a refund inquiry gets the refund tools; the same agent answering a general question does not see them. The exposure changes per turn, not per session. This shrinks the attack surface without reducing what the agent can do for legitimate requests.
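The two patterns can be sketched together in a few lines of Python. Everything here is an assumption made for illustration: the Tool class, the credential names, the request classes, and the keyword-based classify_request function (a real system would use a proper router or classifier). The sketch shows per-turn tool selection and a write tool that stays blocked until a human approves it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tool:
    name: str
    credential: str               # name of the scoped key this tool uses (hypothetical)
    needs_approval: bool = False  # write tools are gated behind a human


# Tools registered per request class, not per agent or per session.
REGISTRY: Dict[str, List[Tool]] = {
    "general": [Tool("search_kb", "kb_read_key")],
    "refund_inquiry": [
        Tool("search_kb", "kb_read_key"),
        Tool("lookup_order", "orders_read_key"),
        Tool("issue_refund", "refunds_write_key", needs_approval=True),
    ],
}


def classify_request(message: str) -> str:
    """Toy keyword classifier; a stand-in for a real request router."""
    return "refund_inquiry" if "refund" in message.lower() else "general"


def tools_for_turn(message: str) -> List[Tool]:
    """Re-evaluated on every turn; unknown classes fall back to the smallest set."""
    return REGISTRY.get(classify_request(message), REGISTRY["general"])


def execute(tool: Tool, approved: bool = False) -> str:
    """Run a tool with its own scoped credential, or block pending approval."""
    if tool.needs_approval and not approved:
        return f"{tool.name}: blocked, awaiting human approval"
    return f"{tool.name}: ran with {tool.credential}"
```

Because tools_for_turn is called on each message, a conversation that starts with a refund question and moves on to a general one loses the refund tools on the later turn.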

When To Use

Apply least privilege from initial agent design. It is much cheaper to design with least privilege than to retrofit it after a security incident.
