Tracing
Recording every step an agent takes (LLM calls, tool calls, memory reads, routing decisions) into a structured trace for debugging and audit.
Last updated: April 26, 2026
Definition
A trace is the full execution graph of one agent run: every LLM call with its prompt and response, every tool invocation with its input and output, every memory retrieval, every routing decision. Each trace lives in an observability platform (Langfuse, LangSmith, Phoenix, Datadog) and is queryable by user, session, error class, or latency percentile. Without traces, debugging an agent that "sometimes does the wrong thing" is detective work without evidence. With traces, you reproduce the failing run in seconds.
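To make the structure concrete, here is a minimal sketch of what a trace object might hold. The `Span` and `Trace` classes, field names, and `record` method are all hypothetical, not the schema of any of the platforms named above:

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class Span:
    """One recorded step: an LLM call, tool call, memory read, or routing decision."""
    kind: str          # "llm" | "tool" | "memory" | "routing"
    name: str          # e.g. the model name or tool name
    input: str
    output: str = ""
    started_at: float = field(default_factory=time.time)
    ended_at: float = 0.0

@dataclass
class Trace:
    """The full execution graph of one agent run, keyed by a trace ID."""
    trace_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    user_id: str = ""
    session_id: str = ""
    spans: list = field(default_factory=list)

    def record(self, kind: str, name: str, input: str, output: str) -> Span:
        """Append one completed step to the trace."""
        span = Span(kind=kind, name=name, input=input, output=output)
        span.ended_at = time.time()
        self.spans.append(span)
        return span

# Record each step as the agent runs, then query the trace afterwards.
trace = Trace(user_id="u123", session_id="s456")
trace.record("llm", "plan", "user asks for weather", "call weather tool")
trace.record("tool", "get_weather", '{"city": "Oslo"}', '{"temp_c": 4}')

llm_spans = [s for s in trace.spans if s.kind == "llm"]
```

Real platforms add nesting (spans with parent spans) and token/cost accounting on top of this flat shape, but the core idea is the same: every step becomes a queryable record tied to one run.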
The hardest part of tracing is keeping it cheap. Full traces of every agent run can produce gigabytes per day. Two patterns help: tail-based sampling (trace 100 percent of failed runs, 5 to 10 percent of successful ones), and selective field capture (log inputs and outputs but truncate the system prompt that's already cached). For multi-agent systems, propagate a trace ID through every sub-agent so you can stitch the full conversation together at debug time.
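The three cost-control patterns above can be sketched as small helper functions. Everything here is hypothetical: the span dict fields, the `X-Trace-Id` header name, and the truncation length are illustrative choices, not any platform's API:

```python
import random

def should_keep_trace(failed: bool, success_rate: float = 0.05) -> bool:
    """Tail-based sampling: keep every failed run, a small slice of successes."""
    if failed:
        return True                        # trace 100 percent of failures
    return random.random() < success_rate  # trace ~5-10 percent of successes

def capture_fields(span: dict, prompt_prefix: int = 64) -> dict:
    """Selective field capture: keep inputs and outputs whole, but truncate
    the system prompt, which is identical (and already cached) across runs."""
    captured = dict(span)
    prompt = captured.get("system_prompt", "")
    if len(prompt) > prompt_prefix:
        captured["system_prompt"] = prompt[:prompt_prefix] + "...[truncated]"
    return captured

def sub_agent_headers(trace_id: str) -> dict:
    """Trace ID propagation: pass the parent's trace ID to every sub-agent
    so all spans can be stitched into one conversation at debug time."""
    return {"X-Trace-Id": trace_id}  # hypothetical header name
```

The sampling decision should be made once per run at ingest time, not per span, so a run is either fully traced or fully dropped; a half-sampled run is worse than none because it looks like evidence but isn't.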
When To Use
Wire tracing in from day one of any production agent. Retrofitting it after an incident is too late: the failing runs you most need to inspect were never recorded.
Related Terms
Building with Tracing?
I've shipped this pattern in real production systems. If you want a second pair of eyes on your architecture, that's what I do.