Tree-of-Thought (ToT)
Reasoning pattern that explores multiple branches at each step, evaluates them, and prunes weak branches before continuing.
Last updated: April 26, 2026
Definition
Tree-of-thought (ToT), introduced by Yao et al. in 2023, generalizes chain-of-thought (CoT) from a single linear path to a search tree. At each step the model generates several candidate next-thoughts, evaluates them (often by self-critique or by a separate model call), and continues only down the most promising branches. The result is closer to deliberate problem-solving than to streaming generation. ToT works best on tasks that have several plausible solution paths and where committing to the wrong one early is costly: math puzzles, creative planning, multi-step debugging, code architecture decisions.
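The generate, evaluate, and prune loop described above can be sketched as a small beam search. This is a minimal illustration, not the implementation from Yao et al.: the names `tree_of_thought`, `propose`, `score`, `is_solution`, and `beam_width` are hypothetical, and a toy arithmetic puzzle stands in for the model calls that would normally propose and evaluate thoughts.

```python
from typing import Callable, List, Optional, Tuple

# A "state" is a partial solution: the sequence of thoughts chosen so far.
State = Tuple[int, ...]


def tree_of_thought(
    root: State,
    propose: Callable[[State], List[State]],  # assumed: generates candidate next states
    score: Callable[[State], float],          # assumed: higher means more promising
    is_solution: Callable[[State], bool],
    beam_width: int = 3,
    max_depth: int = 5,
) -> Optional[State]:
    """Breadth-first search that keeps only the best `beam_width` branches per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every surviving branch into its candidate next steps.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        # Evaluate all candidates, best first.
        ranked = sorted(candidates, key=score, reverse=True)
        for state in ranked:
            if is_solution(state):
                return state
        # Prune: weak branches are dropped before the next step.
        frontier = ranked[:beam_width]
    return None


# Toy stand-ins for model calls: pick numbers from {1, 3, 5} that sum to TARGET.
TARGET = 11


def toy_propose(state: State) -> List[State]:
    return [state + (n,) for n in (1, 3, 5)]


def toy_score(state: State) -> float:
    return -abs(TARGET - sum(state))


def toy_is_solution(state: State) -> bool:
    return sum(state) == TARGET


result = tree_of_thought((), toy_propose, toy_score, toy_is_solution)
```

In a real system `propose` and `score` would each wrap one or more model calls, which is where the extra token cost comes from: every level pays for all candidates, not just the ones that survive pruning.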
The cost is steep: ToT typically uses 5 to 20 times as many tokens as CoT for the same task. In production, full ToT is rare; the more common pattern is "generate-and-rank": ask for 3 to 5 candidate answers in parallel, score them with a cheaper model or a rubric, and return the best one. That captures most of the ToT benefit at a fraction of the cost. Reserve full ToT for tasks where one wrong step ruins the whole answer, which is rare in business workflows and common in research and competitive programming.
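A minimal sketch of generate-and-rank, under stated assumptions: `generate_and_rank`, `generate`, and `score` are hypothetical names, the candidate drafts are canned strings standing in for model outputs, and the scorer is a simple keyword rubric rather than a cheaper model.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

RUBRIC = ("logs", "restart", "health check")  # hypothetical rubric keywords

DRAFTS = [
    "Restart the service.",
    "Check the logs, then restart the service and confirm the health check passes.",
    "Unknown.",
    "Check the logs.",
]


def generate_and_rank(
    prompt: str,
    generate: Callable[[str, int], str],  # assumed: one model call per candidate
    score: Callable[[str], float],        # assumed: cheaper model or rubric
    n: int = 4,
) -> str:
    """Generate n candidates in parallel, score each one, return the best."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda i: generate(prompt, i), range(n)))
    return max(candidates, key=score)


def toy_generate(prompt: str, seed: int) -> str:
    # Stand-in for a sampled model response; the seed selects a canned draft.
    return DRAFTS[seed % len(DRAFTS)]


def toy_score(answer: str) -> float:
    # Count how many rubric keywords the answer covers.
    text = answer.lower()
    return float(sum(keyword in text for keyword in RUBRIC))


best = generate_and_rank("The API is returning 502s. What should I do?", toy_generate, toy_score)
```

Unlike full ToT there is a single level of branching and no intermediate pruning, so the cost is roughly n generation calls plus n scoring calls.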
When To Use
Use ToT (or generate-and-rank) when the task has many plausible next steps at each point, when wrong early steps are expensive to recover from, and when the latency and token budget allow exploring several branches in parallel. Skip it for short transactional tasks.