Evaluation

Tree-of-Thought (ToT)

A reasoning pattern that explores multiple candidate branches at each step, evaluates them, and prunes the weak ones before continuing.

Last updated: April 26, 2026

Definition

Tree-of-thought (ToT), introduced by Yao et al. in 2023, generalizes chain-of-thought (CoT) from a single linear path to a search tree. At each step the model generates several candidate next thoughts, scores them (often by self-critique or by a separate model call), and continues only down the highest-scoring branches. The result is closer to deliberate problem-solving than to streaming generation. ToT helps most on tasks with several plausible solution paths where committing to the wrong one early is costly: math puzzles, creative planning, multi-step debugging, and code architecture decisions.
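The generate, score, prune loop can be sketched as a small beam search. This is a minimal illustration under stated assumptions, not the implementation from Yao et al.: the names `tree_of_thought`, `propose`, `score`, and `is_solution` are hypothetical, and the toy functions below stand in for what would be model calls in a real system. The toy task is to reach a sum of exactly 10 by adding 1, 3, or 4 at each step.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # a partial solution: the steps chosen so far


def tree_of_thought(
    root: State,
    propose: Callable[[State], List[State]],  # stand-in for "generate candidate thoughts"
    score: Callable[[State], float],          # stand-in for "evaluate a thought"
    is_solution: Callable[[State], bool],
    beam_width: int = 2,
    max_depth: int = 3,
) -> Optional[State]:
    """Breadth-first search that keeps only the best `beam_width` branches per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every surviving branch into its candidate next states.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_solution(candidate):
                return candidate
        # Prune: keep the highest-scoring branches and drop the rest.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return None


# Toy stand-ins (assumptions for illustration only; no model is called).
TARGET = 10
MOVES = (1, 3, 4)


def toy_propose(state: State) -> List[State]:
    # Extend the path by each move that does not overshoot the target.
    return [state + (m,) for m in MOVES if sum(state) + m <= TARGET]


def toy_score(state: State) -> float:
    # Closer to the target is better. This heuristic is deliberately greedy.
    return -(TARGET - sum(state))


def toy_is_solution(state: State) -> bool:
    return sum(state) == TARGET


greedy = tree_of_thought((), toy_propose, toy_score, toy_is_solution, beam_width=1)
branching = tree_of_thought((), toy_propose, toy_score, toy_is_solution, beam_width=2)
```

With a beam width of 1 the search commits to 4 then 4, which cannot reach 10 within three steps, so `greedy` is `None`. Keeping two branches per level lets the search recover through 4, 3, 3. The same trade-off applies with a real model: a wider beam costs more calls but tolerates a misleading early score.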

The cost is steep: ToT typically uses 5 to 20 times as many tokens as CoT for the same task. In production, full ToT is rare; the more common pattern is "generate-and-rank": request 3 to 5 candidate answers in parallel, score them with a cheaper model or a rubric, and return the best. That captures most of the ToT benefit at a fraction of the cost. Reserve full ToT for tasks where one wrong step ruins the whole answer, which is rare in business workflows and common in research and competitive programming.
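A generate-and-rank step might look like the following sketch. It assumes a `generate` callable (in practice a model call, here a stub that returns canned drafts) and a `score` callable (here a simple keyword rubric); both names, the drafts, and the rubric terms are hypothetical and chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def generate_and_rank(
    prompt: str,
    generate: Callable[[str, int], str],  # stand-in for a model call; int is a sample index
    score: Callable[[str, str], float],   # stand-in for a cheaper model or a rubric
    n: int = 4,
) -> str:
    """Draw n candidates in parallel and return the highest-scoring one."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves input order, so candidates[i] came from sample index i.
        candidates: List[str] = list(pool.map(lambda i: generate(prompt, i), range(n)))
    return max(candidates, key=lambda answer: score(prompt, answer))


# Stubs for illustration only (assumptions; replace with real model calls).
DRAFTS = [
    "Restart the service.",
    "Check the logs, then restart the service and confirm the health check passes.",
    "It depends.",
    "Check the logs.",
]


def fake_generate(prompt: str, index: int) -> str:
    return DRAFTS[index % len(DRAFTS)]


def rubric_score(prompt: str, answer: str) -> float:
    # One point for each required element the answer mentions.
    required = ("logs", "restart", "health check")
    return sum(term in answer.lower() for term in required)


best = generate_and_rank("The API returns 502s. What should I do?", fake_generate, rubric_score)
```

Unlike full ToT, there is a single level of branching and no intermediate pruning, so the cost is roughly n generations plus n scoring calls.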

When To Use

Use ToT (or generate-and-rank) when the task has a high branching factor, when wrong early steps are expensive to recover from, and when the latency budget permits parallel exploration. Skip it for short transactional tasks.

Related Terms

Chain-of-Thought (CoT)

Prompting technique where the model writes out intermediate reasoning steps befo…

Reflexion

Reasoning pattern where the agent critiques its own previous output and iterates…

Plan-and-Execute

Two-phase agent pattern: generate a complete multi-step plan first, then execute…

Self-Correction

An agent's ability to detect errors in its own outputs and revise them without e…
