Buffer Memory
Short-term memory pattern that stores the full conversation history verbatim until the token budget is hit.
Last updated: April 26, 2026
Definition
Buffer memory keeps every message of the conversation in the LLM context window unchanged. It is the simplest possible memory pattern: append each user message and assistant response to the messages array and re-send the whole thing on every turn. This works perfectly until the conversation grows past the context window, at which point it fails hard, typically with a context-length error from the API. Buffer memory is the right default for short conversations (under 20 turns), prototypes, and any agent where conversations rarely exceed a few thousand tokens.
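The pattern above can be sketched in a few lines. This is a minimal illustration, not a production implementation: `call_llm` is a hypothetical stand-in for whatever chat-completion client you use, and the message dicts follow the common `role`/`content` convention.

```python
def call_llm(messages):
    # Hypothetical placeholder: a real implementation would call your
    # model provider with the full messages list here.
    return f"(reply to: {messages[-1]['content']})"

class BufferMemory:
    """Keeps the entire conversation verbatim and replays it every turn."""

    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_input):
        # Append the new user turn, then re-send the WHOLE history.
        self.messages.append({"role": "user", "content": user_input})
        reply = call_llm(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Every turn grows `self.messages` by two entries and re-sends all of them, which is exactly why cost and latency climb linearly with conversation length.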
When To Use
Use buffer memory for prototypes and any agent where conversations are bounded and short. Switch to summary memory or a sliding window the moment conversation length is unpredictable.
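One way to decide when to switch is a simple budget guard. This is a rough sketch under stated assumptions: the token count is approximated as characters divided by four (a common heuristic, not an exact tokenizer), and the 8,000-token budget is an arbitrary example value.

```python
def estimate_tokens(messages):
    # Crude heuristic: ~4 characters per token for English text.
    return sum(len(m["content"]) for m in messages) // 4

def buffer_is_safe(messages, budget=8000):
    """True while the verbatim history still fits comfortably in the budget."""
    return estimate_tokens(messages) < budget
```

In practice you would use your provider's real tokenizer for the count, but even this crude check is enough to trigger a handoff to summary or sliding-window memory before the context window overflows.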
Building with Buffer Memory?
I've shipped this pattern in real production systems. If you want a second pair of eyes on your architecture, that's what I do.