Voice & Audio

Interruption Handling

How a voice agent responds when the caller starts speaking while the agent is still talking (a barge-in event).

Last updated: April 26, 2026

Definition

Interruption handling (or barge-in handling) is what the agent does when the user talks over it. The right behavior in almost every case: the agent stops talking immediately, the in-progress TTS playback is canceled, and the agent listens. Bad interruption handling, where the agent finishes its sentence while the user is also speaking, makes the conversation feel robotic and frustrating. Modern voice frameworks (LiveKit Agents, Pipecat, Vapi) all handle this automatically with a side-channel VAD running on the user mic during TTS playback. When the side VAD detects user speech above a threshold, it cancels the TTS and resets the agent's state.

Two production gotchas. First, echo cancellation matters: if the agent's own audio is leaking back into the user mic, the side VAD will trigger an interruption against itself. Use AEC (acoustic echo cancellation) on the audio input or you will spend a week debugging phantom barge-ins. Second, the agent needs to know what part of its planned reply was actually spoken before the interruption. Otherwise, when the user says "wait, what was that price you mentioned?", the agent has no idea because it was interrupted before getting to the price. Track partial-spoken state; rebuild context from what actually played.

When To Use

Every voice agent must handle interruptions. Test specifically with users who interrupt: it surfaces echo cancellation bugs and partial-state bugs that smooth users will not.

Sources

Related Terms

Turn Detection

How a voice agent decides when the caller has stopped speaking and it is the age…

Silence Threshold

The duration of detected silence (typically 300 to 800ms) that triggers the agen…

Text-to-Speech (TTS)

Neural model that synthesizes natural-sounding speech audio from the LLM's text …

STT → LLM → TTS Pipeline

The three-stage architecture of every modern voice agent: speech to text, then l…

Building with Interruption Handling?

I've shipped this pattern in real production systems. If you want a second pair of eyes on your architecture, that's what I do.

Book a discovery call Browse more terms