
Advanced RAG

RAG enhanced with pre-retrieval (query rewriting, expansion) and post-retrieval (reranking, contextual filtering) optimization steps.

Last updated: April 26, 2026

Definition

Advanced RAG adds steps before and after the basic retrieve-and-generate flow. Pre-retrieval steps rewrite a vague user query into a more retrieval-friendly form, expand it into multiple variants, or decompose it into sub-queries. Post-retrieval steps rerank candidates with a cross-encoder model, filter them by metadata, deduplicate near-identical chunks, or compress long retrieved passages to fit the context window. Each step trades cost (more LLM calls, more compute, more latency) for retrieval quality. Advanced RAG is the standard production pattern as of 2026; naive RAG survives mainly in prototypes.
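To make the post-retrieval steps concrete, here is a minimal sketch of metadata filtering, near-duplicate removal, and a crude form of compression. The Chunk type, the Jaccard threshold, and the whitespace token count are illustrative assumptions, not part of any particular library; a real system would more likely use a proper tokenizer and an LLM-based or extractive compressor.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    score: float  # first-stage retrieval score, higher is better
    metadata: dict = field(default_factory=dict)


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two strings, in [0, 1]."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def post_retrieve(chunks, required_meta, dedup_threshold=0.8, token_budget=200):
    """Hypothetical post-retrieval stage: filter, deduplicate, fit a budget."""
    # 1. Metadata filter: keep chunks that match every required key/value.
    kept = [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in required_meta.items())
    ]
    # 2. Sort best-first so deduplication keeps the highest-scoring copy.
    kept.sort(key=lambda c: c.score, reverse=True)
    # 3. Drop chunks that are near-identical to one already kept.
    unique = []
    for c in kept:
        if all(jaccard(c.text, u.text) < dedup_threshold for u in unique):
            unique.append(c)
    # 4. Crude compression: stop at the first chunk that would exceed the budget.
    #    Whitespace-separated words stand in for real tokens (an assumption).
    out, used = [], 0
    for c in unique:
        n = len(c.text.split())
        if used + n > token_budget:
            break
        out.append(c)
        used += n
    return out
```

Calling post_retrieve(chunks, {"lang": "en"}) on a candidate list returns the surviving chunks in best-first order. The order of the steps is a design choice: filtering first keeps the pairwise deduplication cheap.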

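The two upgrades that pay back fastest in practice are query rewriting (a small LLM call rewrites the user's question into a better retrieval query before embedding) and reranking (retrieve the top 50 by vector similarity, then rerank to the top 5 with a cross-encoder). Together they typically improve recall@5 by 20 to 40 percent on benchmark RAG tasks, though the gain depends on the corpus and on how strong the baseline retriever already is. Anthropic's "Contextual Retrieval" technique (prepending a short, LLM-generated, chunk-specific context to each chunk before embedding and indexing it) helps further on certain workloads: Anthropic reports 35 percent fewer failed top-20 retrievals with contextual embeddings alone, and 49 percent fewer when they are combined with contextual BM25.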
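The rewrite-then-rerank flow can be sketched as a two-stage function. This is a toy under stated assumptions: bow_cosine is a bag-of-words stand-in for a real embedding model, and rewrite and rerank_score are hypothetical callables that a real system would back with a small LLM and a cross-encoder respectively.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Toy first-stage scorer: cosine similarity of bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_and_rerank(question, docs, rewrite, rerank_score,
                        first_stage_k=50, final_k=5):
    """Rewrite the question, fetch a wide candidate set, then rerank it."""
    # Pre-retrieval: turn the raw question into a retrieval-friendly query.
    query = rewrite(question)
    # First stage: cheap similarity over the whole corpus, keep a wide net.
    candidates = sorted(
        docs, key=lambda d: bow_cosine(query, d), reverse=True
    )[:first_stage_k]
    # Second stage: a more expensive scorer looks at each (query, doc) pair.
    reranked = sorted(
        candidates, key=lambda d: rerank_score(query, d), reverse=True
    )
    return reranked[:final_k]
```

The wide first stage exists because the reranker can only promote documents it is shown. The second stage stays affordable because it scores at most first_stage_k pairs instead of the whole corpus.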

When To Use

Move from naive to advanced RAG when retrieval quality plateaus. Start with reranking; it is the highest-leverage single addition.
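Spotting a plateau requires a retrieval metric tracked over a fixed evaluation set. A minimal sketch of recall@k follows, assuming each evaluation query comes with a hand-labeled set of relevant chunk IDs; the function names and data layout are illustrative.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top k retrieved."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(relevant & set(retrieved_ids[:k]))
    return hits / len(relevant)


def mean_recall_at_k(runs, k):
    """runs: list of (retrieved_ids, relevant_ids) pairs, one per eval query."""
    return sum(recall_at_k(r, rel, k) for r, rel in runs) / len(runs)
```

If mean recall@5 stops moving after tuning chunk size and the embedding model, that suggests the first-stage retriever has hit its ceiling and a reranker is worth trying. Comparing recall@50 against recall@5 gives a rough upper bound on what reranking the top 50 can recover.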

Related Terms

RAG (Retrieval-Augmented Generation)

Fetching relevant documents at query time and injecting them into the LLM prompt…

Naive RAG

The basic retrieve-then-generate RAG pattern: embed the query, fetch top-K chunk…

Agentic RAG

RAG where the LLM autonomously decides what to retrieve and when, instead of one…

Reranking

A second-stage model that re-orders retrieved chunks by true relevance, not just…

Hybrid Search

Search that combines dense vector similarity with traditional keyword (BM25) sco…
