Retrieval-Augmented Generation (RAG)

Definition

Retrieval-Augmented Generation (RAG) is an approach in which a system retrieves relevant external information at inference time and inserts it into a language model's context, so that the generated response is grounded in the retrieved data.

Purpose

The purpose of RAG is to improve the factual accuracy and domain relevance of model outputs by supplementing the knowledge stored in the model's parameters with up-to-date or domain-specific external information.

Key Characteristics

Retrieval of external documents or data sources based on a query or context
Injection of retrieved content into the model's context window before generation
Dependence on retrieval quality for output accuracy
Separation between information retrieval and text generation stages
Stateless operation across individual model calls unless combined with memory mechanisms
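
The characteristics above can be illustrated with a minimal sketch, which keeps retrieval and generation as separate stages and injects the retrieved passages into the prompt. Everything in it is an assumption made for illustration: the toy corpus `DOCUMENTS`, the bag-of-words cosine scoring (production systems typically use dense embeddings and a vector index), and the `generate` function, which is a hypothetical placeholder for a real model call.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus standing in for an external knowledge source.
DOCUMENTS = [
    "The refund window for hardware purchases is 30 days.",
    "Support is available on weekdays from 9:00 to 17:00.",
    "Firmware updates are published every quarter.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Retrieval stage: return up to k documents with non-zero similarity."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, Counter(tokenize(doc))), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Injection step: place the retrieved passages ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def generate(prompt):
    """Generation stage: hypothetical placeholder for a language model call."""
    return f"(model output for a prompt of {len(prompt)} characters)"


def answer(query, documents=DOCUMENTS, k=2):
    """Stateless pipeline: each call retrieves and generates from scratch."""
    passages = retrieve(query, documents, k)
    return generate(build_prompt(query, passages))
```

Because `answer` keeps no state between calls, a follow-up question would trigger retrieval again from scratch. If `retrieve` returns poor passages, the prompt carries poor evidence, which is why output accuracy depends on retrieval quality.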

Usage in Practice

In practice, RAG is used to answer knowledge-intensive questions, ground model responses in proprietary or current data, and reduce hallucinations by providing explicit reference material during inference.

One implementation of this concept is offered by Kenaz through the Semantic Engineering service.
