Retrieval-Augmented Generation (RAG)
Definition
Retrieval-Augmented Generation (RAG) is an approach in which a system retrieves relevant external information at inference time and places it in a language model's context, so that the generated response is grounded in the retrieved data.
Purpose
The purpose of RAG is to improve factual accuracy and domain relevance of model outputs by supplementing the model's internal knowledge with up-to-date or domain-specific external information.
Key Characteristics
- Retrieval of external documents or data sources based on a query or context
- Injection of retrieved content into the model's context window before generation
- Dependence on retrieval quality for output accuracy
- Separation between information retrieval and text generation stages
- Stateless operation across individual model calls unless combined with memory mechanisms
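The characteristics above can be illustrated with a minimal sketch, assuming a small in-memory corpus and a toy token-overlap score in place of a real search index or embedding model. All names here (`DOCUMENTS`, `retrieve`, `build_prompt`, `generate`) are hypothetical, and `generate` is a placeholder for whatever language model call a real system would make.

```python
import re
from collections import Counter

# Hypothetical corpus; a real system would query a search index or vector store.
DOCUMENTS = [
    "The warranty period for the X200 router is 24 months.",
    "Firmware updates for the X200 are released quarterly.",
    "The office cafeteria opens at 8 am on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Toy relevance score: number of query tokens also found in the document."""
    q = Counter(tokenize(query))
    d = Counter(tokenize(document))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query, documents, k=2):
    """Retrieval stage: return up to k documents with a non-zero score."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return [doc for doc in ranked[:k] if score(query, doc) > 0]


def build_prompt(query, passages):
    """Injection stage: place the retrieved passages ahead of the question."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


def generate(prompt):
    """Generation stage: placeholder standing in for a language model call."""
    return f"<model output for a prompt of {len(prompt)} characters>"


def answer(query):
    """Each call is stateless: it retrieves, builds a prompt, and generates."""
    passages = retrieve(query, DOCUMENTS)
    return generate(build_prompt(query, passages))
```

Calling `answer("What is the warranty period for the X200 router?")` retrieves the warranty passage first and passes it to the generator inside the prompt. Because `generate` only sees what `retrieve` returns, a weak scoring function directly limits the quality of the answer.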
Usage in Practice
RAG is used to answer knowledge-intensive questions, ground model responses in proprietary or current data, and reduce hallucinations by supplying explicit reference material at inference time.
One implementation of this concept is offered by Kenaz through the Semantic Engineering service.
