
An introductory manual for developing a Retrieval Augmented Generation (RAG) application from the ground up | Authored by Bill Chambers.

Retrieval Augmented Generation (RAG) has recently been gaining attention because it lets large language models such as OpenAI’s GPT-4 work with a user’s own data. The technique adds that data, fetched by a retrieval tool, to the prompt passed to the language model, which then generates an output. This has several benefits: the prompt can include facts that help the language model avoid hallucinations, responses to user queries can refer to sources of truth, and the model can draw on data it may not have been trained on.

A RAG system is essentially composed of three main components:

1. A collection of documents (known formally as a corpus)
2. An input from the user
3. A similarity measure between the collection of documents and the user input

To implement a simple RAG system, one would follow these steps in sequence:

1. Receive a user input
2. Apply the similarity measure to identify the most closely related document(s)
3. Post-process the user input and the retrieved document(s) with a language model
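The three steps above can be sketched as a single function. This is a minimal sketch rather than a definitive implementation: the names `rag_answer`, `similarity`, and `generate` are hypothetical, and both the similarity measure and the language model are passed in as plain callables so that any implementation can be plugged in. The two stand-in functions exist only to make the example runnable.

```python
from typing import Callable, Sequence


def rag_answer(
    user_input: str,
    corpus: Sequence[str],
    similarity: Callable[[str, str], float],
    generate: Callable[[str, str], str],
) -> str:
    """Run the three RAG steps for one user input (illustrative sketch)."""
    # Step 1: the user input arrives as the `user_input` argument.
    # Step 2: score every document and keep the most similar one.
    best_document = max(corpus, key=lambda doc: similarity(user_input, doc))
    # Step 3: hand the input and the retrieved document to the language model.
    return generate(user_input, best_document)


# Stand-ins for illustration only: a crude shared-word count and a fake
# "model" that simply echoes what it was given.
def shared_word_count(a: str, b: str) -> float:
    return float(len(set(a.lower().split()) & set(b.lower().split())))


def echo_model(user_input: str, document: str) -> str:
    return f"input={user_input!r}; retrieved={document!r}"


corpus = ["Go for a hike in the hills", "Read a book at home"]
result = rag_answer("I want to read a book", corpus, shared_word_count, echo_model)
```

Keeping retrieval and generation behind simple function arguments means either one can be replaced later without touching the pipeline itself.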

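A simple RAG application can start with a small corpus of ‘documents’, for example a handful of short sentences. Jaccard similarity can then measure how closely the user input matches each document in the corpus: each text is split into a set of words, and the score is the size of the intersection of the two sets divided by the size of their union.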
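A minimal sketch of Jaccard-based retrieval might look like the following. The corpus sentences, the function names, and the tokenization (lowercasing, splitting on whitespace, stripping a few punctuation marks) are assumptions made for illustration, not details taken from the tutorial.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (word.strip(".,!?;:") for word in text.lower().split())
    return {word for word in words if word}


def jaccard_similarity(query: str, document: str) -> float:
    """Size of the word-set intersection divided by the size of the union."""
    query_words = tokenize(query)
    document_words = tokenize(document)
    union = query_words | document_words
    if not union:  # both texts are empty: avoid dividing by zero
        return 0.0
    return len(query_words & document_words) / len(union)


def most_similar_document(query: str, corpus: list[str]) -> str:
    """Return the document with the highest Jaccard score for the query."""
    return max(corpus, key=lambda doc: jaccard_similarity(query, doc))


# Made-up example documents.
corpus_of_documents = [
    "Take a leisurely walk in the park and enjoy the fresh air.",
    "Visit a local museum and discover something new.",
    "Attend a live music concert and feel the rhythm.",
]

best = most_similar_document("I enjoy a walk in the park", corpus_of_documents)
```

Because Jaccard similarity only compares exact words, it misses synonyms and paraphrases, which is one reason the similarity measure is a natural thing to replace later.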

The next step in building the RAG system is post-processing the retrieved document with a large language model (LLM). An open source LLM such as llama2 can be used here. The user input and the most similar document found by the similarity measure are passed together as input to the LLM.
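One way to combine the two inputs is a prompt template, sketched below. The template wording and the function names are hypothetical. The second function assumes that llama2 is being served locally by Ollama at its default address, which the text above does not specify; any other way of running the model would need a different call.

```python
import json
import urllib.request

# Hypothetical template: the wording is an assumption and is worth tuning.
PROMPT_TEMPLATE = """Use the retrieved document to respond to the user input.
If the document is not relevant, say so instead of guessing.

Retrieved document: {document}
User input: {user_input}
Response:"""


def build_prompt(user_input: str, document: str) -> str:
    """Insert the user input and the retrieved document into the template."""
    return PROMPT_TEMPLATE.format(document=document, user_input=user_input)


def generate_with_ollama(
    prompt: str,
    model: str = "llama2",
    url: str = "http://localhost:11434/api/generate",  # assumed local Ollama server
) -> str:
    """Send the prompt to a local Ollama server and return the generated text."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    request = urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


prompt = build_prompt(
    user_input="I enjoy a walk in the park",
    document="Take a leisurely walk in the park and enjoy the fresh air.",
)
# With a server running, the answer would come from: generate_with_ollama(prompt)
```

Building the prompt in its own function keeps it easy to inspect and adjust without touching the model call.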

The LLM post-processing stage is one of several places where the basic RAG application can be improved. Options include changing the LLM used, adjusting the prompt given to the LLM, enlarging or improving the quality of the document collection, changing the similarity measure, or adding a mechanism for detecting and handling harmful or toxic output.
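As one illustration of changing the similarity measure, the sketch below uses cosine similarity over word counts, which takes repeated words into account. This is a swapped-in technique chosen for illustration, not one the tutorial prescribes, and the function name and whitespace tokenization are assumptions.

```python
import math
from collections import Counter


def cosine_similarity(query: str, document: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    query_counts = Counter(query.lower().split())
    document_counts = Counter(document.lower().split())
    # Dot product over the query's words; a Counter returns 0 for missing words.
    dot = sum(count * document_counts[word] for word, count in query_counts.items())
    query_norm = math.sqrt(sum(c * c for c in query_counts.values()))
    document_norm = math.sqrt(sum(c * c for c in document_counts.values()))
    if query_norm == 0 or document_norm == 0:  # an empty text has no direction
        return 0.0
    return dot / (query_norm * document_norm)


def most_similar_document(query: str, corpus: list[str]) -> str:
    """Return the document with the highest cosine score for the query."""
    return max(corpus, key=lambda doc: cosine_similarity(query, doc))


best = most_similar_document(
    "live music", ["visit a local museum", "attend a live music concert"]
)
```

The same drop-in pattern would apply to embedding-based measures: only the scoring function changes, while the retrieval step stays the same.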

The possibilities for optimization are vast and will be discussed in future tutorials. Building a RAG application from scratch is a great way to learn how the system works before turning to libraries for scaling.
