Home
RAG Core, a simple and lightweight framework for building production-ready Retrieval Augmented Generation applications.
RAG Core is an internal IPD framework providing an off-the-shelf solution for implementing and deploying Retrieval Augmented Generation applications.
🔋 Batteries-included:
- Production-ready: Dockerized and compatible with Infopro Digital production environment standards
- API: Built-in API, automatically generated and specified following the OpenAPI standard
- Demo app: Static applications enabling stakeholders to interact with your app from the earliest stage of development
- Easy Configuration: Build an app simply by configuring YAML files
- Advanced customisation: Versatile implementation compatible with the entire Langchain ecosystem
- Batch: Supports batch processing of documents and user queries
- Streaming: All apps provide answer streaming capabilities
Built on the shoulders of giants. RAG Core relies on reference Python packages such as:
- 🦜🔗 Langchain: The world's leading LLM framework, with an ecosystem of 100+ integrations
- ⚡️ FastAPI: The Python reference for API management
- 🔰 Pydantic : The most widely used data validation library for Python
What is it for?
RAG-Core facilitates the two main aspects of RAG application development:
- ⚙️ Document processing: Transform raw, unstructured documents from any available source into a queryable format (vectors)
- 💬 Answer Generation: Generate answers tuned to your users' expectations and supercharged with your proprietary data.
Both components are highly customizable and compatible with most of the Infopro Digital ecosystem.
Prerequisites
Knowledge
- A general understanding of Large Language Models and Retrieval Augmented Generation techniques
- Going through (at least) the basic Langchain tutorials
Recommended resources
- ⭐️ 📄 RAG-Core concepts page
- ⭐️ 📖 Langchain Documentation
- 📄 Gentle introduction to Large Language Models
- 📖 Cohere LLM University
- 📖 Pinecone Manuals: Especially the resources about RAG and FAISS
- 🎥 Paper explained: Attention Is All You Need: The seminal paper introducing the Transformer neural network architecture, from which all large language models (LLMs) are derived.
⭐️: Truly, these are a must-read
Environment
- Python 3.12 or above
- Poetry
Installation
Set up your environment variables
Duplicate the .env file template
Fill it with your secrets 🗝️.
Success
🎉 You're all set
Minimal Example
Info
The following section is intended to illustrate the core parts of a RAG application built with RAG-Core. For a more detailed introduction, please refer to our Quickstart or our Examples section.
1. Create a Resource
In RAG-Core, documents are stored in a Resource. It is a key component at the root of the Retrieval Augmented Generation approach.
VectorResource is a subtype of Resource which associates each document with its semantic representation (a vector).
Here we will create a LocalVectorResource, which stores the documents and their associated vectors in your local memory (RAM).
from langchain_openai import OpenAIEmbeddings
from rag.resources.local import LocalVectorResource

resource = LocalVectorResource(
    name="documents_vector_store",
    embeddings_model=OpenAIEmbeddings(),
)
Persistence
Persist your data in your local file storage with:
LocalVectorResource(
    name="documents_vector_store",
    embeddings_model=OpenAIEmbeddings(),
    persistent=True,
    index_path="your/path/index.json",
)
Your data will be automatically loaded at your next initialisation.
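In the Langchain ecosystem, documents are represented as Document objects. Assuming RAG-Core follows this convention, here is how you might prepare a couple of documents to ingest in the next step; the sample content is purely illustrative, and in practice documents would come from your own sources and loaders.
from langchain_core.documents import Document

# Illustrative documents; in practice they would come from your own sources.
docs = [
    Document(
        page_content="Infopro Digital publishes professional information services.",
        metadata={"source": "example.txt"},
    ),
    Document(
        page_content="Retrieval Augmented Generation grounds LLM answers in your own documents.",
        metadata={"source": "example.txt"},
    ),
]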
2. Create an EmbeddingsPipeline
The following code shows how to build a minimal embeddings pipeline.
This pipeline takes the raw documents and embeds them without any preprocessing.
The embedding vectors are stored locally in your memory (RAM).
from rag.core.embeddings import EmbeddingsPipeline

pipeline = EmbeddingsPipeline(
    vectorstore=resource,
)
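As a minimal usage sketch, embedding the example documents from step 1 could look like the following; the run method name is hypothetical and may differ in your RAG-Core version.
# Hypothetical call: embed the example documents and store the resulting
# vectors in the LocalVectorResource. Check the EmbeddingsPipeline API for
# the exact method name and signature.
pipeline.run(docs)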
Info
The quality of your retrieval will mostly depend on your embeddings.
Improving it can require additional transformations.
EmbeddingsPipeline allows you to implement custom document transformers in order to perform more advanced processing before embedding your documents.
Adding your custom processing to the pipeline ensures reproducible and seamless document vector creation, which is mandatory for production use cases.
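As a sketch of such a customisation, a Langchain text splitter could serve as a document transformer. The splitter below is a standard Langchain component, but the transformers argument is an assumption about the EmbeddingsPipeline signature, not a documented parameter.
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split long documents into overlapping chunks before embedding them.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

pipeline = EmbeddingsPipeline(
    vectorstore=resource,
    transformers=[splitter],  # hypothetical parameter name, shown for illustration
)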
3. Create a RAG
A RAG is an answer generation pipeline that collects documents from a list of Resources and answers user questions or prompts according to the retrieved information.
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
from rag.core.generation import RAG
template = """"You are a helpful assistant that provide answers based on a list of document.
If you don't know the answer, just say that you don't know"
Context: {context}
Question:
{question}"""
llm = OpenAI()
prompt = PromptTemplate.from_template(template)
rag = RAG(
resources=[resource],
generation_chain= llm | prompt
)
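As a usage sketch, querying the RAG could then look like the following; the answer method name is an assumption, so check the RAG class for the exact call.
# Hypothetical call: retrieve relevant documents from the resources and
# generate an answer with the configured chain.
response = rag.answer("What does RAG-Core provide out of the box?")
print(response)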
Custom Generation Chains
Our examples can help you navigate more advanced customisation:
- Quickstart
- RAG based on USN edito
We also recommend checking out the Langchain documentation on how to build custom answer generation chains.
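For instance, a plain Langchain (LCEL) chain that parses the model output into a string can be passed as the generation chain. The sketch below only uses standard Langchain components and reuses the template and resource defined above.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# A standard LCEL chain: prompt -> chat model -> plain string output.
chain = PromptTemplate.from_template(template) | ChatOpenAI() | StrOutputParser()

rag = RAG(
    resources=[resource],
    generation_chain=chain,
)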
4. What's next?
You can learn how to improve and run your pipeline by checking out our Quickstart.
We hope you will enjoy it!