Home
RAG Core, a simple and lightweight framework for building production-ready Retrieval Augmented Generation applications.
RAG Core is an internal IPD framework providing an off-the-shelf solution for implementing and deploying Retrieval Augmented Generation applications.
🔋 Batteries-included:
- Production-ready: Dockerized and compatible with Infopro Digital production environment standards
- API: Built-in API, automatically generated and specified following the OpenAPI standard
- Demo app: Static applications enabling stakeholders to interact with your app from the earliest stage of development
- Easy Configuration: Build an app simply by configuring YAML files
- Advanced customisation: Versatile implementation compatible with the entire Langchain ecosystem
- Batch: Supports batch processing of documents and user queries
- Streaming: All apps provide answer streaming capabilities
Built on the shoulders of giants. RAG Core relies on reference Python packages such as:
- 🦜🔗 Langchain: The world's leading LLM framework, with an ecosystem of 100+ integrations
- ⚡️ FastAPI: The Python reference for API management
- 🔰 Pydantic : The most widely used data validation library for Python
What is it for?
RAG-Core facilitates the two main aspects of RAG application development:
- ⚙️ Document processing: Transform raw, unstructured documents from any available source into a queryable format (vectors)
- 💬 Answer Generation: Generate answers tuned to your users' expectations and supercharged with your proprietary data.
Both components are highly customizable and compatible with most of the Infopro Digital ecosystem.
Prerequisites
Knowledge
- A general understanding of Large Language Models and Retrieval Augmented Generation techniques
- Going through (at least) the basic Langchain tutorials
Recommended resources
- ⭐️ 📄 RAG-Core concepts page
- ⭐️ 📖 Langchain Documentation
- 📄 Gentle introduction to Large Language Models
- 📖 Cohere LLM University
- 📖 Pinecone Manuals: Especially the resources about RAG and FAISS
- 🎥 Paper explained: Attention Is All You Need: The seminal paper introducing the Transformer neural network architecture, from which all large language models (LLMs) are derived.
⭐️: Truly, these are a must-read
Environment
- Python 3.12 or above
- Poetry
Installation
Set up your environment variables
Duplicate the .env file template
Fill it with your secrets 🗝️.
Success
🎉 You're all set
Minimal Example
Info
The following section is intended to illustrate the core parts of a RAG application built with RAG-Core. For a more detailed introduction, please refer to our Quickstart or our Examples section.
1. Create a Resource
In RAG-Core, documents are stored in a Resource. It is a key component at the root of the Retrieval Augmented Generation approach.
VectorResource is a subtype of Resource which associates each document with its semantic representation (a vector).
Here we will create a LocalVectorResource, which stores the documents and their associated vectors in your local memory (RAM).
from langchain_openai import OpenAIEmbeddings
from rag.resources.local import LocalVectorResource

resource = LocalVectorResource(
    name="documents_vector_store",
    embeddings_model=OpenAIEmbeddings(),
)
Persistence
Persist your data in your local file storage with:
LocalVectorResource(
    name="documents_vector_store",
    embeddings_model=OpenAIEmbeddings(),
    persistent=True,
    index_path="your/path/index.json",
)
Your data will be automatically loaded at your next initialisation.
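In the Langchain ecosystem, documents are represented as Document objects. Assuming RAG-Core follows this convention, here is how you might prepare a couple of documents to ingest in the next step; the sample content is purely illustrative, and in practice documents would come from your own sources and loaders.
from langchain_core.documents import Document

# Illustrative documents; in practice they would come from your own sources.
docs = [
    Document(
        page_content="Infopro Digital publishes professional information services.",
        metadata={"source": "example.txt"},
    ),
    Document(
        page_content="Retrieval Augmented Generation grounds LLM answers in your own documents.",
        metadata={"source": "example.txt"},
    ),
]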
2. Create an EmbeddingsPipeline
The following code shows how to build a minimal embeddings pipeline.
This pipeline takes the raw documents and embeds them without any preprocessing.
The embedding vectors are stored locally in your memory (RAM).
from rag.core.embeddings import EmbeddingsPipeline

pipeline = EmbeddingsPipeline(
    vectorstore=resource,
)
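As a minimal usage sketch, embedding the example documents from step 1 could look like the following; the run method name is hypothetical and may differ in your RAG-Core version.
# Hypothetical call: embed the example documents and store the resulting
# vectors in the LocalVectorResource. Check the EmbeddingsPipeline API for
# the exact method name and signature.
pipeline.run(docs)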
Info
The quality of your retrieval will mostly depend on your embeddings.
Improving it can require additional transformations.
EmbeddingsPipeline allows you to implement custom document transformers in order to perform more advanced processing before embedding your documents.
Adding your custom processing to the pipeline ensures reproducible and seamless document vector creation, which is mandatory for production use cases.
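As a sketch of such a customisation, a Langchain text splitter could serve as a document transformer. The splitter below is a standard Langchain component, but the transformers argument is an assumption about the EmbeddingsPipeline signature, not a documented parameter.
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split long documents into overlapping chunks before embedding them.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

pipeline = EmbeddingsPipeline(
    vectorstore=resource,
    transformers=[splitter],  # hypothetical parameter name, shown for illustration
)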
3. Create a RAG
A RAG is an answer generation pipeline that collects documents from a list of Resources and answers user questions or prompts according to the retrieved information.
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
from rag.core.generation import RAG
template = """"You are a helpful assistant that provide answers based on a list of document.
If you don't know the answer, just say that you don't know"
Context: {context}
Question:
{question}"""
llm = OpenAI()
prompt = PromptTemplate.from_template(template)
rag = RAG(
resources=[resource],
generation_chain= llm | prompt
)
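As a usage sketch, querying the RAG could then look like the following; the answer method name is an assumption, so check the RAG class for the exact call.
# Hypothetical call: retrieve relevant documents from the resources and
# generate an answer with the configured chain.
response = rag.answer("What does RAG-Core provide out of the box?")
print(response)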
Custom Generation Chains
Our examples can help you navigate more advanced customisation:
- Quickstart
- RAG based on USN edito
We also recommend checking out the Langchain documentation on how to build custom answer generation chains.
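For instance, a plain Langchain (LCEL) chain that parses the model output into a string can be passed as the generation chain. The sketch below only uses standard Langchain components and reuses the template and resource defined above.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# A standard LCEL chain: prompt -> chat model -> plain string output.
chain = PromptTemplate.from_template(template) | ChatOpenAI() | StrOutputParser()

rag = RAG(
    resources=[resource],
    generation_chain=chain,
)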
4. What's next?
You can learn how to improve and run your pipeline by checking out our Quickstart.
We hope you will enjoy it!