Welcome to RaFa, an open-source project that makes it easy to run a private GPT over your enterprise knowledge base.
- RaFa: An Open-Source Framework for Enterprise AI
- A private GPT for your enterprise knowledge base
- A RAG framework built on top of open-source LLMs and frameworks
- A private GPT that runs securely on your cloud or on-premise infrastructure
- A RAG framework that grows continuously with the latest trends in AI
It's designed to be production-ready, so you can deploy RaFa-powered applications on your cloud or on-premise environments. No sweat!
For non-technical users, simply deploy RaFa on your infrastructure and let the AI agents automatically select the optimal architecture for your needs.
For tech-savvy users, customize your experience by tweaking the config file to your needs. With RaFa, flexibility meets simplicity, letting you focus on your specific use cases without unnecessary complexity.
RaFa's architecture is designed to be modular and flexible, allowing you to easily integrate new frameworks and layers as needed.
Ready to take RaFa for a spin? Check out the installation and usage instructions in the documentation. It's as easy as pie!
Pre-requisites:

RaFA is built with Python 3.8 or higher and uses Poetry for dependency management. Make sure you have Python installed on your system; you can download it from the official Python website.

- Python 3.8 or higher
- Pip (optional)
- Poetry
Clone the repository:

Note: You can also download it directly from GitHub.

```bash
git clone https://github.com/rajnavakoti/rafa.git
```
Download and install Ollama:

For macOS and Windows users, you can download the latest version of Ollama from the official website.

For Linux users, you can install with a single command:

```bash
curl -fsSL https://ollama.com/install.sh | sh
```
Install your model:

You can install your model of choice using Ollama commands; the list of available models is on the Ollama website. By default, RaFA is set up for the LLama3_8b:Instruct model.

```bash
ollama run <model>
```
Setup environment:

Run RaFA:

To run RaFA locally, use the following command from the root folder of the project:

Note: If you run into permission issues while running the script, run `chmod +x run_rafa.sh` first to give the script execute permissions.

```bash
./run_rafa.sh
```
Pre-requisites:

RaFA is built with Python 3.8 or higher and uses Poetry for dependency management. Make sure you have Python installed on your system; you can download it from the official Python website.

- Python 3.8 or higher
- Pip (optional)
- Poetry
- Docker
Clone the repository:

Note: You can also download it directly from GitHub.

```bash
git clone https://github.com/rajnavakoti/rafa.git
```
Setup environment:

Run docker-compose:

To run RaFA using Docker, use the following command from the root folder of the project:

```bash
docker-compose up
```
Install your model:

You can install your model of choice using Ollama commands inside the Ollama Docker container. The list of available models is on the Ollama website. By default, RaFA is set up for the LLama3_8b:Instruct model.

1. Get the Ollama container ID:

   ```bash
   docker ps
   ```

2. Get a shell inside the Ollama container:

   ```bash
   docker exec -it <container_id> /bin/bash
   ```

3. Install the model:

   ```bash
   ollama run <model>
   ```
Use Streamlit to interact with RaFA:

Once docker-compose is up and running, you can access the Streamlit app by navigating to the following URL in your browser:

http://localhost:8501

You can now interact with RaFA using the Streamlit app. Have fun!
You can change the default environment setup by changing the values in the .env file in the root folder. The following are the default values in the .env file:
- Default Environment Variables
| Variable | Description | Default | Options |
|---|---|---|---|
| IS_DOCKER | Whether to run RaFA locally or with Docker | commented | comment, uncomment |
| STORAGE_TYPE | Storage backend for your knowledge base | chromadb | in-memory, chromadb (vector), Neo4j (graph) |
| CHUNK_SIZE | Maximum size of each final document chunk (in characters) | 256 | |
| CHUNK_OVERLAP | How much overlap there should be between consecutive chunks | 5 | |
| EMBED_MODEL | Text embedding model | bge-small-en-v1.5 | Any model supported by llama-index |
| LLM_MODEL | Your LLM | llama3:instruct_8b | Any model supported by Ollama |
| CHAT_MEOMRY_TOKEN_LIMIT | How much chat history to include in your prompt | 3900 tokens | |
| OLLAMA_SERVER_TIMEOUT | How long RaFA waits for a response from the Ollama server | 90 sec | |
| CHAT_MODE | llama-index option for your chat engine | condense_plus_context | best, condense_question, context, condense_plus_context, simple, react, openai |
You can change these values to suit your needs. Once you've made the changes, run RaFA using the instructions provided above.
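To illustrate how CHUNK_SIZE and CHUNK_OVERLAP interact, here is a minimal character-based chunking sketch. This is not RaFA's actual splitter (llama-index's splitters are token- and sentence-aware); it only shows how the two settings shape the resulting chunks.

```python
from typing import List

def chunk_text(text: str, chunk_size: int = 256, chunk_overlap: int = 5) -> List[str]:
    """Split text into chunks of at most chunk_size characters,
    repeating chunk_overlap characters between consecutive chunks."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 600, chunk_size=256, chunk_overlap=5)
print(len(chunks), len(chunks[0]))  # -> 3 256
```

Larger overlap preserves more context across chunk boundaries at the cost of storing (and embedding) more redundant text.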
Note: You are also free to add or change any environment variables as needed. Just make sure to update the .env file accordingly.
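A minimal sketch of how such .env values might be consumed in code, assuming the variables are exported into the process environment (RaFA's actual loading logic may use python-dotenv and differ; `get_setting` is a hypothetical helper):

```python
import os

def get_setting(name: str, default: str) -> str:
    """Hypothetical helper: read a setting from the environment,
    falling back to the defaults listed in the table above."""
    return os.environ.get(name, default)

storage_type = get_setting("STORAGE_TYPE", "chromadb")
chunk_size = int(get_setting("CHUNK_SIZE", "256"))
chunk_overlap = int(get_setting("CHUNK_OVERLAP", "5"))
llm_model = get_setting("LLM_MODEL", "llama3:instruct_8b")

print(storage_type, chunk_size, chunk_overlap, llm_model)
```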
If you want to change the default prompt templates, edit the default_prompt_templates.py file in the prompts directory under the rafa folder. Prompt templates depend on the model you are using.
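To illustrate the idea, here is a hypothetical shape such a template file could take. The keys, wording, and `render_prompt` helper below are invented for illustration; the real default_prompt_templates.py in the repo will differ, and real templates must match the chat format the model was trained on.

```python
# Hypothetical prompt templates keyed by model family (illustrative only).
DEFAULT_PROMPT_TEMPLATES = {
    "llama3": (
        "You are a helpful enterprise assistant.\n"
        "Context:\n{context}\n\n"
        "Question: {question}\nAnswer:"
    ),
    "phi3": (
        "<|user|>\nUse the context below.\n{context}\n{question}\n<|assistant|>\n"
    ),
}

def render_prompt(model_family: str, context: str, question: str) -> str:
    """Fill the model-specific template with retrieved context and the query."""
    template = DEFAULT_PROMPT_TEMPLATES[model_family]
    return template.format(context=context, question=question)

prompt = render_prompt("llama3", "RaFa is a RAG framework.", "What is RaFa?")
```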
1. Use smaller models for testing:

When you're testing RaFA, it's a good idea to use smaller models to speed things up. You can change the model in the .env file to a smaller one to get faster results.

By default, RaFA uses the llama3:instruct_8b model, which is large. With a good local machine configuration, you can use this model without issues. However, if you run into performance problems, you can switch to a smaller model such as llama3:instruct_1b, llama3:instruct_2b, or Phi-3 Mini (a 3.8B-parameter model).
2. Check your Docker settings:
If you're running RaFA using Docker, make sure you have enough resources allocated to the Docker container. You can adjust the resources in the Docker settings to allocate more CPU and memory to the container.
3. Docker cache issues:
If you're running into cache issues while building the Docker container, you can try running the following command to clear the cache:
```bash
docker-compose build --no-cache
```
This will rebuild the Docker container without using the cache, which can help resolve cache-related issues.
Alternatively, you can stop all the containers and prune the system to remove all unused containers and images. You can do this by running the following commands:
```bash
docker compose down
docker system prune -a
```
4. Use the Streamlit app for easy interaction:
The Streamlit app provides an easy-to-use interface for interacting with RaFA. You can use the app to input your prompts and get responses from RaFA in real-time. It's a great way to test RaFA and see how it performs with different prompts.
Now that everything is set up, you can start using RaFa and building your AI. Here's the way to go.
The RaFa framework comes with a default AI app, 'friday'. It's a simple Streamlit web app that helps you get started with RaFa.
You can access the Friday app by navigating to the following URL in your browser: http://localhost:8501
Once you're in the app, you can interact with RaFa by entering your prompt in the input box and clicking the 'Ask' button. RaFa will generate a response based on your prompt and display it in the app. You can keep asking questions and getting responses in real time. It's like having your own private GPT at your fingertips!
You can also upload your data (documents) using the 'create collection' and 'add document' buttons. This will allow RaFa to generate responses based on the created knowledge base.
The knowledge base is the heart of the RaFa framework: it's where you store the documents and data RaFa uses to generate responses to your queries.
By default, the framework uses chromadb for storing documents. You can upload your documents using the 'create collection' and 'add document' options in the Friday app.
Note: Please see the chromadb documentation for a better understanding of the database, collection, and document relationship.
All file types supported by llama-index (.csv, .docx, .epub, .hwp, .ipynb, .jpeg, .jpg, .mbox, .md, .mp3, .mp4, .pdf, .png, .ppt, .pptm, .pptx) are supported by RaFA.
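For example, you could pre-filter a folder for supported files before uploading. A simple stdlib sketch (the extension list mirrors the one above; `supported_files` is a hypothetical helper, not part of RaFA):

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {
    ".csv", ".docx", ".epub", ".hwp", ".ipynb", ".jpeg", ".jpg", ".mbox",
    ".md", ".mp3", ".mp4", ".pdf", ".png", ".ppt", ".pptm", ".pptx",
}

def supported_files(paths):
    """Keep only paths whose extension (case-insensitive) RaFA can ingest."""
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS]

print(supported_files(["report.PDF", "notes.md", "archive.zip"]))
# -> ['report.PDF', 'notes.md']
```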
Now you are ready to chat with Friday. You have two toggle options to choose from:

Chat Mode: Use the 'chat mode' toggle to switch between QA and chat:
- QA: For asking questions and getting answers; no chat history is included in the prompt.
- chat: Chat history is included in the prompt, and Friday responds based on it.

Using the knowledge base: Use the 'Use knowledge base' toggle to control whether Friday uses your knowledge base to answer your questions.
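The difference between the two chat modes can be sketched as a difference in prompt construction. This is a conceptual illustration only, not RaFA's internal code:

```python
from typing import List, Tuple

def build_prompt(question: str,
                 history: List[Tuple[str, str]],
                 chat_mode: bool) -> str:
    """QA mode ignores history; chat mode prepends prior (user, assistant) turns."""
    if not chat_mode:
        return f"Question: {question}\nAnswer:"
    lines = [f"User: {q}\nAssistant: {a}" for q, a in history]
    lines.append(f"User: {question}\nAssistant:")
    return "\n".join(lines)

history = [("Hi", "Hello! How can I help?")]
qa = build_prompt("What is RAG?", history, chat_mode=False)
chat = build_prompt("What is RAG?", history, chat_mode=True)
```

In a real system the history would also be trimmed to a token budget (compare the CHAT_MEOMRY_TOKEN_LIMIT setting above) so the prompt fits the model's context window.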
This is the fun part you've been waiting for. Sorry for the long wait; it's important to understand the basics before you start building your AI.

Building your own AI app is simple plug-and-play. Technically, interaction with the framework starts at the HTTP proxy server, which is the entry point.

If you are looking for customizations, check the websocket server code in the rafa/websocket_server.py file. You can customize it to suit your needs and build your own AI app.
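To illustrate the entry-point idea, here is a tiny stdlib HTTP server that accepts a posted prompt and returns a JSON reply. This is purely a sketch under invented names (`PromptHandler`, `serve_once`); RaFA's actual proxy and websocket server live in the repo and behave differently — a real handler would forward the prompt into the RAG pipeline instead of echoing it.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class PromptHandler(BaseHTTPRequestHandler):
    """Hypothetical entry point: POST a prompt, receive a JSON response."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        prompt = self.rfile.read(length).decode("utf-8")
        # A real server would call the RAG pipeline here; we just echo.
        body = json.dumps({"response": f"echo: {prompt}"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

def serve_once(port: int = 0) -> HTTPServer:
    """Bind the handler on localhost (port 0 = pick a free port)."""
    return HTTPServer(("127.0.0.1", port), PromptHandler)
```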
RaFa is designed to be cloud-ready, so you can easily deploy it on your favorite cloud provider. You can use the same Docker setup to deploy RaFa in the cloud. Just follow the instructions provided above, and you'll be up and running in no time!
Sample deployment scripts for popular cloud providers like AWS, GCP, and Azure are coming soon. Stay tuned for more updates!
We are continuously working on improving RaFa and adding new features. Here are some of the features we are working on:
- Advanced RAG: more advanced RAG practices that improve document embedding, retrieval, AI reasoning, self-reflection, and more.
- Kick-ass UI: Friday deserves a better design, and we are working on it!
- Multi-model framework support
RaFa is an open-source project, and we love getting help from our community. It's like a potluck dinner: the more, the merrier! Check out the contribution guidelines for more information.
RaFa is licensed under MIT. For more details, please refer to the LICENSE file in the repository. It's like the rule book of our potluck dinner.
So, what are you waiting for? Dive in and start exploring RaFa today!







