
EKBA - Enterprise Knowledge-Base Assistant

EKBA is a RAG-style chatbot application, based on the open-source OPEA project, that provides enterprise-grade Q&A service using specified private data as context.

Table of Contents

  • Overview
  • Deployment
  • Usage
  • Troubleshooting

Overview

Chatbots are one of the most widely adopted use cases for leveraging the powerful chat and reasoning capabilities of large language models (LLMs). The retrieval-augmented generation (RAG) architecture is quickly becoming the industry standard for chatbot development. It combines the benefits of a knowledge base (via a vector store) and generative models to reduce hallucinations, maintain up-to-date information, and leverage domain-specific knowledge.

RAG bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that generated responses remain factual and current. At the core of this architecture are vector databases, which are instrumental in enabling efficient semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
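As a minimal illustration of the semantic retrieval that RAG relies on, the toy Python sketch below ranks documents by cosine similarity between embedding vectors. Real systems use a learned embedding model and a vector database instead of the hand-written 3-dimensional vectors shown here.

```python
# Toy sketch of semantic retrieval: documents and the query are
# represented as vectors, and the most similar documents are returned
# by cosine similarity (the operation a vector database performs at scale).
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, doc_vecs, k=2):
    # Rank documents by similarity to the query; return the top-k indices.
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]),
                    reverse=True)
    return ranked[:k]

docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query = [1.0, 0.05, 0.0]
print(retrieve(query, docs))  # → [0, 1]
```

In EKBA, the embedding, retriever, and reranking services together play the role of this function, with the `k`, `score_threshold`, and `top_n` API parameters controlling how many candidates are fetched and kept.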

Deployment

This guide describes the deployment process for setting up the complete system, which consists of two steps:

  1. LLM Inference Service: Provides the language model capabilities for generating responses
  2. Knowledge Base Services: Handles document processing, storage, and retrieval

Note: By default, EKBA is configured with the knowledge base running on CPU, while LLM inference is deployed on HPU (Habana Gaudi). If you need to use different hardware configurations, please modify the corresponding Docker or Helm configuration files and images accordingly.

Deployment Steps

(Optional) Step 1: Deploy LLM Inference Service

Deploy the LLM inference service, which by default runs on Intel Gaudi HPU. For detailed deployment instructions, refer to deployment/llm-serving/README.md. In most cases, however, an LLM inference service is already available, so users need only configure its endpoint.

Step 2: Deploy Knowledge Base Services

Deploy the knowledge base services which include all the components needed for document processing, storage, and retrieval. You can choose either Docker Compose (deployment/docker-compose/README.md) or Helm deployment (deployment/helm-charts/README.md).

Usage

1. Data Ingestion

To start using the system, first ingest your documents using the dataprep service:

curl -X POST \
    -H "Content-Type: multipart/form-data" \
    -F "files=@<your-file-absolute-path>" \
    -F "collection_name=<your-collection-name>" \
    http://<dataprep-service-ip>:<dataprep-service-port>/v1/dataprep

Replace the following placeholders:

  • <your-file-absolute-path>: File's absolute path you want to ingest
  • <your-collection-name>: Name for your document collection
  • <dataprep-service-ip>: IP address of your dataprep service
  • <dataprep-service-port>: Port number of your dataprep service

For more details about dataprep usage, please refer to src/comps/dataprep/README.md.
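For scripted ingestion, the same request can be built in plain Python. The sketch below constructs the multipart/form-data body that `curl -F` produces, using only the standard library; the function name and arguments are illustrative, while the endpoint path `/v1/dataprep` and the `files`/`collection_name` field names match the curl example above.

```python
# Hedged sketch: build the same dataprep upload request in Python,
# stdlib only. Host, port, and file contents are placeholders.
import uuid
import urllib.request

def build_dataprep_request(host, port, filename, file_bytes, collection_name):
    # Assemble a multipart/form-data body with a "files" file part and a
    # "collection_name" text part, mirroring the curl -F flags above.
    boundary = uuid.uuid4().hex
    body = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'
    ).encode() + file_bytes + (
        f'\r\n--{boundary}\r\n'
        f'Content-Disposition: form-data; name="collection_name"\r\n\r\n'
        f'{collection_name}\r\n'
        f'--{boundary}--\r\n'
    ).encode()
    return urllib.request.Request(
        f'http://{host}:{port}/v1/dataprep',
        data=body,
        headers={'Content-Type': f'multipart/form-data; boundary={boundary}'},
    )

# Sending is a one-liner once the request is built:
#   with urllib.request.urlopen(req) as resp: print(resp.read())
```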

2. Knowledge Base Question Answering

You can interact with the EKBA (Enterprise Knowledge Base Assistant) system in two ways:

Option A: Web UI

Access the system through your web browser:

http://<ui-service-ip>:<ui-port>  # Default port is 5174, may vary based on your deployment configuration

Option B: REST API

For Docker Compose Deployment
curl http://<host-ip>:8888/v1/chatqna \
    -H "Content-Type: application/json" \
    -d '{
        "messages": "What is the revenue of Nike in 2023?",
        "k": 10,
        "score_threshold": 0.5,
        "top_n": 3,
        "max_tokens": 16384
    }'
For Kubernetes Deployment

Get the service IP and port for the chatqna service:

# Get chatqna backend service details
export SERVICE_IP=$(kubectl get svc ekba-chatqna -n <your-namespace> -o jsonpath='{.spec.clusterIP}')
export SERVICE_PORT=$(kubectl get svc ekba-chatqna -n <your-namespace> -o jsonpath='{.spec.ports[0].port}')

# Use the service endpoint
curl http://${SERVICE_IP}:${SERVICE_PORT}/v1/chatqna \
    -H "Content-Type: application/json" \
    -d '{
        "messages": "What is deep learning?",
        "k": 10,
        "score_threshold": 0.5,
        "top_n": 3,
        "max_tokens": 16384
    }'

API Parameters:

  • messages: Your question
  • k: Number of initial search results (default: 10)
  • score_threshold: Minimum similarity score threshold (default: 0.5)
  • top_n: Number of final results to return (default: 3)
  • max_tokens: Maximum number of tokens in the response (default: 0)
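The request above can also be assembled programmatically. The stdlib-only Python sketch below builds the chatqna URL and JSON payload; defaults mirror the parameter list above, except that `max_tokens` follows the example requests (16384) rather than the stated default, and actually sending the request is left to the caller.

```python
# Hedged sketch: build the chatqna request URL and JSON body.
# The endpoint path /v1/chatqna and field names match the curl examples.
import json

def build_chatqna_request(host, port, question,
                          k=10, score_threshold=0.5, top_n=3,
                          max_tokens=16384):
    # Returns (url, serialized JSON payload) ready for any HTTP client.
    url = f"http://{host}:{port}/v1/chatqna"
    payload = {
        "messages": question,
        "k": k,
        "score_threshold": score_threshold,
        "top_n": top_n,
        "max_tokens": max_tokens,
    }
    return url, json.dumps(payload)
```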

3. Individual Service Documentation

For detailed API documentation of specific services, refer to their respective README files:

Each service's documentation includes detailed API specifications and usage examples.

Troubleshooting

Enable Debug Logging

To assist with troubleshooting, you can enable detailed logging for all individual services (dataprep, embedding, reranking, retriever, and llm services) by setting LOG_LEVEL=DEBUG before deployment:

  • For Docker Compose deployment: Set in set-env.sh or the .env file in the working directory
  • For Helm deployment: Set in ekba-values.yaml file
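For the Docker Compose case, the setting is a single environment-file line, for example (file location per the bullet above; exact key names may vary with your deployment files):

```sh
# In set-env.sh or .env in the working directory
LOG_LEVEL=DEBUG
```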

Service Validation Tests

Under the scripts/tester directory, run docker-compose up to launch the test-case runner tool, which verifies that each EKBA service works correctly. The test cases are defined in tests.json in the same directory.

About

Enterprise knowledge base RAG question-answering system (企业知识库 RAG 问答系统)
