Programming With LightRAG Core

If you want to integrate LightRAG into your project, we recommend using the REST API provided by the LightRAG Server. LightRAG Core is intended for embedded applications or researchers conducting studies and evaluations.

A Simple Program

import os
import asyncio
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete, gpt_4o_complete, openai_embed
from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

WORKING_DIR = "./rag_storage"
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)

async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    # IMPORTANT: storages must be initialized before the instance is used
    await rag.initialize_storages()  # Initialize storage backends
    return rag

async def main():
    rag = None  # Defined up front so the finally block is safe if initialization fails
    try:
        # Initialize RAG instance
        rag = await initialize_rag()
        await rag.ainsert("Your text")

        # Perform hybrid search
        mode = "hybrid"
        print(
          await rag.aquery(
              "What are the top themes in this story?",
              param=QueryParam(mode=mode)
          )
        )

    except Exception as e:
        print(f"An error occurred: {e}")
    finally:
        if rag:
            await rag.finalize_storages()

if __name__ == "__main__":
    asyncio.run(main())

Notes:

  • Export your OPENAI_API_KEY environment variable before running.
  • All data is persisted to WORKING_DIR.

Important:

LightRAG requires explicit initialization before use. You must call await rag.initialize_storages() after creating a LightRAG instance, otherwise you will encounter errors.
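
One way to make the initialize/finalize pairing hard to forget is to wrap it in an async context manager. The sketch below is only an illustration: `managed_rag` is a hypothetical helper, and `FakeRAG` is a stand-in that models nothing except the two lifecycle methods named above, so the block runs without LightRAG installed.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeRAG:
    """Stand-in for a LightRAG instance; only the lifecycle methods are modelled."""

    def __init__(self):
        self.events = []

    async def initialize_storages(self):
        self.events.append("initialized")

    async def finalize_storages(self):
        self.events.append("finalized")


@asynccontextmanager
async def managed_rag(factory):
    """Hypothetical helper: create, initialize, and always finalize an instance."""
    rag = factory()
    await rag.initialize_storages()
    try:
        yield rag
    finally:
        # Runs even if the body of the `async with` block raises
        await rag.finalize_storages()


async def demo():
    async with managed_rag(FakeRAG) as rag:
        rag.events.append("used")
    return rag.events


events = asyncio.run(demo())
print(events)  # ['initialized', 'used', 'finalized']
```

With a real instance, the factory would be a function that constructs `LightRAG(...)` with your own parameters.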

LightRAG Init Parameters

Parameters

| Parameter | Type | Explanation | Default |
| --- | --- | --- | --- |
| working_dir | str | Directory where the cache will be stored | lightrag_cache+timestamp |
| workspace | str | Workspace name for data isolation between different LightRAG instances | |
| kv_storage | str | Storage type for documents and text chunks. Supported types: JsonKVStorage, PGKVStorage, RedisKVStorage, MongoKVStorage, OpenSearchKVStorage | JsonKVStorage |
| vector_storage | str | Storage type for embedding vectors. Supported types: NanoVectorDBStorage, PGVectorStorage, MilvusVectorDBStorage, ChromaVectorDBStorage, FaissVectorDBStorage, MongoVectorDBStorage, QdrantVectorDBStorage, OpenSearchVectorDBStorage | NanoVectorDBStorage |
| graph_storage | str | Storage type for graph edges and nodes. Supported types: NetworkXStorage, Neo4JStorage, PGGraphStorage, AGEStorage, OpenSearchGraphStorage | NetworkXStorage |
| doc_status_storage | str | Storage type for document processing status. Supported types: JsonDocStatusStorage, PGDocStatusStorage, MongoDocStatusStorage, OpenSearchDocStatusStorage | JsonDocStatusStorage |
| chunk_token_size | int | Maximum token size per chunk when splitting documents | 1200 |
| chunk_overlap_token_size | int | Overlap token size between two chunks when splitting documents | 100 |
| tokenizer | Tokenizer | The object used to convert text into tokens (numbers) and back via .encode() and .decode(), following the TokenizerInterface protocol. If you don't specify one, the default Tiktoken tokenizer is used. | TiktokenTokenizer |
| tiktoken_model_name | str | If you're using the default Tiktoken tokenizer, this is the name of the specific Tiktoken model to use. This setting is ignored if you provide your own tokenizer. | gpt-4o-mini |
| entity_extract_max_gleaning | int | Number of loops in the entity extraction process, appending history messages | 1 |
| node_embedding_algorithm | str | Algorithm for node embedding (currently not used) | node2vec |
| node2vec_params | dict | Parameters for node embedding | {"dimensions": 1536, "num_walks": 10, "walk_length": 40, "window_size": 2, "iterations": 3, "random_seed": 3} |
| embedding_func | EmbeddingFunc | Function to generate embedding vectors from text | openai_embed |
| embedding_batch_num | int | Maximum batch size for embedding processes (multiple texts sent per batch) | 32 |
| embedding_func_max_async | int | Maximum number of concurrent asynchronous embedding processes | 16 |
| llm_model_func | callable | Function for LLM generation | gpt_4o_mini_complete |
| llm_model_name | str | LLM model name for generation | meta-llama/Llama-3.2-1B-Instruct |
| summary_context_size | int | Maximum tokens sent to the LLM to generate summaries for entity/relation merging | 10000 (configured by env var SUMMARY_CONTEXT_SIZE) |
| summary_max_tokens | int | Maximum token size for entity/relation description | 500 (configured by env var SUMMARY_MAX_TOKENS) |
| llm_model_max_async | int | Maximum number of concurrent asynchronous LLM processes | 4 (default value changed by env var MAX_ASYNC) |
| llm_model_kwargs | dict | Additional parameters for LLM generation | |
| vector_db_storage_cls_kwargs | dict | Additional parameters for the vector database, such as the threshold for node and relation retrieval | cosine_better_than_threshold: 0.2 (default value changed by env var COSINE_THRESHOLD) |
| enable_llm_cache | bool | If True, stores LLM results in cache; repeated prompts return cached responses | True |
| enable_llm_cache_for_entity_extract | bool | If True, stores LLM results in cache for entity extraction; useful for beginners debugging an application | True |
| addon_params | dict | Additional parameters, e.g., {"language": "Simplified Chinese", "entity_types": ["organization", "person", "location", "event"]}: sets example limit, entity/relation extraction output language | language: English |
| embedding_cache_config | dict | Configuration for question-answer caching. Contains three parameters. enabled: Boolean value to enable/disable cache lookup; when enabled, the system checks cached responses before generating new answers. similarity_threshold: Float value (0-1); when a new question's similarity with a cached question exceeds this threshold, the cached answer is returned directly without calling the LLM. use_llm_check: Boolean value to enable/disable LLM similarity verification; when enabled, the LLM is used as a secondary check on question similarity before a cached answer is returned. | {"enabled": False, "similarity_threshold": 0.95, "use_llm_check": False} |
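
To make the interaction between chunk_token_size and chunk_overlap_token_size concrete, here is a minimal sliding-window sketch over a list of token ids. It assumes a plain fixed-step window; `chunk_tokens` is a hypothetical helper, and LightRAG's own chunker may differ in details such as character-based splitting.

```python
def chunk_tokens(tokens, chunk_token_size=1200, chunk_overlap_token_size=100):
    """Split a token list into windows of at most chunk_token_size tokens.

    Consecutive windows share chunk_overlap_token_size tokens.
    """
    if chunk_overlap_token_size >= chunk_token_size:
        raise ValueError("overlap must be smaller than the chunk size")
    # Each new window starts this many tokens after the previous one
    step = chunk_token_size - chunk_overlap_token_size
    return [tokens[start:start + chunk_token_size] for start in range(0, len(tokens), step)]


tokens = list(range(2500))  # pretend token ids
chunks = chunk_tokens(tokens)
print([(c[0], c[-1], len(c)) for c in chunks])
# [(0, 1199, 1200), (1100, 2299, 1200), (2200, 2499, 300)]
```

With the defaults, a 2500-token document yields three chunks, and each chunk repeats the last 100 tokens of its predecessor.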

QueryParam

Use QueryParam to control the behavior of your query:

class QueryParam:
    """Configuration parameters for query execution in LightRAG."""

    mode: Literal["local", "global", "hybrid", "naive", "mix", "bypass"] = "global"
    """Specifies the retrieval mode:
    - "local": Focuses on context-dependent information.
    - "global": Utilizes global knowledge.
    - "hybrid": Combines local and global retrieval methods.
    - "naive": Performs a basic search without advanced techniques.
    - "mix": Integrates knowledge graph and vector retrieval.
    """

    only_need_context: bool = False
    """If True, only returns the retrieved context without generating a response."""

    only_need_prompt: bool = False
    """If True, only returns the generated prompt without producing a response."""

    response_type: str = "Multiple Paragraphs"
    """Defines the response format. Examples: 'Multiple Paragraphs', 'Single Paragraph', 'Bullet Points'."""

    stream: bool = False
    """If True, enables streaming output for real-time responses."""

    top_k: int = int(os.getenv("TOP_K", "60"))
    """Number of top items to retrieve. Represents entities in 'local' mode and relationships in 'global' mode."""

    chunk_top_k: int = int(os.getenv("CHUNK_TOP_K", "20"))
    """Number of text chunks to retrieve initially from vector search and keep after reranking.
    If None, defaults to top_k value.
    """

    max_entity_tokens: int = int(os.getenv("MAX_ENTITY_TOKENS", "6000"))
    """Maximum number of tokens allocated for entity context in unified token control system."""

    max_relation_tokens: int = int(os.getenv("MAX_RELATION_TOKENS", "8000"))
    """Maximum number of tokens allocated for relationship context in unified token control system."""

    max_total_tokens: int = int(os.getenv("MAX_TOTAL_TOKENS", "30000"))
    """Maximum total tokens budget for the entire query context (entities + relations + chunks + system prompt)."""

    # History messages are only sent to LLM for context, not used for retrieval
    conversation_history: list[dict[str, str]] = field(default_factory=list)
    """Stores past conversation history to maintain context.
    Format: [{"role": "user/assistant", "content": "message"}].
    """

    model_func: Callable[..., object] | None = None
    """Optional override for the LLM model function to use for this specific query.
    If provided, this will be used instead of the global model function.
    This allows using different models for different query modes.
    """

    user_prompt: str | None = None
    """User-provided prompt for the query.
    Additional instructions for the LLM. If provided, this will be injected into the prompt template.
    Its purpose is to let the user customize the way the LLM generates the response.
    """

    enable_rerank: bool = True
    """Enable reranking for retrieved text chunks. If True but no rerank model is configured, a warning will be issued.
    Default is True to enable reranking when rerank model is available.
    """

The default value of top_k can be changed by the environment variable TOP_K.
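
Because the default is written as `int(os.getenv("TOP_K", "60"))` in the class body, it is evaluated once, when the class is defined. The sketch below uses a stand-in dataclass (`QueryParamSketch`, not the real class) that copies only that one default expression, to show that the variable has to be set before the defining module is imported.

```python
import os
from dataclasses import dataclass

os.environ["TOP_K"] = "40"  # set before the class body runs


@dataclass
class QueryParamSketch:
    """Stand-in that copies only the top_k default expression shown above."""

    top_k: int = int(os.getenv("TOP_K", "60"))


os.environ["TOP_K"] = "10"  # too late: the default was already computed

default_param = QueryParamSketch()
explicit_param = QueryParamSketch(top_k=10)  # a per-query override still works
print(default_param.top_k, explicit_param.top_k)  # 40 10
```

In practice this suggests exporting TOP_K in the shell (or loading your .env file) before importing lightrag, and passing `top_k=` explicitly when a single query needs a different value.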

LLM and Embedding Injection

LightRAG requires LLM and Embedding models for document indexing and querying. During initialization, inject the relevant model functions into LightRAG.

Model Selection Requirements

  • LLM: at least 32B parameters and a context length of at least 32K tokens (64K recommended). Avoid reasoning models during indexing; use stronger models at query time.
  • Embedding: must be consistent across indexing and querying. Recommended: BAAI/bge-m3, text-embedding-3-large. Changing models requires clearing vector storage.
  • Reranker: significantly improves retrieval. When enabled, set query mode to mix. Recommended: BAAI/bge-reranker-v2-m3, Jina rerankers.
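
The requirement that the embedding model stay the same between indexing and querying can be enforced by the application itself. The following is a sketch under stated assumptions: `check_embedding_consistency` and the marker file name `embedding_meta.json` are hypothetical and not part of LightRAG; the helper simply records the model on first use and refuses a different one afterwards.

```python
import json
import tempfile
from pathlib import Path


def check_embedding_consistency(working_dir, model_name, embedding_dim,
                                marker_name="embedding_meta.json"):
    """Record the embedding model on first use and refuse a different one later."""
    marker = Path(working_dir) / marker_name
    current = {"model_name": model_name, "embedding_dim": embedding_dim}
    if not marker.exists():
        marker.write_text(json.dumps(current))
        return
    recorded = json.loads(marker.read_text())
    if recorded != current:
        raise ValueError(
            f"Index was built with {recorded}, but {current} was requested; "
            "clear the vector storage before switching embedding models."
        )


mismatch_message = None
with tempfile.TemporaryDirectory() as working_dir:
    check_embedding_consistency(working_dir, "BAAI/bge-m3", 1024)
    check_embedding_consistency(working_dir, "BAAI/bge-m3", 1024)  # same model: accepted
    try:
        check_embedding_consistency(working_dir, "text-embedding-3-large", 3072)
    except ValueError as err:
        mismatch_message = str(err)
        print(mismatch_message)
```

Calling such a check before constructing the LightRAG instance turns a silent retrieval-quality problem into an explicit error.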

Using OpenAI-like APIs

LightRAG supports OpenAI-like chat/embeddings APIs:

import os
import numpy as np
from lightrag.utils import wrap_embedding_func_with_attrs
from lightrag.llm.openai import openai_complete_if_cache, openai_embed

async def llm_model_func(
    prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs
) -> str:
    return await openai_complete_if_cache(
        "solar-mini",
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key=os.getenv("UPSTAGE_API_KEY"),
        base_url="https://api.upstage.ai/v1/solar",
        **kwargs
    )

@wrap_embedding_func_with_attrs(embedding_dim=4096, max_token_size=8192, model_name="solar-embedding-1-large-query")
async def embedding_func(texts: list[str]) -> np.ndarray:
    return await openai_embed.func(
        texts,
        model="solar-embedding-1-large-query",
        api_key=os.getenv("UPSTAGE_API_KEY"),
        base_url="https://api.upstage.ai/v1/solar"
    )

async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        llm_model_func=llm_model_func,
        embedding_func=embedding_func  # Pass the decorated function directly
    )
    await rag.initialize_storages()
    return rag

Important Note on Embedding Function Wrapping:

EmbeddingFunc cannot be nested. Functions decorated with @wrap_embedding_func_with_attrs (such as openai_embed, ollama_embed, etc.) cannot be wrapped again using EmbeddingFunc(). This is why we call xxx_embed.func (the underlying unwrapped function) instead of xxx_embed directly when creating custom embedding functions.
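
The reason for calling `.func` can be illustrated with a simplified stand-in. `EmbeddingFuncSketch` and `wrap_with_attrs` below are hypothetical and much smaller than the real EmbeddingFunc and decorator; the only assumption carried over is that the wrapper holds metadata plus the raw function and checks results against its own metadata.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class EmbeddingFuncSketch:
    """Simplified stand-in for EmbeddingFunc: metadata plus the raw function."""

    embedding_dim: int
    func: Callable[..., Awaitable[list]]

    async def __call__(self, texts, **kwargs):
        vectors = await self.func(texts, **kwargs)
        # The wrapper validates the output against its own metadata
        if any(len(v) != self.embedding_dim for v in vectors):
            raise ValueError("unexpected embedding dimension")
        return vectors


def wrap_with_attrs(embedding_dim):
    """Stand-in decorator that turns a function into a wrapper object."""
    def decorator(func):
        return EmbeddingFuncSketch(embedding_dim=embedding_dim, func=func)
    return decorator


@wrap_with_attrs(embedding_dim=4)
async def provider_embed(texts, dim=4):
    # Toy embedding: every component is the text length
    return [[float(len(t))] * dim for t in texts]


# Custom function: call the raw provider function via .func, then wrap exactly once
@wrap_with_attrs(embedding_dim=2)
async def custom_embed(texts):
    return await provider_embed.func(texts, dim=2)


vectors = asyncio.run(custom_embed(["ab", "abc"]))
print(vectors)  # [[2.0, 2.0], [3.0, 3.0]]
```

Had `custom_embed` called `provider_embed(...)` instead of `provider_embed.func(...)`, the inner wrapper would have applied its own 4-dimensional metadata to a 2-dimensional result and raised an error.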

Using Hugging Face Models

See lightrag_hf_demo.py

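from functools import partial
from transformers import AutoTokenizer, AutoModel
from lightrag import LightRAG
from lightrag.llm.hf import hf_model_complete, hf_embed
from lightrag.utils import EmbeddingFunc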

# Pre-load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")
embed_model = AutoModel.from_pretrained("sentence-transformers/all-MiniLM-L6-v2")

# Initialize LightRAG with Hugging Face model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=hf_model_complete,  # Use Hugging Face model for text generation
    llm_model_name='meta-llama/Llama-3.1-8B-Instruct',  # Model name from Hugging Face
    # Use Hugging Face embedding function
    embedding_func=EmbeddingFunc(
        embedding_dim=384,
        max_token_size=2048,
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        func=partial(
            hf_embed.func,  # Use .func to access the unwrapped function
            tokenizer=tokenizer,
            embed_model=embed_model
        )
    ),
)

Using Ollama Models

Pull the model you plan to use and an embedding model, for example nomic-embed-text:

import numpy as np
from lightrag.utils import wrap_embedding_func_with_attrs
from lightrag.llm.ollama import ollama_model_complete, ollama_embed

@wrap_embedding_func_with_attrs(embedding_dim=768, max_token_size=8192, model_name="nomic-embed-text")
async def embedding_func(texts: list[str]) -> np.ndarray:
    return await ollama_embed.func(texts, embed_model="nomic-embed-text")

# Initialize LightRAG with Ollama model
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,
    llm_model_name='your_model_name',
    embedding_func=embedding_func,
)

Increasing context size

LightRAG requires at least 32k context tokens. Ollama defaults to 8k. Two approaches:

Approach 1: Edit Modelfile

ollama pull qwen2
ollama show --modelfile qwen2 > Modelfile
# Add this line to Modelfile:
# PARAMETER num_ctx 32768
ollama create -f Modelfile qwen2m

Approach 2: Set num_ctx via llm_model_kwargs

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,
    llm_model_name='your_model_name',
    llm_model_kwargs={"options": {"num_ctx": 32768}},
    embedding_func=embedding_func,
)

Important Note on Embedding Function Wrapping:

EmbeddingFunc cannot be nested. Use xxx_embed.func to access the underlying unwrapped function.

Low RAM GPUs

For low-RAM GPUs (e.g. 6GB), select a small model and tune the context window. For example, gemma2:2b with num_ctx=26000 can find ~197 entities and 19 relations on book.txt.

LlamaIndex

LightRAG supports integration with LlamaIndex (llm/llama_index_impl.py):

import asyncio
from functools import partial
from lightrag import LightRAG
from lightrag.llm.llama_index_impl import llama_index_complete_if_cache, llama_index_embed
from lightrag.utils import EmbeddingFunc, setup_logger
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI

setup_logger("lightrag", level="INFO")

# Assumption: an OpenAI embedding model that returns 1536-dimensional vectors
EMBEDDING_MODEL = "text-embedding-3-small"
embed_model = OpenAIEmbedding(model=EMBEDDING_MODEL)

async def initialize_rag():
    rag = LightRAG(
        working_dir="your/path",
        llm_model_func=llama_index_complete_if_cache,
        embedding_func=EmbeddingFunc(
            embedding_dim=1536,
            max_token_size=2048,
            model_name=EMBEDDING_MODEL,
            func=partial(llama_index_embed.func, embed_model=embed_model)
        ),
    )
    await rag.initialize_storages()
    return rag

Using Azure OpenAI Models

import os
import numpy as np
from lightrag.utils import wrap_embedding_func_with_attrs
from lightrag.llm.azure_openai import azure_openai_complete_if_cache, azure_openai_embed

async def llm_model_func(
    prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs
) -> str:
    return await azure_openai_complete_if_cache(
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        deployment_name=os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"),
        **kwargs
    )

@wrap_embedding_func_with_attrs(
    embedding_dim=1536,
    max_token_size=8192,
    model_name=os.getenv("AZURE_OPENAI_EMBEDDING_MODEL")
)
async def embedding_func(texts: list[str]) -> np.ndarray:
    return await azure_openai_embed.func(
        texts,
        api_key=os.getenv("AZURE_OPENAI_API_KEY"),
        azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
        api_version=os.getenv("AZURE_OPENAI_API_VERSION"),
        deployment_name=os.getenv("AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME")
    )

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=embedding_func
)

Using Google Gemini Models

import os
import numpy as np
from lightrag.utils import wrap_embedding_func_with_attrs
from lightrag.llm.gemini import gemini_model_complete, gemini_embed

async def llm_model_func(
    prompt, system_prompt=None, history_messages=[], keyword_extraction=False, **kwargs
) -> str:
    return await gemini_model_complete(
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages,
        api_key=os.getenv("GEMINI_API_KEY"),
        model_name="gemini-2.0-flash",
        **kwargs
    )

@wrap_embedding_func_with_attrs(
    embedding_dim=768,
    max_token_size=2048,
    model_name="models/text-embedding-004"
)
async def embedding_func(texts: list[str]) -> np.ndarray:
    return await gemini_embed.func(
        texts,
        api_key=os.getenv("GEMINI_API_KEY"),
        model="models/text-embedding-004"
    )

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    llm_model_name="gemini-2.0-flash",
    embedding_func=embedding_func
)

Rerank Function Injection

To enhance retrieval quality, documents can be re-ranked based on a more effective relevance scoring model. The rerank.py file provides three Reranker provider driver functions:

  • Cohere / vLLM: cohere_rerank
  • Jina AI: jina_rerank
  • Aliyun: ali_rerank

Inject one of these functions into the rerank_model_func attribute of the LightRAG object. For detailed usage, refer to examples/rerank_example.py.
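
As a rough illustration of what a rerank step does, the toy function below re-orders candidate passages by a relevance score. Everything in it is an assumption made for the sake of a runnable example: `toy_rerank`, its arguments, and its return shape are hypothetical and are not the interface LightRAG expects from rerank_model_func, and the word-overlap score stands in for the model call a real provider would make.

```python
def toy_rerank(query, documents, top_n=None):
    """Score documents by word overlap with the query, highest first."""
    query_words = set(query.lower().split())
    scored = []
    for index, text in enumerate(documents):
        words = set(text.lower().split())
        # Fraction of query words that also occur in the document
        score = len(query_words & words) / len(query_words) if query_words else 0.0
        scored.append({"index": index, "relevance_score": score})
    scored.sort(key=lambda item: item["relevance_score"], reverse=True)
    return scored[:top_n] if top_n is not None else scored


docs = [
    "Scrooge counts his money alone",
    "The ghost of Marley visits Scrooge",
    "A recipe for plum pudding",
]
ranked = toy_rerank("who visits Scrooge", docs, top_n=2)
print([docs[item["index"]] for item in ranked])
# ['The ghost of Marley visits Scrooge', 'Scrooge counts his money alone']
```

For the actual function signatures and return format, examples/rerank_example.py remains the reference.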

User Prompt vs. Query

When using LightRAG for content queries, avoid combining the search process with unrelated output processing, as this significantly impacts query effectiveness. The user_prompt parameter in QueryParam does not participate in the RAG retrieval phase — it guides the LLM on how to process the retrieved results after the query is completed.

query_param = QueryParam(
    mode="hybrid",
    user_prompt="For diagrams, use mermaid format with English/Pinyin node names and Chinese display labels",
)

response_default = rag.query(
    "Please draw a character relationship diagram for Scrooge",
    param=query_param
)
print(response_default)
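
The separation described above can be pictured with a toy pipeline in which retrieval sees only the query and the user prompt is attached afterwards. The names `retrieve` and `build_generation_prompt`, and the prompt layout, are hypothetical; LightRAG's real prompt template is different.

```python
def retrieve(query, corpus):
    """Toy retrieval: keep passages that share a word with the query."""
    words = set(query.lower().split())
    return [p for p in corpus if words & set(p.lower().split())]


def build_generation_prompt(query, context, user_prompt=None):
    """Assemble the text sent to the LLM; user_prompt is added at this stage only."""
    parts = ["Context:", *context, f"Question: {query}"]
    if user_prompt:
        parts.append(f"Additional instructions: {user_prompt}")
    return "\n".join(parts)


corpus = ["Scrooge employs Bob Cratchit", "Plum pudding is a dessert"]
query = "Scrooge relationship diagram"
context = retrieve(query, corpus)  # user_prompt plays no part here
prompt = build_generation_prompt(query, context, user_prompt="Use mermaid format")
print(prompt)
```

Keeping formatting instructions out of the query string means they cannot distort which passages are retrieved.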

Storage Backends

Storage Types

LightRAG uses 4 types of storage for different purposes:

| Storage Type | Purpose |
| --- | --- |
| KV_STORAGE | LLM response cache, text chunks, document information |
| VECTOR_STORAGE | Entity/relation/chunk embedding vectors |
| GRAPH_STORAGE | Entity-relation graph structure |
| DOC_STATUS_STORAGE | Document indexing status |

Supported Implementations

KV_STORAGE

JsonKVStorage        JsonFile (default)
PGKVStorage          Postgres
RedisKVStorage       Redis
MongoKVStorage       MongoDB
OpenSearchKVStorage  OpenSearch

GRAPH_STORAGE

NetworkXStorage          NetworkX (default)
Neo4JStorage             Neo4J
PGGraphStorage           PostgreSQL with AGE plugin
MemgraphStorage          Memgraph
OpenSearchGraphStorage   OpenSearch

Testing has shown that Neo4J delivers superior performance in production environments compared to PostgreSQL with AGE plugin.

VECTOR_STORAGE

NanoVectorDBStorage         NanoVector (default)
PGVectorStorage             Postgres
MilvusVectorDBStorage       Milvus
FaissVectorDBStorage        Faiss
QdrantVectorDBStorage       Qdrant
MongoVectorDBStorage        MongoDB
OpenSearchVectorDBStorage   OpenSearch

DOC_STATUS_STORAGE

JsonDocStatusStorage        JsonFile (default)
PGDocStatusStorage          Postgres
MongoDocStatusStorage       MongoDB
OpenSearchDocStatusStorage  OpenSearch

Example connection configurations for each storage type can be found in the repository's env.example file. The database instance in the connection string must be created beforehand — LightRAG only creates tables within the instance, not the instance itself.
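
Since storage backends are selected by class name as a string, a misspelled name is an easy mistake. The sketch below checks a selection against the implementation lists in this section before any instance is built. `resolve_storages` is a hypothetical helper, and treating the first entry of each list as the default simply mirrors the "(default)" markers above.

```python
SUPPORTED_STORAGES = {
    "kv_storage": ["JsonKVStorage", "PGKVStorage", "RedisKVStorage",
                   "MongoKVStorage", "OpenSearchKVStorage"],
    "graph_storage": ["NetworkXStorage", "Neo4JStorage", "PGGraphStorage",
                      "MemgraphStorage", "OpenSearchGraphStorage"],
    "vector_storage": ["NanoVectorDBStorage", "PGVectorStorage", "MilvusVectorDBStorage",
                       "FaissVectorDBStorage", "QdrantVectorDBStorage",
                       "MongoVectorDBStorage", "OpenSearchVectorDBStorage"],
    "doc_status_storage": ["JsonDocStatusStorage", "PGDocStatusStorage",
                           "MongoDocStatusStorage", "OpenSearchDocStatusStorage"],
}


def resolve_storages(**overrides):
    """Return a full storage selection, validating every overridden name."""
    # The first entry of each list is the documented default
    config = {kind: names[0] for kind, names in SUPPORTED_STORAGES.items()}
    for kind, name in overrides.items():
        if kind not in SUPPORTED_STORAGES:
            raise KeyError(f"unknown storage parameter: {kind}")
        if name not in SUPPORTED_STORAGES[kind]:
            raise ValueError(f"{name!r} is not a listed {kind} implementation")
        config[kind] = name
    return config


config = resolve_storages(graph_storage="Neo4JStorage")
print(config)
```

The resulting dictionary can be passed on as keyword arguments, for example `LightRAG(working_dir=..., **config)`, alongside your model functions.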

Backend-Specific Setup

Using Neo4J Storage

For production level scenarios you will most likely want to leverage an enterprise solution for KG storage. Running Neo4J in Docker is recommended for seamless local testing. See: https://hub.docker.com/_/neo4j

export NEO4J_URI="neo4j://localhost:7687"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="password"
export NEO4J_DATABASE="neo4j"  # Required for community edition

from lightrag.utils import setup_logger

setup_logger("lightrag", level="INFO")

async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        llm_model_func=gpt_4o_mini_complete,
        graph_storage="Neo4JStorage",
    )
    await rag.initialize_storages()
    return rag

See test_neo4j.py for a working example.

Using PostgreSQL Storage

PostgreSQL can provide a one-stop solution as KV store, VectorDB (pgvector), and GraphDB (Apache AGE). PostgreSQL version 16.6 or higher is supported.

Using Faiss Storage

Before using Faiss, manually install faiss-cpu or faiss-gpu:

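pip install faiss-cpu

import numpy as np
from sentence_transformers import SentenceTransformer

# Load the model once rather than on every embedding call
model = SentenceTransformer('all-MiniLM-L6-v2')

async def embedding_func(texts: list[str]) -> np.ndarray:
    return model.encode(texts, convert_to_numpy=True)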

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=384,
        max_token_size=2048,
        model_name="all-MiniLM-L6-v2",
        func=embedding_func,
    ),
    vector_storage="FaissVectorDBStorage",
    vector_db_storage_cls_kwargs={
        "cosine_better_than_threshold": 0.3
    }
)

Using Memgraph for Storage

Memgraph is a high-performance, in-memory graph database compatible with the Neo4j Bolt protocol. See: https://memgraph.com/download

export MEMGRAPH_URI="bolt://localhost:7687"

async def initialize_rag():
    rag = LightRAG(
        working_dir=WORKING_DIR,
        llm_model_func=gpt_4o_mini_complete,
        graph_storage="MemgraphStorage",
    )
    await rag.initialize_storages()
    return rag

Using Milvus for Vector Storage

Milvus is a high-performance, scalable vector database for production-level vector storage. For full configuration options including index types (HNSW, HNSW_SQ, IVF, DISKANN, etc.) and metric types, see docs/MilvusConfigurationGuide.md.

Quick setup via environment variables:

MILVUS_URI=http://localhost:19530
MILVUS_DB_NAME=lightrag
LIGHTRAG_VECTOR_STORAGE=MilvusVectorDBStorage

Quick setup via Python SDK:

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=...,
    embedding_func=...,
    vector_storage="MilvusVectorDBStorage",
    vector_db_storage_cls_kwargs={
        "milvus_uri": "http://localhost:19530",
        "milvus_db_name": "lightrag",
        "cosine_better_than_threshold": 0.2,
    },
)

Using MongoDB Storage

MongoDB provides a one-stop storage solution for LightRAG with native KV storage and vector storage. LightRAG uses MongoDB collections to implement a simple graph storage.

MongoVectorDBStorage requires a MongoDB deployment with Atlas Search / Vector Search support (e.g., MongoDB Atlas or Atlas local). The setup wizard's bundled local Docker MongoDB service is MongoDB Community Edition — it can be used for KV/graph/doc-status storage but not for MongoVectorDBStorage.

Using Redis Storage

LightRAG supports Redis as KV storage. Configure persistence and memory usage carefully. Recommended Redis configuration:

save 900 1
save 300 10
save 60 1000
stop-writes-on-bgsave-error yes
maxmemory 4gb
maxmemory-policy noeviction
maxclients 500

When the interactive setup manages a local Redis container, it stages a user-editable config at ./data/config/redis.conf and mounts it into the container. Setup preserves that file on reruns so local Redis tuning can be adjusted without losing manual edits.

Using OpenSearch Storage

OpenSearch provides a unified storage solution for all four LightRAG storage types (KV, Vector, Graph, DocStatus). It offers native k-NN vector search, full-text search, and horizontal scalability without cloud-only restrictions.

Requirements: OpenSearch 3.x or higher with k-NN plugin enabled.

Install with Docker (without plugins):

docker run -d -p 9200:9200 -e "discovery.type=single-node" \
  -e "OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password>" \
  opensearchproject/opensearch:latest

Install with Docker Compose (Recommended, with plugins):

curl -O https://raw.githubusercontent.com/opensearch-project/opensearch-build/main/docker/release/dockercomposefiles/docker-compose-3.x.yml
OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password> docker-compose -f docker-compose-3.x.yml up -d

Configuration (see env.example for full list):

export OPENSEARCH_HOSTS=localhost:9200
export OPENSEARCH_USER=admin
export OPENSEARCH_PASSWORD=<custom-admin-password>
export OPENSEARCH_USE_SSL=true
export OPENSEARCH_VERIFY_CERTS=false

Usage:

rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=your_llm_func,
    embedding_func=your_embed_func,
    kv_storage="OpenSearchKVStorage",
    doc_status_storage="OpenSearchDocStatusStorage",
    graph_storage="OpenSearchGraphStorage",
    vector_storage="OpenSearchVectorDBStorage",
)

Graph Traversal: When the OpenSearch SQL plugin with PPL support is available, graph queries use server-side BFS via the graphlookup command for optimal performance. Otherwise, it falls back to client-side batched BFS. Auto-detected at startup, or force via OPENSEARCH_USE_PPL_GRAPHLOOKUP=true|false.
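
The client-side fallback mentioned above can be pictured as a level-by-level traversal that requests neighbours for several frontier nodes per round trip. The sketch below is a generic batched BFS over an in-memory graph, not the OpenSearch storage code: `batched_bfs`, `fetch_neighbors`, and `batch_size` are hypothetical names.

```python
def batched_bfs(start, fetch_neighbors, max_depth, batch_size=2):
    """Breadth-first traversal that asks for neighbours in batches per level.

    fetch_neighbors(node_ids) -> dict[node_id, list[node_id]] stands in for
    one round trip to the graph store.
    """
    visited = {start}
    frontier = [start]
    for _ in range(max_depth):
        next_frontier = []
        for i in range(0, len(frontier), batch_size):
            batch = frontier[i:i + batch_size]
            for neighbors in fetch_neighbors(batch).values():
                for neighbor in neighbors:
                    if neighbor not in visited:
                        visited.add(neighbor)
                        next_frontier.append(neighbor)
        if not next_frontier:
            break  # nothing new was discovered at this depth
        frontier = next_frontier
    return visited


GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
calls = []


def fetch_neighbors(node_ids):
    calls.append(list(node_ids))  # record each simulated round trip
    return {n: GRAPH.get(n, []) for n in node_ids}


reachable = batched_bfs("A", fetch_neighbors, max_depth=2)
print(sorted(reachable), calls)
# ['A', 'B', 'C', 'D'] [['A'], ['B', 'C']]
```

Two hops from A take two round trips here rather than one per node, which is the saving batching is meant to provide; a server-side traversal avoids the round trips altogether.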

Integration Testing:

  1. Start OpenSearch using Docker Compose:
OPENSEARCH_INITIAL_ADMIN_PASSWORD=<custom-admin-password> docker-compose -f docker-compose-3.x.yml up -d
  2. Verify the cluster is running:
curl -sk -u admin:<custom-admin-password> https://localhost:9200
curl -sk -u admin:<custom-admin-password> https://localhost:9200/_cat/plugins?v
  3. Run unit tests (no OpenSearch required; uses mocks):
python -m pytest tests/test_opensearch_storage.py -v
  4. Run the OpenSearch storage demo:
export OPENSEARCH_HOSTS=localhost:9200
export OPENSEARCH_USER=admin
export OPENSEARCH_PASSWORD=<custom-admin-password>
export OPENSEARCH_USE_SSL=true
export OPENSEARCH_VERIFY_CERTS=false
python examples/opensearch_storage_demo.py
  5. Run the full OpenAI + OpenSearch demo (requires OPENAI_API_KEY):
export OPENAI_API_KEY=your-api-key
python examples/lightrag_openai_opensearch_graph_demo.py
  6. Visualize the knowledge graph via LightRAG WebUI:
LIGHTRAG_KV_STORAGE=OpenSearchKVStorage \
LIGHTRAG_DOC_STATUS_STORAGE=OpenSearchDocStatusStorage \
LIGHTRAG_GRAPH_STORAGE=OpenSearchGraphStorage \
LIGHTRAG_VECTOR_STORAGE=OpenSearchVectorDBStorage \
LLM_BINDING=openai \
EMBEDDING_BINDING=openai \
EMBEDDING_MODEL=text-embedding-3-large \
EMBEDDING_DIM=3072 \
OPENAI_API_KEY=your-api-key \
lightrag-server

Data Isolation Between LightRAG Instances

The workspace parameter ensures data isolation between different LightRAG instances. Once initialized, the workspace is immutable.

| Storage Type | Isolation Method |
| --- | --- |
| JsonKVStorage, JsonDocStatusStorage, NetworkXStorage, NanoVectorDBStorage, FaissVectorDBStorage | Workspace subdirectories |
| RedisKVStorage, MilvusVectorDBStorage, MongoKVStorage, MongoVectorDBStorage, MongoGraphStorage, PGGraphStorage | Workspace prefix on collection name |
| QdrantVectorDBStorage | Payload-based partitioning (Qdrant multitenancy) |
| PGKVStorage, PGVectorStorage, PGDocStatusStorage | workspace field in tables |
| Neo4JStorage | Labels |
| OpenSearch* | Index name prefixes |

Legacy compatibility: Default workspace for PostgreSQL non-graph storage is default; for PostgreSQL AGE graph storage is null; for Neo4j graph storage is base.

Storage-specific workspace environment variables override the common WORKSPACE variable: REDIS_WORKSPACE, MILVUS_WORKSPACE, QDRANT_WORKSPACE, MONGODB_WORKSPACE, POSTGRES_WORKSPACE, NEO4J_WORKSPACE, OPENSEARCH_WORKSPACE.

For a practical demonstration of managing multiple isolated knowledge bases, see Workspace Demo.
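
The precedence between the common WORKSPACE variable and the storage-specific ones can be summarised in a few lines. This is a sketch of the rule as stated above, not LightRAG's code: `effective_workspace` is a hypothetical helper, it covers environment variables only, and treating an empty value as unset is an assumption.

```python
def effective_workspace(backend, env):
    """Return the workspace for a backend: specific variable first, then WORKSPACE."""
    specific = env.get(f"{backend}_WORKSPACE")
    if specific:  # assumption: an empty string counts as "not set"
        return specific
    return env.get("WORKSPACE", "")


env = {"WORKSPACE": "team_a", "REDIS_WORKSPACE": "cache_a"}
print(effective_workspace("REDIS", env))   # cache_a
print(effective_workspace("MILVUS", env))  # team_a
```

In a real process you would pass `os.environ` as `env`; a plain dictionary is used here so the example has no side effects.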

Insert

  • Basic Insert
rag.insert("Text")
  • Batch Insert
# Basic Batch Insert
rag.insert(["TEXT1", "TEXT2", ...])

# Batch Insert with custom batch size
rag = LightRAG(
    ...,  # other parameters
    working_dir=WORKING_DIR,
    max_parallel_insert=4
)
rag.insert(["TEXT1", "TEXT2", "TEXT3", ...])  # Processed in batches of 4

The max_parallel_insert parameter determines the number of documents processed concurrently. Default is 2. Recommended to keep below 10, as the bottleneck typically lies with the LLM.

  • Insert with ID

The number of documents and IDs must be the same.

# Single text with ID
rag.insert("TEXT1", ids=["ID_FOR_TEXT1"])

# Multiple texts with IDs
rag.insert(["TEXT1", "TEXT2", ...], ids=["ID_FOR_TEXT1", "ID_FOR_TEXT2"])
  • Insert using Pipeline

apipeline_enqueue_documents and apipeline_process_enqueue_documents allow incremental insertion of documents in the background while the main thread continues executing.

rag = LightRAG(...)
await rag.apipeline_enqueue_documents(input)
# Your routine in loop
await rag.apipeline_process_enqueue_documents()  # Processes the documents queued above
  • Insert Multi-file Type Support

The textract library supports reading TXT, DOCX, PPTX, CSV, and PDF:

import textract

file_path = 'TEXT.pdf'
text_content = textract.process(file_path)
rag.insert(text_content.decode('utf-8'))
  • Citation Functionality

By providing file paths, the system ensures sources can be traced back to their original documents:

documents = ["Document content 1", "Document content 2"]
file_paths = ["path/to/doc1.txt", "path/to/doc2.txt"]

rag.insert(documents, file_paths=file_paths)
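
The effect of max_parallel_insert from the Batch Insert item can be pictured as a concurrency cap on per-document work. The sketch below uses asyncio.Semaphore to impose such a cap on a stand-in task; `process_all` is hypothetical, the sleep stands in for LLM-bound indexing, and LightRAG's own scheduling may be organised differently.

```python
import asyncio


async def process_all(documents, max_parallel_insert=2):
    """Process documents with at most max_parallel_insert running at once."""
    semaphore = asyncio.Semaphore(max_parallel_insert)
    running = 0
    peak = 0

    async def process(doc):
        nonlocal running, peak
        async with semaphore:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stands in for LLM-bound indexing work
            running -= 1
            return doc.upper()

    # gather preserves input order regardless of completion order
    results = await asyncio.gather(*(process(d) for d in documents))
    return results, peak


results, peak = asyncio.run(process_all(["a", "b", "c", "d", "e"], max_parallel_insert=2))
print(results, peak)  # ['A', 'B', 'C', 'D', 'E'] 2
```

Raising the cap only helps while the LLM endpoint can absorb the extra requests, which is why the text recommends keeping it below 10.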

Edit Entities and Relations

LightRAG supports comprehensive knowledge graph management: create, edit, and delete entities and relationships.

  • Create Entities and Relations
# Create entity
entity = rag.create_entity("Google", {
    "description": "Google is a multinational technology company specializing in internet-related services and products.",
    "entity_type": "company"
})

product = rag.create_entity("Gmail", {
    "description": "Gmail is an email service developed by Google.",
    "entity_type": "product"
})

# Create relation
relation = rag.create_relation("Google", "Gmail", {
    "description": "Google develops and operates Gmail.",
    "keywords": "develops operates service",
    "weight": 2.0
})
  • Edit Entities and Relations
# Edit entity attributes
updated_entity = rag.edit_entity("Google", {
    "description": "Google is a subsidiary of Alphabet Inc., founded in 1998.",
    "entity_type": "tech_company"
})

# Rename entity (with all its relationships properly migrated)
renamed_entity = rag.edit_entity("Gmail", {
    "entity_name": "Google Mail",
    "description": "Google Mail (formerly Gmail) is an email service."
})

# Edit relation
updated_relation = rag.edit_relation("Google", "Google Mail", {
    "description": "Google created and maintains Google Mail service.",
    "keywords": "creates maintains email service",
    "weight": 3.0
})

All operations are available in both synchronous and asynchronous versions. Async versions have the prefix "a" (e.g., acreate_entity, aedit_relation).

  • Insert Custom KG
custom_kg = {
    "chunks": [
        {
            "content": "Alice and Bob are collaborating on quantum computing research.",
            "source_id": "doc-1",
            "file_path": "test_file",
        }
    ],
    "entities": [
        {
            "entity_name": "Alice",
            "entity_type": "person",
            "description": "Alice is a researcher specializing in quantum physics.",
            "source_id": "doc-1",
            "file_path": "test_file"
        },
        {
            "entity_name": "Bob",
            "entity_type": "person",
            "description": "Bob is a mathematician.",
            "source_id": "doc-1",
            "file_path": "test_file"
        },
        {
            "entity_name": "Quantum Computing",
            "entity_type": "technology",
            "description": "Quantum computing utilizes quantum mechanical phenomena for computation.",
            "source_id": "doc-1",
            "file_path": "test_file"
        }
    ],
    "relationships": [
        {
            "src_id": "Alice",
            "tgt_id": "Bob",
            "description": "Alice and Bob are research partners.",
            "keywords": "collaboration research",
            "weight": 1.0,
            "source_id": "doc-1",
            "file_path": "test_file"
        },
        {
            "src_id": "Alice",
            "tgt_id": "Quantum Computing",
            "description": "Alice conducts research on quantum computing.",
            "keywords": "research expertise",
            "weight": 1.0,
            "source_id": "doc-1",
            "file_path": "test_file"
        },
        {
            "src_id": "Bob",
            "tgt_id": "Quantum Computing",
            "description": "Bob researches quantum computing.",
            "keywords": "research application",
            "weight": 1.0,
            "source_id": "doc-1",
            "file_path": "test_file"
        }
    ]
}

rag.insert_custom_kg(custom_kg)
  • Other Entity and Relation Operations
    • create_entity: Creates a new entity with specified attributes
    • edit_entity: Updates an existing entity's attributes or renames it
    • create_relation: Creates a new relation between existing entities
    • edit_relation: Updates an existing relation's attributes

These operations maintain data consistency across both the graph database and vector database components.
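
Before passing a hand-written custom KG to `insert_custom_kg`, it can help to check that the dictionary is internally consistent. The helper below is a hypothetical pre-flight check, not part of LightRAG. It assumes the `chunks` / `entities` / `relationships` layout shown in the Insert Custom KG example and only flags likely mistakes. Whether LightRAG itself rejects such input is not stated here.

```python
def check_custom_kg(custom_kg):
    """Return a list of problems found in a custom KG dict (empty if none).

    Hypothetical helper: flags source_ids that match no chunk and
    relationships whose endpoints are not declared as entities.
    """
    problems = []
    chunk_ids = {chunk["source_id"] for chunk in custom_kg.get("chunks", [])}
    entity_names = {e["entity_name"] for e in custom_kg.get("entities", [])}

    for entity in custom_kg.get("entities", []):
        if entity.get("source_id") not in chunk_ids:
            problems.append(
                f"entity {entity['entity_name']!r}: "
                f"unknown source_id {entity.get('source_id')!r}"
            )

    for rel in custom_kg.get("relationships", []):
        label = f"relationship {rel['src_id']!r} -> {rel['tgt_id']!r}"
        for endpoint in (rel["src_id"], rel["tgt_id"]):
            if endpoint not in entity_names:
                problems.append(f"{label}: undeclared entity {endpoint!r}")
        if rel.get("source_id") not in chunk_ids:
            problems.append(
                f"{label}: unknown source_id {rel.get('source_id')!r}"
            )
    return problems
```

An empty return value means every `source_id` points at a listed chunk and every relationship endpoint is a listed entity.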

Delete Functions

LightRAG can delete individual entities, individual relations, and whole documents by document ID.

Delete Entities

# Synchronous
rag.delete_by_entity("Google")

# Asynchronous
await rag.adelete_by_entity("Google")

Deleting an entity:

  • Removes the entity node from the knowledge graph
  • Deletes all associated relationships
  • Removes related embedding vectors from the vector database
  • Maintains knowledge graph integrity

Delete Relations

# Synchronous
rag.delete_by_relation("Google", "Gmail")

# Asynchronous
await rag.adelete_by_relation("Google", "Gmail")

Deleting a relationship:

  • Removes the specified relationship edge
  • Deletes the relationship's embedding vector
  • Preserves both entity nodes and their other relationships
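
The difference between the two deletion behaviors described above can be illustrated with a toy in-memory model. This is not LightRAG's implementation. `ToyGraph` is a hypothetical class that treats edges as undirected and stores placeholder strings instead of real embedding vectors.

```python
class ToyGraph:
    """Toy model of a graph store plus a vector store (illustration only)."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()      # each edge is frozenset({src, tgt})
        self.vectors = {}       # node name or edge -> placeholder embedding

    def add_relation(self, src, tgt):
        for name in (src, tgt):
            self.nodes.add(name)
            self.vectors[name] = f"embedding of {name}"
        edge = frozenset((src, tgt))
        self.edges.add(edge)
        self.vectors[edge] = f"embedding of {src} <-> {tgt}"

    def delete_relation(self, src, tgt):
        # Only the edge and its vector go away; both nodes stay.
        edge = frozenset((src, tgt))
        self.edges.discard(edge)
        self.vectors.pop(edge, None)

    def delete_entity(self, name):
        # Cascade: drop every edge touching the node, then the node itself.
        for edge in [e for e in self.edges if name in e]:
            self.edges.discard(edge)
            self.vectors.pop(edge, None)
        self.nodes.discard(name)
        self.vectors.pop(name, None)
```

Deleting a relation leaves both endpoints in place, while deleting an entity also removes every edge that touches it, so no edge is left pointing at a missing node.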

Delete by Document ID

# Asynchronous only (complex reconstruction process)
await rag.adelete_by_doc_id("doc-12345")

The deletion process:

  1. Delete all text chunks related to the document
  2. Identify and delete entities/relationships that belong only to this document
  3. Rebuild entities/relationships that still exist in other documents
  4. Update all related vector indexes
  5. Clean up document status records
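
Steps 1 to 3 above can be sketched on plain dictionaries. This is a simplified model under stated assumptions, not LightRAG's code. `delete_document` is a hypothetical function, provenance is modeled as a set of chunk IDs per entity, and "rebuilding" is reduced to narrowing that set. Steps 4 and 5 (vector indexes and status records) are not modeled.

```python
def delete_document(doc_id, chunks, entities):
    """Remove one document from a toy store, in place.

    chunks:   {chunk_id: owning doc_id}
    entities: {entity_name: set of chunk_ids that mention it}
    Returns (deleted_entities, rebuilt_entities) as sorted lists.
    """
    # Step 1: delete all chunks that belong to the document.
    doomed = {cid for cid, owner in chunks.items() if owner == doc_id}
    for cid in doomed:
        del chunks[cid]

    deleted, rebuilt = [], []
    for name in list(entities):
        remaining = entities[name] - doomed
        if remaining == entities[name]:
            continue  # entity never appeared in this document
        if not remaining:
            # Step 2: the entity belonged only to this document.
            del entities[name]
            deleted.append(name)
        else:
            # Step 3: the entity survives; keep only the other sources.
            entities[name] = remaining
            rebuilt.append(name)
    return sorted(deleted), sorted(rebuilt)
```

The same rule applies to relationships: one whose sources all disappear is deleted, and one with remaining sources is rebuilt from them.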

Important Reminders:

  1. All deletion operations are irreversible, so use them with caution
  2. Deleting large amounts of data may take time, especially deletion by document ID
  3. Deletion operations automatically maintain consistency between the graph and vector databases
  4. Consider backing up data before performing important deletions
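
One way to take such a backup is to copy the working directory before deleting. The sketch below assumes file-based storage where all data lives in the working directory. External database backends would need their own backup tools. `backup_working_dir` and the `./rag_backups` default are hypothetical names chosen for this example.

```python
import shutil
import time
from pathlib import Path


def backup_working_dir(working_dir, backup_root="./rag_backups"):
    """Copy the working directory to a timestamped folder and return its path.

    Assumes no LightRAG process is writing to `working_dir` during the copy.
    """
    src = Path(working_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"working directory not found: {src}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / f"{src.name}-{stamp}"

    # copytree creates missing parent directories and fails if dest exists,
    # so an existing backup is never silently overwritten.
    shutil.copytree(src, dest)
    return dest
```

To restore, stop the application and copy the backup folder back over the working directory.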

Entity Merging

Merge Entities and Their Relationships

# Basic merge
rag.merge_entities(
    source_entities=["Artificial Intelligence", "AI", "Machine Intelligence"],
    target_entity="AI Technology"
)

# With custom merge strategy
rag.merge_entities(
    source_entities=["John Smith", "Dr. Smith", "J. Smith"],
    target_entity="John Smith",
    merge_strategy={
        "description": "concatenate",  # Combine all descriptions
        "entity_type": "keep_first",   # Keep the type from the first entity
        "source_id": "join_unique"     # Combine all unique source IDs
    }
)

# With custom target entity data
rag.merge_entities(
    source_entities=["New York", "NYC", "Big Apple"],
    target_entity="New York City",
    target_entity_data={
        "entity_type": "LOCATION",
        "description": "New York City is the most populous city in the United States.",
    }
)

# Advanced: combining both strategy and custom data
rag.merge_entities(
    source_entities=["Microsoft Corp", "Microsoft Corporation", "MSFT"],
    target_entity="Microsoft",
    merge_strategy={
        "description": "concatenate",
        "source_id": "join_unique"
    },
    target_entity_data={
        "entity_type": "ORGANIZATION",
    }
)

When merging entities:

  • All relationships from source entities are redirected to the target entity
  • Relationships that become duplicates after redirection are merged into one
  • Self-relationships (loops) are prevented
  • Source entities are removed after merging
  • Relationship weights and attributes are preserved
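
To make the strategy names used in `merge_strategy` concrete, here is a minimal sketch of how "concatenate", "keep_first", and "join_unique" could combine attribute values. It is an illustration, not LightRAG's implementation. `merge_attributes` is a hypothetical function, the `<SEP>` separator is an assumption, and letting `overrides` (standing in for `target_entity_data`) take precedence is also an assumption.

```python
def merge_attributes(records, strategy, overrides=None, sep="<SEP>"):
    """Merge a list of attribute dicts into one dict.

    strategy:  {field: "concatenate" | "keep_first" | "join_unique"}
               fields not listed fall back to "keep_first".
    overrides: values applied last (assumed role of target_entity_data).
    """
    # Collect field names in first-seen order.
    keys = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)

    merged = {}
    for key in keys:
        values = [r[key] for r in records if r.get(key)]
        if not values:
            continue
        how = strategy.get(key, "keep_first")
        if how == "concatenate":
            merged[key] = sep.join(values)
        elif how == "keep_first":
            merged[key] = values[0]
        elif how == "join_unique":
            unique = []
            for value in values:
                for part in value.split(sep):
                    if part not in unique:
                        unique.append(part)
            merged[key] = sep.join(unique)
        else:
            raise ValueError(f"unknown merge strategy: {how!r}")

    merged.update(overrides or {})
    return merged
```

With this model, two records whose `source_id` values are `c1<SEP>c2` and `c2<SEP>c3` merge under "join_unique" into `c1<SEP>c2<SEP>c3`, with the shared ID kept once.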

Troubleshooting

Common Initialization Errors

  1. AttributeError: __aenter__

    • Cause: Storage backends not initialized
    • Solution: Call await rag.initialize_storages() after creating the LightRAG instance
  2. KeyError: 'history_messages'

    • Cause: Pipeline status not initialized
    • Solution: Call await rag.initialize_storages() after creating the LightRAG instance
  3. Both errors in sequence

    • Solution: Always follow this pattern:
    rag = LightRAG(...)
    await rag.initialize_storages()
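
One way to make this pattern hard to forget is to wrap it in an async context manager, which also guarantees that `finalize_storages()` runs on exit. This is a sketch, not an API provided by LightRAG. `open_rag` and its `make_rag` argument (any zero-argument callable that returns a LightRAG instance) are hypothetical names.

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def open_rag(make_rag):
    """Create, initialize, and later finalize a LightRAG-like object.

    make_rag: zero-argument callable returning the instance (hypothetical).
    """
    rag = make_rag()
    # Required before any insert or query.
    await rag.initialize_storages()
    try:
        yield rag
    finally:
        # Runs even if the body raised, so storages are always released.
        await rag.finalize_storages()


# Usage inside a coroutine (constructor arguments elided, as above):
#     async with open_rag(lambda: LightRAG(...)) as rag:
#         await rag.ainsert("Your text")
```

If `initialize_storages()` itself fails, the exception propagates and `finalize_storages()` is not called, because nothing was successfully opened.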

Model Switching Issues

When you switch to a different embedding model, you must clear the data directory, because vectors stored by the previous model are not compatible with the new one. The only file you may want to keep is kv_store_llm_response_cache.json, which holds the LLM cache.
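
A small script can do this cleanup while keeping the LLM cache. The sketch below assumes the default file-based storage where everything lives in the working directory. `clear_working_dir` is a hypothetical helper, and it defaults to a dry run so nothing is deleted until you opt in.

```python
import shutil
from pathlib import Path

LLM_CACHE_FILE = "kv_store_llm_response_cache.json"


def clear_working_dir(working_dir, keep=(LLM_CACHE_FILE,), dry_run=True):
    """Remove everything in `working_dir` except the names listed in `keep`.

    Returns the names that were (or, in a dry run, would be) removed.
    """
    removed = []
    for entry in sorted(Path(working_dir).iterdir()):
        if entry.name in keep:
            continue
        removed.append(entry.name)
        if dry_run:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return removed
```

Run it once with the default `dry_run=True` to review the list, then again with `dry_run=False` to delete. The deletion cannot be undone.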