Merged
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -2,7 +2,7 @@ name: CI

 on:
   push:
-    branches: [ main, release/* ]
+    branches: [ main, release/*, v* ]
   pull_request:
     branches: [ main ]

19 changes: 17 additions & 2 deletions README.md
@@ -11,7 +11,7 @@
 <a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.8%2B-blue.svg" alt="Python 3.8+"/></a>
 <a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"/></a>
 <a href="https://github.com/yashdesai023/vectorDBpipe/actions"><img src="https://github.com/yashdesai023/vectorDBpipe/actions/workflows/ci.yml/badge.svg" alt="CI"/></a>
-<img src="https://img.shields.io/badge/version-0.2.0-brightgreen.svg" alt="Version 0.2.0"/>
+<img src="https://img.shields.io/badge/version-0.2.2-brightgreen.svg" alt="Version 0.2.2"/>
 <img src="https://img.shields.io/badge/tests-4%20passed-success.svg" alt="Tests 4 passed"/>
 <img src="https://img.shields.io/badge/PyPI-vectordbpipe-blueviolet.svg" alt="PyPI"/>
 </p>
@@ -749,7 +749,22 @@ Contributions are warmly welcomed! Please follow these steps:

 ## 📜 Changelog

-### v0.2.0 — Omni-RAG Architecture (February 2026) ⭐ Latest
+### v0.2.2 — Critical Hotfix Release (March 2026) ⭐ Latest
+
+> **Hotfix** — Resolves critical pipeline initialization and engine routing bugs affecting all users of `config_override`.
+
+**Fixed:**
+- **Embedder `'NoneType' object has no attribute 'tokenize'`** — `TextPipeline` was using the legacy `model.name` config key instead of the new `embedding.model_name`. This caused `SentenceTransformer(None)` to be created, crashing all ingestion and queries. `_safe_reinit` now completely bypasses legacy keys and reinitializes all providers from `embedding`, `database`, and `llm` config directly.
+- **LLM not initialized with `config_override`** — Added missing `sarvam`, `google`, and `cohere` LLM provider support to `_safe_reinit`. Sarvam users were silently getting `self.llm = None` even with a valid API key configured.
+- **Graph always empty (0 nodes) after ingestion** — Graph extraction was 100% LLM-gated with no fallback. Added `_regex_graph_extract()` that uses regex pattern matching to extract entity relationships (`X is Y`, `X has Y`, etc.) when no LLM is configured.
+- **Corrupted PDF crash (`FzErrorFormat`)** — `_load_pdf` now loads pages by index with per-page `try/except`, skipping broken pages gracefully instead of crashing the entire ingestion.
+- **Engine 2/3/4 returning "LLM not configured"** — All three engines now return useful, readable fallback content without an LLM. Engine 2 returns formatted PageIndex structure; Engine 3 returns filtered graph edges plus vector search; Engine 4 returns a helpful config snippet.
+- **Engine 3 returning irrelevant graph output** — GraphRAG now filters edges by query keywords, shows a clear `"No direct match"` note, and transparently supplements with vector search when the graph has no matching entities.
+- **`generate_response()` signature mismatch** — All engine calls now correctly pass the `retrieved_context` argument to the LLM provider interface.
+
+---
+
+### v0.2.0 — Omni-RAG Architecture (February 2026)

 > **Major Release** — Complete architectural overhaul introducing the 4-engine Omni-RAG stack.
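
Reviewer note on the v0.2.2 entry above: the body of `_regex_graph_extract()` is not part of this diff, so the following is only a minimal sketch of what a regex fallback for `X is Y` / `X has Y` relations, followed by keyword filtering of the resulting edges, might look like. The function names, the single pattern, and the crude "words longer than three characters" keyword rule are all assumptions made for illustration, not the project's actual implementation.

```python
import re
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Hypothetical pattern: one "<subject> is|has <object>" relation per sentence.
_RELATION_RE = re.compile(
    r"^\s*(?P<subj>.+?)\s+(?P<rel>is|has)\s+(?P<obj>.+?)\s*$",
    re.IGNORECASE,
)


def regex_graph_extract_sketch(text: str) -> List[Triple]:
    """Extract (subject, relation, object) triples without calling an LLM."""
    triples: List[Triple] = []
    # Naive sentence split on terminal punctuation and newlines.
    for sentence in re.split(r"[.!?\n]+", text):
        match = _RELATION_RE.match(sentence)
        if match:
            triples.append((match["subj"], match["rel"].lower(), match["obj"]))
    return triples


def _words(text: str) -> Set[str]:
    """Lowercased word set of a string."""
    return {w.lower() for w in re.findall(r"\w+", text)}


def filter_edges_sketch(edges: List[Triple], query: str) -> List[Triple]:
    """Keep edges whose subject or object shares a keyword with the query."""
    # Assumption: words of four or more characters count as keywords,
    # which is a crude stand-in for a real stopword list.
    keywords = {w for w in _words(query) if len(w) > 3}
    return [e for e in edges if keywords & (_words(e[0]) | _words(e[2]))]
```

An empty list from `filter_edges_sketch` corresponds to the "No direct match" case described in the changelog, where the caller would fall back to vector search.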

8 changes: 8 additions & 0 deletions demo.ipynb
@@ -328,7 +328,15 @@
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.7"
}
},
Binary file added demo.pdf
Binary file not shown.
4 changes: 2 additions & 2 deletions setup.py
@@ -3,13 +3,13 @@

 setup(
     name="vectordbpipe",
-    version="0.2.1",
+    version="0.2.2",
     author="Yash Desai",
     author_email="desaisyash1000@gmail.com",

     # ─── PyPI short description (appears in search results) ───────────────────
     description=(
-        "vectorDBpipe v0.2.1 — Enterprise Omni-RAG SDK. "
+        "vectorDBpipe v0.2.2 — Enterprise Omni-RAG SDK. "
         "Tri-Processing Ingestion + 4 AI Engines (Vector RAG, Vectorless RAG, "
         "GraphRAG, Structured JSON Extract) + 15+ data connectors. "
         "One pipeline. One API. Zero glue code."
Binary file modified vectorDBpipe/__pycache__/__init__.cpython-310.pyc
Binary file not shown.
Binary file modified vectorDBpipe/__pycache__/__init__.cpython-311.pyc
Binary file not shown.
Binary file modified vectorDBpipe/config/__pycache__/config_manager.cpython-310.pyc
Binary file not shown.
Binary file modified vectorDBpipe/config/__pycache__/config_manager.cpython-311.pyc
Binary file not shown.
Binary file modified vectorDBpipe/data/__pycache__/loader.cpython-310.pyc
Binary file not shown.
Binary file modified vectorDBpipe/data/__pycache__/loader.cpython-311.pyc
Binary file not shown.
15 changes: 13 additions & 2 deletions vectorDBpipe/data/loader.py
@@ -115,9 +115,20 @@ def _load_txt(self, path: str) -> str:
         with open(path, "r", encoding="utf-8", errors="ignore") as f: return f.read()

     def _load_pdf(self, path: str) -> str:
+        import logging
         text = ""
-        with fitz.open(path) as pdf:
-            for page in pdf: text += page.get_text("text")
+        try:
+            pdf = fitz.open(path)
+        except Exception as e:
+            logging.warning(f"DataLoader: Could not open PDF '{path}': {e}")
+            return text
+        for i in range(len(pdf)):
+            try:
+                page = pdf.load_page(i)
+                text += page.get_text("text")
+            except Exception as e:
+                logging.warning(f"DataLoader: Skipping corrupted page {i} in '{path}': {e}")
+        pdf.close()
         return text

     def _load_docx(self, path: str) -> str:
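
Reviewer note on the `_load_pdf` change above: the per-page isolation it introduces can be exercised without PyMuPDF or a corrupted PDF by passing the page loader in as a callable. The sketch below is a hypothetical stand-alone helper (the name `extract_text_skipping_failures` and its parameters are not part of the project) that shows only the "load by index, log and skip on failure" pattern.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def extract_text_skipping_failures(
    page_count: int,
    load_page_text: Callable[[int], str],
) -> str:
    """Concatenate text from pages 0..page_count-1, skipping pages that raise."""
    parts = []
    for index in range(page_count):
        try:
            parts.append(load_page_text(index))
        except Exception as exc:  # broad on purpose: one bad page must not abort the rest
            logger.warning("Skipping page %d: %s", index, exc)
    return "".join(parts)
```

In a unit test, `load_page_text` can be a fake that raises for one index, which checks that the remaining pages are still returned in order.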
Binary file modified vectorDBpipe/embeddings/__pycache__/embedder.cpython-311.pyc
Binary file not shown.
Binary file modified vectorDBpipe/pipeline/__pycache__/text_pipeline.cpython-310.pyc
Binary file not shown.
Binary file modified vectorDBpipe/pipeline/__pycache__/text_pipeline.cpython-311.pyc
Binary file not shown.