feat(extensions): add semantic memory extension for agent knowledge sharing #421
Subhajitdas99 wants to merge 1 commit into GetBindu:main from
Conversation
📝 Walkthrough

A new semantic memory extension package is introduced with vector-based storage and retrieval functionality. The package includes embedding generation via the OpenAI API, in-memory entry storage, cosine similarity-based querying, and an optional dependency group for the required libraries.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant QueryFunc as query_memory()
    participant EmbedFunc as get_embedding()
    participant OpenAI as OpenAI API
    participant Store as SemanticMemoryStore
    Client->>QueryFunc: query_memory(query_text, store, top_k, threshold)
    QueryFunc->>Store: get_all()
    Store-->>QueryFunc: [MemoryEntry, ...]
    QueryFunc->>EmbedFunc: get_embedding(query_text)
    EmbedFunc->>OpenAI: embeddings request
    OpenAI-->>EmbedFunc: embedding vector
    EmbedFunc-->>QueryFunc: query_embedding
    Note over QueryFunc: For each stored entry:<br/>compute cosine_similarity()<br/>filter by threshold<br/>sort by score desc
    QueryFunc-->>Client: top_k results [MemoryEntry, ...]
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 checks passed
Actionable comments posted: 1
🧹 Nitpick comments (4)
bindu/extensions/semantic_memory/__init__.py (1)
25-31: Consider sorting `__all__` for consistency.

Ruff flags that `__all__` is not sorted. This is a minor style preference but aligns with isort conventions.

Suggested fix:

```diff
 __all__ = [
+    "MemoryEntry",
     "SemanticMemoryStore",
-    "MemoryEntry",
+    "cosine_similarity",
     "get_embedding",
     "query_memory",
-    "cosine_similarity",
 ]
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/semantic_memory/__init__.py` around lines 25-31: the `__all__` list is not sorted; update the `__all__` declaration so the exported names are alphabetically ordered (e.g., "cosine_similarity", "get_embedding", "MemoryEntry", "SemanticMemoryStore", "query_memory") to satisfy Ruff/isort style; locate the `__all__` variable and reorder the string entries accordingly.

bindu/extensions/semantic_memory/retriever.py (2)
28-29: Minor: Replace EN DASH with HYPHEN-MINUS in comments/docstrings.

Ruff flags ambiguous EN DASH characters (`–`) that should be standard hyphens (`-`). This prevents potential encoding issues.

Suggested fix:

```diff
-# Minimum cosine similarity to consider a memory relevant (0.0 – 1.0)
+# Minimum cosine similarity to consider a memory relevant (0.0 - 1.0)
```

And in the docstring at line 66:

```diff
-    threshold: Minimum cosine similarity score (0.0–1.0).
+    threshold: Minimum cosine similarity score (0.0-1.0).
```

Also applies to: 66-66
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/semantic_memory/retriever.py` around lines 28 - 29, Replace the ambiguous EN DASH characters with standard hyphen-minus in comments/docstrings: update the comment for DEFAULT_SIMILARITY_THRESHOLD and the docstring at the referenced location (around line with the docstring) to use "-" instead of "–" so encoding/linters stop flagging them; search for the EN DASH glyph in retriever.py and replace each occurrence with the ASCII hyphen-minus.
43-43: Add `strict=True` to `zip()` to catch dimension mismatches.

If embeddings have different dimensions (e.g., mixing models), `zip()` silently truncates to the shorter length, producing incorrect similarity scores. Adding `strict=True` will raise `ValueError` on length mismatch, making such bugs visible.

Suggested fix:

```diff
-    dot = sum(x * y for x, y in zip(a, b))
+    dot = sum(x * y for x, y in zip(a, b, strict=True))
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/semantic_memory/retriever.py` at line 43: the line computing the dot product uses zip(a, b), which silently truncates mismatched embedding lengths; change it to zip(a, b, strict=True) and, if desired, wrap that operation in a try/except catching ValueError to re-raise a clearer error that names the embeddings or model source so dimension mismatches are surfaced immediately.

bindu/extensions/semantic_memory/embeddings.py (1)
62-66: Consider making the embedding model configurable.

The model `text-embedding-3-small` is hardcoded. For flexibility, consider adding a `model` parameter with this as the default, allowing users to switch models without code changes.

Suggested signature change:

```diff
-def get_embedding(text: str, api_key: str | None = None, base_url: str | None = None) -> List[float]:
+def get_embedding(
+    text: str,
+    api_key: str | None = None,
+    base_url: str | None = None,
+    model: str = "text-embedding-3-small",
+) -> List[float]:
```

Then use `model=model` in the `embeddings.create()` call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindu/extensions/semantic_memory/embeddings.py` around lines 62 - 66, The embedding model is hardcoded in the client.embeddings.create call; make it configurable by adding a model parameter (defaulting to "text-embedding-3-small") to the surrounding function or initializer and pass that parameter into client.embeddings.create as model=model so callers can override the embedding model without changing code.
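One way to exercise the suggested `model` parameter without a network call is to inject a stub client. The stub classes and the `get_embedding` helper below are illustrative sketches that mimic the OpenAI client's `embeddings.create()` shape, not the extension's actual code:

```python
from typing import Any, List


class _StubEmbeddings:
    """Records the model name passed in and returns a canned vector."""

    def __init__(self) -> None:
        self.last_model = None

    def create(self, *, model: str, input: str) -> Any:
        self.last_model = model

        class _Datum:
            embedding = [0.1, 0.2, 0.3]

        class _Resp:
            data = [_Datum()]

        return _Resp()


class StubClient:
    """Stands in for openai.OpenAI in tests; only .embeddings is needed."""

    def __init__(self) -> None:
        self.embeddings = _StubEmbeddings()


def get_embedding(text: str, client: Any, model: str = "text-embedding-3-small") -> List[float]:
    # Pass the configurable model through to embeddings.create().
    resp = client.embeddings.create(model=model, input=text)
    return resp.data[0].embedding
```

With this shape, callers can switch to another embedding model by passing `model=...` instead of editing the module.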
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b85132fc-939f-41c5-930a-bd7de4f8e265
📒 Files selected for processing (5)
- bindu/extensions/semantic_memory/__init__.py
- bindu/extensions/semantic_memory/embeddings.py
- bindu/extensions/semantic_memory/memory_store.py
- bindu/extensions/semantic_memory/retriever.py
- pyproject.toml
```python
class SemanticMemoryStore:
    """Thread-safe in-memory store for agent knowledge.

    Uses instance-level storage instead of a global list, making it
    safe for use in async web workers where a shared global MEMORY_STORE
    would be visible across all requests and agents.

    For production use with persistence across restarts or workers,
    replace this with a vector database backend (e.g. pgvector, Qdrant).
    """
```
Misleading "Thread-safe" claim in docstring.
The docstring claims this is "Thread-safe" but the implementation uses a plain list without synchronization primitives. While list.append() happens to be atomic in CPython due to the GIL, this is an implementation detail and not guaranteed. The store is safe for async (single-threaded) usage, but the docstring should be corrected to avoid confusion.
Suggested docstring fix:

```diff
 class SemanticMemoryStore:
-    """Thread-safe in-memory store for agent knowledge.
+    """In-memory store for agent knowledge.

     Uses instance-level storage instead of a global list, making it
-    safe for use in async web workers where a shared global MEMORY_STORE
-    would be visible across all requests and agents.
+    safe for use within a single async context where a shared global
+    MEMORY_STORE would be visible across all requests and agents.
+
+    Note: This implementation is not thread-safe. For multi-threaded
+    usage, wrap access with appropriate synchronization.

     For production use with persistence across restarts or workers,
     replace this with a vector database backend (e.g. pgvector, Qdrant).
     """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindu/extensions/semantic_memory/memory_store.py` around lines 29 - 38, The
docstring for class SemanticMemoryStore incorrectly claims "Thread-safe" while
the implementation uses a plain list; update the class docstring to remove or
revise the "Thread-safe" claim and clearly state that the current in-memory
store is safe for single-threaded/async usage but is not safe for concurrent
access across threads, and mention how to make it thread-safe (e.g., wrap
accesses with threading.Lock) or recommend using a vector DB for production
persistence; reference the class name SemanticMemoryStore and any mentions of
MEMORY_STORE in the docstring to locate and change the wording.
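If multi-threaded access is ever needed, the note's advice can be followed with a plain `threading.Lock`. This is a sketch only; the method names (`add`, `get_all`) are assumed from the `get_all()` call shown in the review's sequence diagram, not taken from the actual store:

```python
import threading
from typing import Any, List


class LockedMemoryStore:
    """In-memory store whose mutations are guarded by a lock.

    Illustrative only: a minimal thread-safe variant of the
    instance-level store discussed in the review comment.
    """

    def __init__(self) -> None:
        self._entries: List[Any] = []
        self._lock = threading.Lock()

    def add(self, entry: Any) -> None:
        # Serialize writers so concurrent appends cannot interleave.
        with self._lock:
            self._entries.append(entry)

    def get_all(self) -> List[Any]:
        # Return a copy so callers can iterate without holding the lock.
        with self._lock:
            return list(self._entries)
```

Returning a copy from `get_all()` trades a little memory for the freedom to iterate the snapshot while other threads keep writing.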
We don't need this - let's discuss before making any changes.
Summary
Adds `bindu/extensions/semantic_memory/` with a class-based store, vector embeddings via OpenRouter, and cosine similarity retrieval. Addresses all review feedback from @chandan-1427 on #350.
Change Type
Scope
Linked Issue/PR
Changes from original #350 implementation
- Global `MEMORY_STORE = []` (unsafe across async workers) → `SemanticMemoryStore` class; each instance has isolated storage
- `[0.0] * 1536` dummy fallback (causes silent failures) → raises `ImportError`, `ValueError`, or `RuntimeError` with clear messages
- `openai` as core dependency → optional extra via `pip install bindu[semantic-memory]`

API
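The fail-fast behavior described above could look roughly like the sketch below. It is hedged throughout: the function name, the `OPENROUTER_API_KEY` variable, and the base URL are assumptions for illustration, not the extension's actual code.

```python
import os
from typing import List


def get_embedding_or_fail(text: str) -> List[float]:
    """Raise clear errors instead of returning a [0.0] * 1536 dummy vector."""
    # Validate input before touching any optional dependency.
    if not text.strip():
        raise ValueError("cannot embed empty text")

    try:
        # Optional dependency, installed via: pip install bindu[semantic-memory]
        from openai import OpenAI
    except ImportError as exc:
        raise ImportError(
            "openai is required: pip install 'bindu[semantic-memory]'"
        ) from exc

    api_key = os.environ.get("OPENROUTER_API_KEY")  # env var name is an assumption
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")

    # OpenRouter exposes an OpenAI-compatible endpoint, so the OpenAI
    # client can be pointed at it via base_url.
    client = OpenAI(api_key=api_key, base_url="https://openrouter.ai/api/v1")
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return resp.data[0].embedding
```

Raising immediately on empty input, a missing dependency, or a missing key makes misconfiguration visible at the call site instead of corrupting similarity scores with zero vectors.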
Installation
Security Impact
- No
- No - API key via env var only
- Yes - outbound only to OpenRouter embeddings API
- No - in-memory only
- No

Checklist
Summary by CodeRabbit
pip install bindu[semantic-memory]