Docs/knowledge demo example by dolev31 · Pull Request #156 · cuga-project/cuga-agent

dolev31 · 2026-04-19T08:47:08Z

Summary

This PR adds a new hands-on example under docs/examples/knowledge_demo/ that demonstrates CUGA's knowledge engine end to end through a realistic HR Benefits Assistant scenario.

It also fixes a blocking bug that prevented knowledge usage in SDK mode.

Why

The repo already documents the knowledge engine at a reference level in KNOWLEDGE_PIPELINE.md and the root README.md under the Knowledge Base section, but it does not include a walkable example that shows the difference between agent-level and session-level knowledge in practice.

That distinction is one of the most compelling parts of the feature, yet developers browsing docs/examples/ did not have a concrete example they could copy when integrating knowledge into their own agents.

What is included

This PR adds a fictional HR Benefits Assistant for Acme Corp.

Agent-level documents

Persistent, shared across conversations:

employee_handbook.md
health_insurance_plan.md
pto_policy.md
401k_plan.md

Session-level documents

Ephemeral, per-conversation documents for a single employee, Sarah Chen:

benefits_enrollment.md
pto_balance.md
march_2026_pay_stub.md

All numbers in the example, including IRS limits, accrual rates, and premiums, are realistic for 2026 so the agent's answers are easy to verify.
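The agent-level versus session-level split can be pictured with a small in-memory stand-in. This is only a sketch: `ScopedKnowledge`, its methods, and the keyword-overlap scoring are hypothetical and are not CUGA's knowledge API, and the document texts are short placeholders rather than the contents of the sample files.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ScopedKnowledge:
    """Toy store: agent docs are shared, session docs live under one thread_id."""

    agent_docs: dict[str, str] = field(default_factory=dict)
    session_docs: dict[str, dict[str, str]] = field(default_factory=dict)

    def ingest(self, name: str, text: str, thread_id: str | None = None) -> None:
        # No thread_id means agent scope (persistent, shared across conversations).
        if thread_id is None:
            self.agent_docs[name] = text
        else:
            self.session_docs.setdefault(thread_id, {})[name] = text

    def search(self, query: str, thread_id: str | None = None) -> list[str]:
        # A session search sees the agent docs plus that session's own docs.
        visible = dict(self.agent_docs)
        if thread_id is not None:
            visible.update(self.session_docs.get(thread_id, {}))
        words = set(query.lower().split())
        scored = [
            (len(words & set(text.lower().split())), name)
            for name, text in visible.items()
        ]
        # Keep only documents that share at least one word with the query.
        return [name for score, name in sorted(scored, reverse=True) if score > 0]


kb = ScopedKnowledge()
kb.ingest("pto_policy.md", "vacation carryover limit is 40 hours")
kb.ingest("pto_balance.md", "remaining pto balance for sarah chen", thread_id="sarah")

shared_hits = kb.search("carryover limit")
sarah_hits = kb.search("pto balance", thread_id="sarah")
other_hits = kb.search("pto balance", thread_id="someone-else")
```

In this toy model, a different conversation never sees Sarah's balance document, while the policy document is visible everywhere.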

Integration paths shown in the example

Both paths are documented in the example README.md.

Path A: SDK (`main.py`)

Demonstrates the full programmatic surface using:

CugaAgent(enable_knowledge=True)
agent.knowledge.ingest(...)
agent.knowledge.search(...)
agent.invoke(...)

Path B: UI (`cuga start demo_knowledge`)

Demonstrates the UI flow:

upload agent-level docs through the manage page
publish the agent
attach session-level docs in chat
run demo prompts that require cross-scope reasoning

Demo prompts included

The example README.md includes prompts that exercise different retrieval scopes:

"What's the vacation carryover limit?"
Agent-level only
"How many PTO days do I have left?"
Session-level only
"Based on my pay stub and HSA enrollment, am I on track to max out my HSA?"
Combines agent-level IRS limit and employer contribution rules with Sarah's YTD payroll deductions
"Will I have enough PTO for a 10-day July vacation?"
Combines policy with Sarah's personal balance and accrual rate

Conventions

The example follows the same conventions as:

docs/examples/cuga_as_mcp/
docs/examples/cuga_with_runtime_tools/

Specifically:

path dependency on the root cuga
shared lockfile usage
.env.example
uv run --project ../../../ main.py

Bug fix included

Problem

KnowledgeClient.get_langchain_tools() generated 7 tool wrapper functions whose signatures did not accept a thread_id keyword argument.

At runtime, cuga_lite_graph.py injects thread_id, which caused SDK-mode calls to agent.invoke() with knowledge enabled to crash with:

TypeError: knowledge_search_knowledge() got an unexpected keyword argument 'thread_id'
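The failure mode can be reproduced in plain Python without CUGA installed. In this sketch the wrapper body and `call_tool_with_runtime_context` are hypothetical stand-ins; only the function name and the `thread_id` keyword come from the traceback above.

```python
def knowledge_search_knowledge(query: str, limit: int = 5) -> str:
    """Stand-in for a generated tool wrapper with a fixed signature."""
    return f"searched {query!r} (limit={limit})"


def call_tool_with_runtime_context(tool, tool_args: dict, thread_id: str):
    # Hypothetical caller: mimics a runtime that adds thread_id to every tool call.
    return tool(**tool_args, thread_id=thread_id)


try:
    call_tool_with_runtime_context(
        knowledge_search_knowledge, {"query": "carryover"}, thread_id="t-1"
    )
    error_message = None
except TypeError as exc:
    # The wrapper has no thread_id parameter and no **kwargs catch-all.
    error_message = str(exc)
```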

Replace OpenRAG with a built-in knowledge engine supporting:
- Vector store abstraction (SQLite default, Milvus, PGVector)
- Docling document parsing (PDF, DOCX, HTML, etc.)
- Session-scoped and agent-scoped document management
- Knowledge MCP server, REST routes, and frontend panels
- Configurable RAG profiles (speed/standard/balanced/max_quality)
- Backward-compatible deprecation of OpenRAGClient

Includes rebuilt frontend bundles reflecting knowledge UI additions.

Merge main into knowledge feature branch to incorporate:
- Out-of-the-box agents (demo_docs, demo_health, oak_health)
- Authorization flow improvements
- Agent name customization
- FastMCP security bump
- Docker/deployment updates
- Frontend dist relocation to src/cuga/frontend/dist

Knowledge features preserved and integrated with new demo presets.

…he with policies

Replace langchain-community FastEmbedEmbeddings with a thin adapter that uses fastembed.TextEmbedding directly and reuses the model cache from embedding_service. Keeps langchain-community for SQLiteVec and Ollama.

…ai/ollama)
- fastembed: default, uses fastembed.TextEmbedding directly
- openai: requires api_key (config or env), supports custom base_url
- ollama: supports custom base_url, defaults to localhost:11434
- Remove huggingface as provider (auto-migrated to fastembed)
- Add api_key and base_url config fields for embeddings
- Clear error messages when config is missing

…i, ollama

Four distinct providers, each logically correct:
- fastembed: default, lightweight local embeddings (installed with cuga)
- huggingface: sentence-transformers (optional dep, clear install instructions)
- openai: API-based, requires api_key config or OPENAI_API_KEY env var
- ollama: local server, configurable base_url
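The provider rules described in these commit messages can be sketched as a small resolver. Everything here is illustrative: `EmbeddingSettings` and `resolve_embedding_settings` are hypothetical names, not the project's config classes, and the `http://` scheme on the Ollama default is an assumption (the commit only says `localhost:11434`).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EmbeddingSettings:
    provider: str = "fastembed"  # fastembed is the stated default
    api_key: str | None = None
    base_url: str | None = None


def resolve_embedding_settings(
    cfg: EmbeddingSettings, env: dict[str, str]
) -> EmbeddingSettings:
    """Validate provider settings and fill in defaults, failing with a clear message."""
    if cfg.provider in ("fastembed", "huggingface"):
        # Local providers: no key or URL needed in this sketch.
        return cfg
    if cfg.provider == "openai":
        key = cfg.api_key or env.get("OPENAI_API_KEY")
        if not key:
            raise ValueError(
                "openai embeddings need api_key in config or OPENAI_API_KEY in the environment"
            )
        return EmbeddingSettings("openai", key, cfg.base_url)
    if cfg.provider == "ollama":
        # Assumed default URL; a configured base_url always wins.
        return EmbeddingSettings("ollama", None, cfg.base_url or "http://localhost:11434")
    raise ValueError(f"unknown embedding provider: {cfg.provider!r}")


ollama = resolve_embedding_settings(EmbeddingSettings(provider="ollama"), env={})
```

Passing the environment in as a plain dict keeps the function easy to test; a caller could pass `dict(os.environ)` in real use.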

…d, knowledge agent ID
- Replace Debug (bug) icon with Settings icon for Configuration section
- Fix conversation history not loading when clicking previous sessions (pass selectedThreadId to CarbonChat so threadId changes trigger history reload, and keep isReadonly=true so homescreen doesn't override loaded history)
- Fix knowledge documents not showing (setKnowledgeAgentId was using display name instead of actual agent ID)
- Fix "body stream already read" error (manageRes.json() was called twice)
- Restyle right panel section switcher with grounded tab-strip borders

…ead of ~/.cuga/

Move vector DB, metadata, uploaded files, session state, and auth token from ~/.cuga/ (shared home dir) to <cwd>/.cuga/ (per-project isolation), matching how policies already use <cwd>/.cuga/.

Also fix test_defaults assertion for embedding_provider (fastembed not auto).

…gine start
- Add `cuga start demo_knowledge` CLI mode (knowledge enabled from startup)
- Add `cuga start demo_knowledge --reset` to wipe all knowledge data
- Add `cuga stop demo_knowledge` handler
- Default knowledge to disabled (knowledge_settings.toml, KnowledgeConfig dataclass)
- Knowledge engine can be started on-demand via POST /api/knowledge/enable (called when user toggles knowledge ON in the manage UI)
- Extract initialize_knowledge_engine() as reusable async function on app_state (full init: engine + session provider + MCP server + token + warmup)
- Fix demo restart bug: no longer wipes knowledge data on every demo start (only wipes when --reset explicitly passed)
- Fix reset: delete WAL sidecar files, check flock before deleting, log warnings instead of silently swallowing errors
- Warn when --reset passed to non-knowledge demos
- Fix Dynaconf caching: force enabled=true in saved config for demo_knowledge (env var set after settings import is not picked up by Dynaconf cache)
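The reset behavior described above (remove the SQLite file together with its WAL sidecars, skip the wipe while the file is locked, and log rather than swallow errors) might look roughly like the following. This is a sketch under assumptions: `reset_database`, `is_locked`, and the `knowledge.db` file name are hypothetical, the `-wal`/`-shm` suffixes are SQLite's standard sidecar names, and `fcntl` makes it POSIX-only.

```python
import fcntl
import logging
import tempfile
from pathlib import Path

log = logging.getLogger("knowledge_reset")

SIDECAR_SUFFIXES = ("", "-wal", "-shm")  # SQLite main file plus WAL sidecars


def is_locked(path: Path) -> bool:
    """True if another open file description holds a flock on path."""
    with path.open("rb") as fh:
        try:
            fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(fh, fcntl.LOCK_UN)
        return False


def reset_database(db_path: Path) -> list[Path]:
    """Delete the database and its sidecars; return the paths actually removed."""
    removed: list[Path] = []
    if db_path.exists() and is_locked(db_path):
        log.warning("skipping reset: %s is locked by another process", db_path)
        return removed
    for suffix in SIDECAR_SUFFIXES:
        target = db_path.with_name(db_path.name + suffix)
        try:
            target.unlink()
            removed.append(target)
        except FileNotFoundError:
            continue  # a missing sidecar is normal, nothing to report
        except OSError as exc:
            log.warning("could not delete %s: %s", target, exc)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "knowledge.db"
    for suffix in SIDECAR_SUFFIXES:
        db.with_name(db.name + suffix).touch()
    removed_names = sorted(p.name for p in reset_database(db))
    leftovers = list(Path(tmp).iterdir())
```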

[skip ci]

# Conflicts:
# .gitignore
# .secrets.baseline
# pyproject.toml
# src/cuga/backend/cuga_graph/nodes/chat/chat_agent/chat_agent.py
# src/cuga/backend/cuga_graph/nodes/cuga_lite/cuga_lite_graph.py
# src/cuga/config.py
# src/cuga/frontend/dist/index.html
# uv.lock

Adds a new example under docs/examples/knowledge_demo/ that demonstrates the knowledge engine end-to-end with an HR Benefits Assistant scenario:
- 4 agent-level sample docs (handbook, health plan, PTO, 401(k))
- 3 session-level sample docs (one employee's enrollment, PTO balance, pay stub) that exercise cross-scope RAG reasoning
- main.py showing the programmatic SDK surface (agent.knowledge.ingest/search + agent.invoke)
- README covering both Path A (SDK) and Path B (cuga start demo_knowledge)
- pyproject.toml + .python-version + .env.example matching the cuga_as_mcp / cuga_with_runtime_tools example convention

Also fixes a blocking bug in src/cuga/backend/knowledge/client.py: get_langchain_tools() generated tool functions that didn't accept the `thread_id` kwarg that cuga_lite_graph injects at runtime, which made any SDK-mode knowledge use via agent.invoke() crash with TypeError. Adding **_ to the 7 tool wrappers absorbs the injected kwarg while keeping the JSON schema (and thus the LLM-facing API) unchanged.

Root README.md gets two new links to the example (Knowledge Base section + Additional Resources list). .gitignore adds the local-only demo_knowledge_docs/ scratch folder.

Verified end-to-end:
- 69/69 knowledge unit tests pass
- Path A clean run answers "40 hours (5 days)" for PTO carryover, session search returns sarah_chen_pto_balance.md as top hit
- Path B: cuga start demo_knowledge boots, /api/knowledge/health reports healthy+ready, HTTP upload + search both succeed

sami-marreed · 2026-05-14T09:54:03Z

Merge `main` and resolve conflicts (blocking)

Please merge or rebase onto the current main, resolve conflicts (this branch has historically overlapped .gitignore, pyproject.toml, cuga_lite_graph.py, client.py, lockfiles, etc.), then push so the PR is mergeable and CI reflects an integrated tree. GitHub currently shows the branch as conflicting with main.

Trim unrelated churn (high priority)

pyproject.toml adds openlit>=1.40.1 to the core dependencies list. On main, OpenLit is already handled via optional observability / overrides — this looks out of scope for a docs + knowledge-tool wrapper fix. Unless there is an explicit decision to make OpenLit a default dependency, drop this hunk (or split to a separate PR with rationale).
src/cuga/frontend/dist/main.*.js (~1 MB bundle) appears in the branch diff vs main. A knowledge demo + KnowledgeClient fix should normally not rebuild/commit frontend artifacts unless something here truly requires it — please remove unless tied to a concrete UI change for this PR.
.gitignore adds docs/examples/demo_apps/file_system/build/ alongside demo_knowledge_docs/. The demo_apps path looks unrelated to the knowledge demo — consider dropping it or moving it to a separate maintenance PR.
cuga_lite_graph.py diff vs main is only extra blank lines. Please revert that noise so history stays clean.

Code fix (`KnowledgeClient.get_langchain_tools`)

Adding **_: Any (with the NOTE comment), so that injected kwargs like thread_id no longer raise TypeError, addresses the described SDK/runtime mismatch and is a sensible choice given the schema stability concerns.
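A minimal illustration of why a `**_: Any` catch-all fixes the call without widening the visible parameter list. The two functions and `named_parameters` are hypothetical; in particular, `named_parameters` is a toy schema builder that skips var-keyword parameters, and whether the real tool schema generation behaves the same way is taken from the PR's claim rather than checked here.

```python
import inspect
from typing import Any


def search_strict(query: str, limit: int = 5) -> str:
    return f"{query}:{limit}"


def search_tolerant(query: str, limit: int = 5, **_: Any) -> str:
    # NOTE: **_ swallows runtime-injected kwargs such as thread_id.
    return f"{query}:{limit}"


def named_parameters(func) -> list[str]:
    """Toy schema builder: list only the explicitly named parameters."""
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind is not inspect.Parameter.VAR_KEYWORD
    ]


# The injected keyword is accepted and ignored.
result = search_tolerant("carryover", thread_id="t-1")

# Both versions expose the same named parameters to a schema builder like the one above.
same_schema = named_parameters(search_strict) == named_parameters(search_tolerant)
```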

Docs example

The docs/examples/knowledge_demo/ layout (agent vs session docs, dual paths SDK/UI, .env.example explaining Dynaconf) aligns well with the stated goals; once conflicts and unrelated files are cleaned up, this part should be straightforward to land.

dolev31 and others added 30 commits April 5, 2026 17:08

fix: use require_chat_access for get_conversations (from main)

a5612be

fix: remove duplicate effectiveAgentId and rebuild frontend

eaf8f64

fix: use fastembed instead of sentence-transformers for knowledge emb…

d75b4df

…eddings sentence-transformers/torch were removed from the project in favor of fastembed. Switch knowledge engine to FastEmbedEmbeddings to match.

fix: use independent fastembed model instance for knowledge

4525fab

fix: default knowledge embedding provider to fastembed

d4a4a91

fix: default embedding model to BAAI/bge-small-en-v1.5 (matches polic…

11d2864

…ies)

perf: reuse Docling DocumentConverter across document loads

fcf3df6

Create the converter once (lazy) and pass it to DoclingLoader, avoiding model weight reloads on every document ingestion.
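The create-once, reuse-everywhere pattern from this commit can be sketched with a cached factory. `ExpensiveConverter` is a stand-in that only counts how often it is constructed; it is not Docling's `DocumentConverter`, and `get_converter` and `load_document` are hypothetical names.

```python
from functools import lru_cache


class ExpensiveConverter:
    """Stand-in for a heavy document converter; counts constructions."""

    instances = 0

    def __init__(self) -> None:
        ExpensiveConverter.instances += 1

    def convert(self, path: str) -> str:
        return f"parsed:{path}"


@lru_cache(maxsize=1)
def get_converter() -> ExpensiveConverter:
    # Built on the first call only; later calls reuse the cached instance.
    return ExpensiveConverter()


def load_document(path: str) -> str:
    # Every load shares the same converter instead of constructing a new one.
    return get_converter().convert(path)


parsed = [load_document(name) for name in ("pto_policy.md", "401k_plan.md")]
```

Nothing is constructed at import time, so a process that never ingests a document never pays the setup cost.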

fix: auto-fallback huggingface to fastembed when sentence-transformer…

1c81ad3

…s not installed

fix: ruff format and check

505e0b7

fix: address local vs prod integration and move to async for metadata

df51339

fix: address tests

58c35f3

Merge branch 'feat/knowledge-engine' of github.com:cuga-project/cuga-…

a338f9c

…agent into feat/knowledge-engine

fix: fix tests

923683a

fix: address windows issues

482a2ab

fix: restore extension files [skip ci]

b305089

fix: add authorization for manage knowledge endpoint

5f580ab

fix: address review comments

efbb6c9

fix: add demo knowledge option to docker entrypoint

c03dcfb

[skip ci]

fix: address review comments

6267801

fix: handle creation of .cuga folder [skip ci]

4adac77

fix: file permissions

d567a60

[skip ci]

fix: add missing lib [skip ci]

90de13f

sami-marreed and others added 3 commits April 7, 2026 22:36

fix: improve readme

cb76f51

[skip ci]

dolev31 requested a review from sami-marreed April 19, 2026 08:47

dolev31 wants to merge 33 commits into main from docs/knowledge-demo-example
