diff --git a/content/docs/architecture-section/overview.mdx b/content/docs/architecture-section/overview.mdx index 2d42e0b..4e096f9 100644 --- a/content/docs/architecture-section/overview.mdx +++ b/content/docs/architecture-section/overview.mdx @@ -76,9 +76,3 @@ The architecture follows a modular, graph-like structure that ensures task relia * **Human Oversight** – Critical decisions require human validation to prevent errors. * **State Recovery** – System can resume from any point if interrupted. * **Performance Monitoring** – Real-time metrics ensure optimal execution across web and API environments. - ---- - -👉 Next step could be to include an **inline Mermaid diagram** inside the README, so that the architecture is rendered directly on GitHub instead of just in the SVG. - -Want me to add that Mermaid diagram block so the README is fully self-contained? diff --git a/content/docs/customization/authentication.mdx b/content/docs/customization/authentication.mdx new file mode 100644 index 0000000..20c1edc --- /dev/null +++ b/content/docs/customization/authentication.mdx @@ -0,0 +1,86 @@ +--- +title: Authentication & Authorization +description: Optional OIDC/BFF authentication and role-based authorization for the CUGA server. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +CUGA's demo server is unauthenticated by default. For shared or multi-user deployments, you can enable OpenID Connect (OIDC) authentication using a Backend-for-Frontend (BFF) session cookie, optionally combined with role-based authorization. + +The full option list lives in the [Settings reference — Auth section](/docs/customization/settings-reference#auth). 
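
Role-based authorization of this shape typically grants access when the authenticated user holds at least one of the configured roles. A minimal stdlib sketch of that semantics — an illustration only, not CUGA's actual implementation:

```python
# Illustrative only: assumes "any matching role grants access" semantics.
# This is NOT CUGA's authorization code.
def has_required_role(user_roles: list[str], allowed_roles: list[str]) -> bool:
    """True if the user holds at least one allowed role."""
    return bool(set(user_roles) & set(allowed_roles))

chat_roles = ["ServiceOwner", "ServiceAdmin", "ServiceUser"]
manage_roles = ["ServiceOwner", "ServiceAdmin"]

print(has_required_role(["ServiceUser"], chat_roles))    # True  — chat allowed
print(has_required_role(["ServiceUser"], manage_roles))  # False — manage denied
```

Under these assumed semantics, a user holding only `ServiceUser` would pass the `chat_roles` check but fail the `manage_roles` check.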
+ +## Quick enable + +```toml +[auth] +enabled = true +authorization_enabled = true +manage_roles = ["ServiceOwner", "ServiceAdmin"] +chat_roles = ["ServiceOwner", "ServiceAdmin", "ServiceUser"] +session_cookie_name = "cuga_session" +session_max_age = 3600 +require_https = true +``` + +Then provide the OIDC client details via environment variables (none of them belong in `settings.toml`): + +```bash +export OIDC_ISSUER="https://issuer.example.com" +export OIDC_CLIENT_ID="cuga" +export OIDC_CLIENT_SECRET="..." +export OIDC_REDIRECT_URI="https://cuga.example.com/auth/callback" +``` + +## Authentication vs authorization + +| Setting | Effect | +|---------|--------| +| `enabled = true` | Users must log in via the IdP. Anonymous traffic is rejected. | +| `authorization_enabled = true` | Roles in `manage_roles` / `chat_roles` are enforced for protected endpoints. | +| `enabled = true`, `authorization_enabled = false` | Authenticated users can use the agent regardless of role. | + +### Where roles come from + +`role_token_source` controls which token CUGA inspects for the user's roles claim: + +| Value | Used when | +|-------|-----------| +| `"auto"` (default) | CUGA inspects the access token first, then falls back to the id_token, then the IAM proxy header. | +| `"id_token"` | Force roles to come from the OIDC id_token. | +| `"access_token"` | Force roles to come from the OIDC access token. | +| `"iam_proxy"` | Trust an upstream IAM proxy header (for deployments fronted by IBM Cloud / OpenShift IAM). | + +## Behind an IAM proxy + +```toml +[auth] +enabled = true +authorization_enabled = true +iam_proxy_url = "https://iam-proxy.internal" +iam_proxy_skip_verify = false +iam_proxy_ca_bundle = "/etc/cuga/iam-proxy-ca.pem" +role_token_source = "iam_proxy" +``` + +`iam_proxy_ca_bundle` and `OIDC_CA_BUNDLE` are independent — set both if your proxy and IdP use different internal CAs. + +## TLS termination + +When CUGA terminates TLS itself (i.e. 
there's no reverse proxy): + +```toml +[auth] +require_https = true +ssl_keyfile = "/etc/cuga/tls/key.pem" +ssl_certfile = "/etc/cuga/tls/cert.pem" +``` + +In Kubernetes / Ingress / OpenShift Route deployments leave these empty and let the platform handle TLS. + +## Optional: profile-token authorization workflow + +Combined with the [authorization workflow](https://github.com/cuga-project/cuga-agent) (cuga-agent PRs #60 and #92), authenticated users can opt-in to attach their own profile token to outbound tool calls. This lets the agent act _as_ the user when calling APIs that require user-level credentials, while still gating which tools are reachable via `manage_roles` / `chat_roles`. + + +Always set `require_https = true` (or terminate TLS upstream) when authentication is on — the BFF session cookie must never travel over plaintext. + diff --git a/content/docs/customization/context-summarization.mdx b/content/docs/customization/context-summarization.mdx new file mode 100644 index 0000000..87abb96 --- /dev/null +++ b/content/docs/customization/context-summarization.mdx @@ -0,0 +1,62 @@ +--- +title: Context Summarization +description: Automatically summarize older messages when the context window fills up — for both CugaAgent and CugaSupervisor. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +For long conversations, CUGA can roll older turns into a running summary so the LLM keeps the most useful context without blowing the window. + +The full option list lives in the [Settings reference — Context Summarization](/docs/customization/settings-reference#context-summarization). + +## Enable + +```toml +[context_summarization] +enabled = true +keep_last_n_messages = 10 +trim_tokens_to_summarize = 500 +summarization_model = "gpt-4o-mini" +trigger_fraction = 0.75 +``` + +With this configuration: + +- Summarization fires when the prompt would exceed **75 %** of the model's context window. +- The **last 10 messages** are always preserved verbatim. 
+- Older messages are condensed into ~**500 tokens** by `gpt-4o-mini`. + +## Trigger options + +You can use any combination of the three trigger conditions; whichever fires first wins. + +| Trigger | Use when | +|---------|----------| +| `trigger_fraction = 0.75` | You want the trigger to track the model's actual context window — recommended for production. | +| `trigger_tokens = 2000` | You want a fixed token cap regardless of model. | +| `trigger_messages = 20` | You want to summarize after a fixed number of turns (useful for testing). | + +If you set more than one, the **first** condition that becomes true triggers summarization. + +## Custom prompt + +By default LangChain's built-in summarization prompt is used. To override: + +```toml +[context_summarization] +custom_summary_prompt = "Provide a concise summary of the following conversation, preserving all numeric values and named entities: {messages}" +``` + +The `{messages}` placeholder is the only required variable. + +## Choice of summarization model + +`summarization_model` is independent of the agent's main model. Most users keep it on a small/cheap model (`gpt-4o-mini`, `claude-haiku`, etc.) — the goal is fast, lossy compression, not high reasoning. + +## Works with CugaSupervisor + +Context summarization applies to both `CugaAgent` and `CugaSupervisor` runs. Each delegated sub-agent invocation gets the summarized history just like a standalone agent. + + +Summarization is lossy by design. If your task depends on remembering every literal detail (e.g. exact figures from a document), prefer the [Knowledge Base](/docs/customization/knowledge) — it keeps the original document available for retrieval. 
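
The "whichever trigger fires first wins" precedence described above can be sketched as follows — an illustrative stdlib-only sketch, not CUGA's actual implementation:

```python
from typing import Optional

# Illustrative sketch of "first trigger wins" — not CUGA's implementation.
def should_summarize(
    prompt_tokens: int,
    n_messages: int,
    context_window: int,
    trigger_fraction: Optional[float] = None,
    trigger_tokens: Optional[int] = None,
    trigger_messages: Optional[int] = None,
) -> bool:
    """Return True as soon as any configured trigger condition is met."""
    if trigger_fraction is not None and prompt_tokens >= trigger_fraction * context_window:
        return True
    if trigger_tokens is not None and prompt_tokens >= trigger_tokens:
        return True
    if trigger_messages is not None and n_messages >= trigger_messages:
        return True
    return False

# With trigger_fraction = 0.75 and a 128k window, ~96k prompt tokens trigger:
print(should_summarize(100_000, 12, 128_000, trigger_fraction=0.75))  # True
```

Any unset trigger is simply skipped, so a configuration with only `trigger_fraction` behaves exactly like the recommended production setup.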
+ diff --git a/content/docs/customization/evolve.mdx b/content/docs/customization/evolve.mdx new file mode 100644 index 0000000..94a7d08 --- /dev/null +++ b/content/docs/customization/evolve.mdx @@ -0,0 +1,125 @@ +--- +title: Evolve Integration +description: Bring task-specific guidelines into CugaLite from altk-evolve, and save trajectories back after every run. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; + +[altk-evolve](https://pypi.org/project/altk-evolve/) is an Anthropic-style "tip generation" service that learns guidelines from past trajectories and surfaces them at the start of similar future tasks. CUGA can use Evolve in **CugaLite mode** to: + +- Inject task-specific guidelines into the system prompt before execution. +- Save the user/assistant trajectory after the run so future tasks benefit from what worked (or failed). + +The full settings list is in the [Settings reference — Evolve](/docs/customization/settings-reference#evolve). + +## How Evolve runs + +You have two options for how the Evolve MCP server starts: + + + +Let the CUGA MCP registry launch Evolve for you. In the Manager UI, add an MCP tool with: + +- **Name**: `evolve` +- **Connection type**: `Command (stdio)` +- **Command**: `uvx` +- **Args**: `--from altk-evolve --with setuptools<70 evolve-mcp` + +Add these env values in the same MCP tool UI: + +```bash +EVOLVE_BACKEND=postgres +EVOLVE_PG_HOST=localhost +EVOLVE_PG_PORT=5432 +EVOLVE_PG_USER=postgres +EVOLVE_PG_PASSWORD=postgres +EVOLVE_PG_DBNAME=evolve +EVOLVE_MODEL_NAME=Azure/gpt-4o +OPENAI_API_KEY=env://OPENAI_API_KEY +OPENAI_BASE_URL=env://OPENAI_BASE_URL +``` + +The `env://VAR` placeholders tell CUGA to read the actual values from its own environment at runtime. + +In `settings.toml`, leave `mode = "auto"` (or set `mode = "registry"`) and set `app_name = "evolve"`. 
+ + +Run Evolve yourself as an SSE server (useful for debugging): + +```bash +# From a checkout of altk-evolve: +uv sync --extra pgvector +evolve-mcp --transport sse --port 8201 +``` + +In `settings.toml`: + +```toml +[evolve] +enabled = true +url = "http://127.0.0.1:8201/sse" +mode = "direct" +``` + +`mode = "direct"` skips registry lookup entirely. + + + +## Enable in `settings.toml` + +```toml +[advanced_features] +lite_mode = true # Evolve only runs for CugaLite + +[evolve] +enabled = true +url = "http://127.0.0.1:8201/sse" +mode = "auto" +app_name = "evolve" +lite_mode_only = true +save_on_success = true +save_on_failure = true +async_save = true +timeout = 30.0 +``` + +## Try it + +```bash +cuga start demo_crm --sample-memory-data +``` + +Then run a CugaLite task, e.g.: + +``` +Identify the common cities between my cuga_workspace/cities.txt and cuga_workspace/company.txt +``` + +## What happens during a run + +1. CUGA derives a task description from the current sub-task (or the first user message). +2. CugaLite asks Evolve for relevant guidelines. +3. Returned guidelines are appended to the system prompt under an `Evolve Guidelines` section. +4. The task executes normally. +5. The user/assistant trajectory is saved back to Evolve after completion. + +## Tuning + +| Setting | Effect | +|---------|--------| +| `async_save = true` | Save in the background; doesn't block the response. | +| `save_on_success` | Only persist successful runs. | +| `save_on_failure` | Only persist failed runs. | +| `mode = "auto"` | Try registry first, fall back to direct SSE. | +| `mode = "registry"` | Force registry-managed Evolve. | +| `mode = "direct"` | Skip registry lookup; use `url`. | +| `lite_mode_only = true` | Disable Evolve for non-lite paths. | + + +If Evolve is unavailable, times out, or returns no guidance, CUGA continues normally — Evolve never blocks task execution. 
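
The guideline-injection step described under "What happens during a run" amounts to appending an `Evolve Guidelines` section to the system prompt. A hedged sketch of that behavior — the exact section formatting here is an assumption, not CUGA's real prompt assembly:

```python
# Sketch of the guideline-injection step. The section formatting is an
# assumption; CUGA's actual prompt assembly may differ.
def inject_guidelines(system_prompt: str, guidelines: list[str]) -> str:
    """Append an 'Evolve Guidelines' section; no guidance means no change."""
    if not guidelines:
        return system_prompt  # Evolve never blocks or alters the run
    bullet_block = "\n".join(f"- {g}" for g in guidelines)
    return f"{system_prompt}\n\nEvolve Guidelines\n{bullet_block}"

prompt = inject_guidelines("You are CUGA.", ["Prefer batch API calls."])
print("Evolve Guidelines" in prompt)  # True
print(inject_guidelines("You are CUGA.", "" or []))  # unchanged prompt
```

Note the empty-guidance path: when Evolve returns nothing, the prompt is passed through untouched, mirroring the "Evolve never blocks task execution" guarantee.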
+ + + +If you use Evolve's tip generation, make sure the Evolve MCP server's environment includes the required model settings (e.g. `EVOLVE_MODEL_NAME`, OpenAI/LiteLLM credentials). Otherwise `save_trajectory` may fail later with a model-access error even if the MCP connection itself works. + diff --git a/content/docs/customization/knowledge.mdx b/content/docs/customization/knowledge.mdx new file mode 100644 index 0000000..ad63cd0 --- /dev/null +++ b/content/docs/customization/knowledge.mdx @@ -0,0 +1,124 @@ +--- +title: Knowledge Base +description: Self-contained document ingestion and retrieval for CUGA agents using Docling and local vector stores. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; + +CUGA includes a built-in knowledge base powered by LangChain and local vector stores. **Docling** is integrated for document ingestion: it parses and normalizes PDFs, Office files, HTML, Markdown, images, and other supported types before chunking and embedding, so the pipeline stays self-contained with no external document services. + +When enabled, the agent can search, ingest, and manage documents — and it automatically becomes aware of what documents are available. + +## Enabling Knowledge + +Knowledge is **enabled by default** via `settings.toml` (see [Storage](/docs/customization/settings-reference#storage) for the embedding provider). To opt out for a specific agent in the SDK: + +```python +from cuga import CugaAgent + +agent = CugaAgent(tools=[...], enable_knowledge=False) +``` + +The SDK auto-injects knowledge tools and an awareness block into the agent prompt, so the agent knows what documents are available and how to search them. + +## Try the Demo + +```bash +cuga start demo_knowledge +``` + +This is the same surface as `cuga start demo_crm` but with the knowledge engine on — you can upload documents through the UI and query them. 
+ +## Programmatic Access + +```python +from cuga import CugaAgent +import asyncio + +agent = CugaAgent(enable_knowledge=True) + +async def main(): + # Ingest a document + await agent.knowledge.ingest("/path/to/quarterly_report.pdf") + + # The agent now automatically knows about this document + result = await agent.invoke("What does the report say about Q4 revenue?") + print(result.answer) + + # Direct search (skip the agent loop) + results = await agent.knowledge.search("Q4 revenue figures") + for r in results: + print(f"{r['filename']} (page {r['page']}): {r['text'][:100]}") + + # List documents + docs = await agent.knowledge.list_documents() + + # Clean up + await agent.aclose() + +asyncio.run(main()) +``` + +## Scopes + +Documents can be **agent-scoped** (the default — permanent and shared across conversations) or **session-scoped** (tied to a single thread). + + + +```python +# Permanent, shared across conversations +await agent.knowledge.ingest("/path/to/file.pdf", scope="agent") + +results = await agent.knowledge.search("query", scope="agent") +``` + + +```python +thread_id = "user-session-123" + +# Temporary, per-conversation +await agent.knowledge.ingest( + "/path/to/file.pdf", + scope="session", + thread_id=thread_id, +) + +results = await agent.knowledge.search( + "query", + scope="session", + thread_id=thread_id, +) +``` + + + +## Supported Document Types + +PDF, DOCX, XLSX, PPTX, HTML, Markdown, images, and more — anything Docling can parse. + +## Storage and Embeddings + +The knowledge backend is selected by the global `[storage].mode` setting: + +| Data | `mode = "local"` | `mode = "prod"` | +|------|------------------|-----------------| +| Knowledge vectors | `{knowledge.persist_dir}/knowledge_vectors.db` (vec0 tables per collection) | `storage.postgres_url` (pgvector) | +| Knowledge metadata | `{knowledge.persist_dir}/metadata.db` | Postgres tables `cuga_knowledge_meta_*`. Uploaded files still live under `persist_dir/files/`. 
| + +Embeddings are configured under `[storage.embedding]` and default to a local `BAAI/bge-small-en-v1.5` model (no OpenAI key required). See [Storage](/docs/customization/settings-reference#storage) for full options. + +The knowledge persistence directory defaults to `/.cuga/knowledge/` and can be overridden in `knowledge_settings.toml`. + +## Routing Knowledge Through CugaLite + +The `[advanced_features].force_lite_mode_apps` list defaults to `["knowledge"]`, so knowledge queries always run through CugaLite's faster execution path regardless of `lite_mode_tool_threshold`. To change this, edit `settings.toml`: + +```toml +[advanced_features] +force_lite_mode_apps = ["knowledge", "crm"] # add more apps as needed +``` + + +The agent's awareness block is rebuilt as documents are ingested or removed, so newly added documents are usable immediately on the next invocation. + diff --git a/content/docs/customization/llm-config.mdx b/content/docs/customization/llm-config.mdx index f002a08..19c8915 100644 --- a/content/docs/customization/llm-config.mdx +++ b/content/docs/customization/llm-config.mdx @@ -91,7 +91,7 @@ MODEL_NAME="gpt-4o" 1. Add to your `.env` file: ```bash # For Groq - # GENT_SETTING_CONFIG="settings.groq.toml" + # AGENT_SETTING_CONFIG="settings.groq.toml" # GROQ_API_KEY="XXXX" ``` diff --git a/content/docs/customization/memory.mdx b/content/docs/customization/memory.mdx index 5f0c193..f7bffc8 100644 --- a/content/docs/customization/memory.mdx +++ b/content/docs/customization/memory.mdx @@ -3,6 +3,16 @@ title: Memory & Learning description: Enable CUGA's memory system to learn from past interactions and improve over time --- +import { Callout } from 'fumadocs-ui/components/callout'; + + +The `mem0`-based memory system documented on this page (`enable_memory`, `enable_fact`, `memory_provider`, `memory` server port) was **removed from CUGA classic** in cuga-agent PR #153 (2026-04-23, _\"feat: remove memory support for cuga classic\"_). 
The settings still appear in older `settings.toml` files but no longer have any effect. + +Trajectory-based learning is now provided by the **[Evolve integration](/docs/customization/settings-reference#evolve)** for CugaLite, and per-conversation context is managed by **[Context Summarization](/docs/customization/settings-reference#context-summarization)**. Document- and knowledge-aware behavior is provided by the new **[Knowledge Base](/docs/customization/knowledge)** (Docling-powered). + +This page is kept as-is for users still running older CUGA versions. + + CUGA's memory system allows the agent to learn from past interactions, remember patterns, and improve performance on similar tasks over time. This creates a personalized, adaptive agent experience. ## Overview diff --git a/content/docs/customization/meta.json b/content/docs/customization/meta.json index 1abae1a..bd968db 100644 --- a/content/docs/customization/meta.json +++ b/content/docs/customization/meta.json @@ -11,6 +11,14 @@ "special-instructions", "tools", "cli-sdk", + "knowledge", + "context-summarization", + "evolve", + "storage", + "secrets-vault", + "authentication", + "observability", + "ui-branding", "memory", "e2b-sandbox", "settings-reference" diff --git a/content/docs/customization/observability.mdx b/content/docs/customization/observability.mdx new file mode 100644 index 0000000..5b71335 --- /dev/null +++ b/content/docs/customization/observability.mdx @@ -0,0 +1,53 @@ +--- +title: Observability (OpenLit) +description: OpenTelemetry-based LLM tracing, metrics, and logs via OpenLit. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +CUGA can emit OpenTelemetry traces, metrics, and logs for every LLM call using [OpenLit](https://github.com/openlit/openlit) — a drop-in OTel instrumentation for popular LLM SDKs. 
+ +## Install + +OpenLit ships as an optional extra: + +```bash +pip install "cuga[observability]" +# or with uv: +uv sync --group observability +``` + +## Configure + +```toml +[observability] +openlit = true +``` + +Point OpenLit at your OTLP collector via environment: + +```bash +export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318" +``` + +Common service-identifying env vars (`OTEL_SERVICE_NAME`, `OTEL_RESOURCE_ATTRIBUTES`) work as usual — CUGA does not override them. + +## Local testing stack + +The cuga-agent repo ships a docker-compose stack under `deployment/docker-compose/openlit/` containing an OTel Collector, Tempo (traces), Prometheus (metrics), and Grafana (UI). Start it, point CUGA at the collector, run a task, and you'll see per-call traces with prompt, model, token counts, latency, and cost. + +## What gets captured + +For each LLM invocation OpenLit records: + +- **Trace span** — start/end, duration, parent-child relationships across planner / shortlister / coder / reflection nodes. +- **Attributes** — model name, provider, temperature, prompt/response content (off by default — configure per OpenLit's docs), token usage (input/output/total), and any raised exception. +- **Metrics** — request counts, token counts, and latency histograms exported via OTLP. + +## Combining with Langfuse + +`langfuse_tracing = true` under `[advanced_features]` is independent of OpenLit and can be enabled in parallel — useful when you want both an OTel-native pipeline and a Langfuse dashboard. + + +OpenLit's instrumentation is opt-in per LLM SDK. CUGA enables instrumentation for the providers it ships with (OpenAI, LiteLLM, WatsonX). If you wire in a custom provider, follow [OpenLit's instrumentation docs](https://docs.openlit.io/) to enable it explicitly. 
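
To make the "what gets captured" list concrete, the per-call record can be pictured as a small structured object. The field names below are illustrative assumptions, not OpenLit's exact attribute schema:

```python
from dataclasses import dataclass, field

# Illustrative shape of the per-call data OpenLit-style instrumentation
# records. Field names are assumptions, not OpenLit's actual schema.
@dataclass
class LLMSpan:
    model: str
    provider: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    total_tokens: int = field(init=False)

    def __post_init__(self) -> None:
        # Token totals are derived, mirroring input/output/total usage metrics.
        self.total_tokens = self.input_tokens + self.output_tokens

span = LLMSpan("gpt-4o", "openai", 1200, 300, 850.0)
print(span.total_tokens)  # 1500
```

In the real pipeline these values travel as OTel span attributes and OTLP metrics rather than Python objects, but the captured dimensions are the same.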
+ diff --git a/content/docs/customization/secrets-vault.mdx b/content/docs/customization/secrets-vault.mdx new file mode 100644 index 0000000..799fd5f --- /dev/null +++ b/content/docs/customization/secrets-vault.mdx @@ -0,0 +1,100 @@ +--- +title: Secrets & Vault +description: Resolve secrets from environment variables or HashiCorp Vault — with KV v1/v2 and Kubernetes auth. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; +import { Tab, Tabs } from 'fumadocs-ui/components/tabs'; + +CUGA reads secrets at runtime from one of two backends: + +1. **Local** — environment variables (with optional UI overrides stored encrypted on disk). +2. **Vault** — HashiCorp Vault KV v1 or v2, with token or Kubernetes auth. + +The backend is selected via `[secrets].mode` in `settings.toml`. See the full option list in the [Settings reference](/docs/customization/settings-reference#secrets). + +## Local mode (default) + +```toml +[secrets] +mode = "local" +force_env = true +db_encryption_key_env = "CUGA_SECRET_KEY" +``` + +When `force_env = true`, CUGA always resolves from `os.environ` and ignores any UI overrides. Set `CUGA_SECRET_KEY` in the environment to a stable encryption key — it is used to encrypt UI-provided overrides on disk when `force_env = false`. + +## Vault mode + +### Token auth + +```toml +[secrets] +mode = "vault" +vault_addr = "https://vault.example.com:8200" +vault_auth_method = "token" +vault_token_env = "VAULT_TOKEN" +vault_mount = "secret" +vault_kv_version = "" # empty = KV v2 +vault_secret_path = "cuga/prod" +``` + +Then export the token: + +```bash +export VAULT_TOKEN="hvs.CAESI..." 
+``` + +### Kubernetes auth + +When CUGA runs in a Kubernetes pod, use the projected service-account JWT: + +```toml +[secrets] +mode = "vault" +vault_addr = "https://vault.example.com:8200" +vault_auth_method = "kubernetes" +vault_k8s_role = "cuga" +vault_k8s_mount_path = "kubernetes" +vault_k8s_jwt_path = "/var/run/secrets/kubernetes.io/serviceaccount/token" +vault_mount = "secret" +vault_secret_path = "cuga/prod" +``` + +The auth method, role, and secret path can also be set at runtime via `DYNACONF_SECRETS__VAULT_AUTH_METHOD` and `DYNACONF_SECRETS__VAULT_SECRET_PATH`. + +### TLS + +If your Vault server uses an internal CA: + +```toml +vault_cacert = "/etc/cuga/vault-root-ca.pem" +vault_skip_verify = false +``` + +`VAULT_CACERT` and `VAULT_SKIP_VERIFY` env vars also work. **Do not** disable verification in production. + +### Writing secrets back to Vault + +By default, CUGA reads secrets only: + +```toml +vault_write_enabled = false +``` + +Set to `true` only if you intend to manage secrets through CUGA's UI — most deployments should leave this off. + +## Referencing env-resolved secrets + +When configuring tools (e.g. an Evolve MCP server), pass `env://VAR_NAME` placeholders so values are read from the process environment at runtime: + +```bash +OPENAI_API_KEY=env://OPENAI_API_KEY +OPENAI_BASE_URL=env://OPENAI_BASE_URL +``` + +This pattern works whether secrets ultimately come from `os.environ` or are injected by Vault. + + +Never commit secrets to `settings.toml` or to git. Use environment variables, Vault, or your deployment's secret manager. + diff --git a/content/docs/customization/settings-reference.mdx b/content/docs/customization/settings-reference.mdx index b89640a..c5c09e4 100644 --- a/content/docs/customization/settings-reference.mdx +++ b/content/docs/customization/settings-reference.mdx @@ -75,7 +75,6 @@ Core feature configuration. 
```toml [features] cuga_mode = "balanced" -memory_provider = "mem0" ``` ### Options @@ -83,7 +82,8 @@ memory_provider = "mem0" | Option | Type | Default | Description | |--------|------|---------|-------------| | `cuga_mode` | String | `"balanced"` | Execution reasoning mode. Options: `"fast"`, `"balanced"`, `"accurate"`, `"save_reuse_fast"`, `"custom"`. Fast is quicker but less accurate; Accurate is slower but more precise; Save & Reuse caches workflows. | -| `memory_provider` | String | `"mem0"` | Memory system provider. Currently supports `"mem0"`. Used for learning from past errors and improving performance. | + +> **Note:** The legacy `memory_provider` key (mem0) was removed from CUGA classic in cuga-agent PR #153. See [Memory & Learning](/docs/customization/memory) for details and migration guidance. --- @@ -96,6 +96,7 @@ Advanced configuration flags for specialized behavior. # Benchmark and Evaluation web_arena_eval = false benchmark = "default" +appworld_final_answer_plain = true # Vision and Analysis use_vision = true @@ -116,13 +117,18 @@ use_extension = false # Planning and Optimization code_planner_enabled = true api_planner_hitl = false +reflection_enabled = false lite_mode = true lite_mode_tool_threshold = 70 +force_lite_mode_apps = ["knowledge"] shortlisting_tool_threshold = 35 - -# Memory and Learning -enable_memory = false -enable_fact = false +cuga_lite_enable_few_shots = true +cuga_lite_max_steps = 70 +cuga_lite_bind_tools_mode = "none" +cuga_lite_bind_tools_apps = [] +cuga_lite_bind_tools_include_find_tools = false +cuga_lite_nl_auto_continue = false +enable_todos = false # Workflows save_reuse_generate_html = false @@ -137,9 +143,19 @@ e2b_sandbox_ttl_buffer = 60 e2b_cleanup_on_create = true e2b_cleanup_frequency = 0 -# Limits +# Variable Lifecycle +sub_task_keep_last_n = 5 +code_executor_keep_last_n = -1 + +# Limits & Timeouts message_window_limit = 100 max_input_length = 5000 +tool_call_timeout = 30 +execution_output_max_length = 70000 + +# 
Misc +path_segment_index = 1 +force_autonomous_mode = false ``` ### Benchmark & Evaluation Options @@ -148,6 +164,7 @@ max_input_length = 5000 |--------|------|---------|-------------| | `web_arena_eval` | Boolean | `false` | Enable WebArena benchmark evaluation mode. For testing on WebArena benchmark suite. | | `benchmark` | String | `"default"` | Benchmark mode. Options: `"default"`, `"appworld"`, `"webarena"`. Controls evaluation settings. | +| `appworld_final_answer_plain` | Boolean | `true` | When `benchmark = "appworld"`, use plain `answer:` completion prompts (no JSON) for final formatting. | ### Vision & Analysis Options @@ -183,16 +200,18 @@ max_input_length = 5000 |--------|------|---------|-------------| | `code_planner_enabled` | Boolean | `true` | Enable code generation planning. Controls whether CUGA generates Python code for complex operations. | | `api_planner_hitl` | Boolean | `false` | Enable Human-in-the-Loop for API planner. Pauses at decision points requiring human approval. See [Human-in-the-Loop](/docs/guides/human-in-the-loop). | +| `reflection_enabled` | Boolean | `false` | Run an extra reflection pass after planning/execution to detect and recover from errors. | | `lite_mode` | Boolean | `true` | Enable CugaLite mode for simple API tasks. Automatically routes simple tasks to faster execution path. | | `lite_mode_tool_threshold` | Integer | `70` | Tool count threshold for CugaLite routing. If app has fewer than this many tools, use CugaLite. | -| `shortlisting_tool_threshold` | Integer | `35` | Threshold for enabling tool shortlisting. If total tools exceed this, enable intelligent tool filtering. | - -### Memory & Learning Options - -| Option | Type | Default | Description | -|--------|------|---------|-------------| -| `enable_memory` | Boolean | `false` | Enable memory system. Learn from past errors and improve over time. Requires `uv sync --group memory`. See [Memory](/docs/customization/memory). 
| -| `enable_fact` | Boolean | `false` | Enable fact checking. Verify agent outputs against known facts. | +| `force_lite_mode_apps` | Array<String> | `["knowledge"]` | App names that always run in CugaLite regardless of `lite_mode_tool_threshold` (e.g. `["knowledge", "crm"]`). | +| `shortlisting_tool_threshold` | Integer | `35` | Threshold for enabling tool shortlisting. If total tools exceed this, enable intelligent tool filtering (`find_tools`). | +| `cuga_lite_enable_few_shots` | Boolean | `true` | MCP few-shots: prompt block + few-shot chat prefix in CugaLite. Set `false` to disable. | +| `cuga_lite_max_steps` | Integer | `70` | Maximum number of steps (call_model + sandbox cycles) in CugaLite before returning an error. | +| `cuga_lite_bind_tools_mode` | String | `"none"` | How CugaLite binds tools to the model. Options: `"none"`, `"all"`, `"apps"`. (Per-model overrides live in `model_runtime_profile.py`.) | +| `cuga_lite_bind_tools_apps` | Array<String> | `[]` | When `cuga_lite_bind_tools_mode = "apps"`, list of app names to bind (e.g. `["crm", "slack"]`). | +| `cuga_lite_bind_tools_include_find_tools` | Boolean | `false` | When binding tools, also bind the `find_tools` StructuredTool alongside `all`/`apps`. | +| `cuga_lite_nl_auto_continue` | Boolean | `false` | When the model returns NL with no code, classify interim vs final; if interim, simulate a user `continue` and re-call the model. | +| `enable_todos` | Boolean | `false` | Enable the todos feature for managing complex multi-step tasks. | ### Workflow Options @@ -215,12 +234,28 @@ See [E2B Cloud Sandbox](/docs/customization/e2b-sandbox) for detailed E2B config | `e2b_cleanup_on_create` | Boolean | `true` | Run cleanup when creating new sandboxes. Prevents sandbox accumulation. | | `e2b_cleanup_frequency` | Integer | `0` | Check all sandboxes every N get_or_create calls. 0 = only on create. Higher values reduce cleanup overhead. 
| -### Limit Options +### Variable Lifecycle Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `sub_task_keep_last_n` | Integer | `5` | Number of most recent generated variables to keep when executing sub-tasks. | +| `code_executor_keep_last_n` | Integer | `-1` | Variables retained after code execution. `-1` keeps all; positive integers keep the last N. | + +### Limit & Timeout Options | Option | Type | Default | Description | |--------|------|---------|-------------| | `message_window_limit` | Integer | `100` | Maximum messages to keep in conversation history. Older messages discarded when exceeded. Reduces context size. | | `max_input_length` | Integer | `5000` | Maximum character length for user input. Prevents abuse and excessive processing. | +| `tool_call_timeout` | Integer (seconds) | `30` | Timeout for tool/API calls (sandbox operations). Raises `TimeoutError` when exceeded. | +| `execution_output_max_length` | Integer | `70000` | Maximum characters returned in execution output. Prevents token overflow on very large tool responses. | + +### Misc Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `path_segment_index` | Integer | `1` | Which path segment to use for OpenAPI operation naming (1 = first, 2 = second, etc.). | +| `force_autonomous_mode` | Boolean | `false` | Force fully autonomous execution (no HITL prompts) regardless of other settings. | --- @@ -232,17 +267,20 @@ Controls service ports and URLs for all CUGA services. 
[server_ports] registry = 8001 demo = 7860 +demo_server_startup_max_retries = 420 apis_url = 9000 crm_api = 8007 saved_flows = 8003 environment_url = 8000 +filesystem_mcp = 8112 +docs_mcp = 8113 digital_sales_api = 8000 mcp_server = 8000 petstore_api = 8081 graph_visualization = 8080 orchestrate_url = 4321 trm_url = 8080 -memory = 8888 +oak_health_api = 8090 ``` ### Options @@ -251,17 +289,22 @@ memory = 8888 |--------|------|---------|-------------| | `registry` | Integer | `8001` | API registry service port. Where CUGA tools and APIs are exposed. | | `demo` | Integer | `7860` | CUGA demo interface port. Open browser to http://localhost:7860 | +| `demo_server_startup_max_retries` | Integer | `420` | CLI `cuga start` polls the demo Uvicorn process every ~0.5s up to this many times before timing out (default ≈ 3 minutes). | | `apis_url` | Integer | `9000` | APIs service port. (Rarely used) | -| `crm_api` | Integer | `8007` | CRM demo application port. Used in demo_crm mode. | +| `crm_api` | Integer | `8007` | CRM demo application port. Used in `cuga start demo_crm`. | | `saved_flows` | Integer | `8003` | Saved workflows service port. For Save & Reuse feature. | | `environment_url` | Integer | `8000` | Environment service port. Base configuration service. | +| `filesystem_mcp` | Integer | `8112` | Filesystem MCP server port. Used in the [Filesystem MCP demo](/docs/guides/filesystem-mcp-demo). | +| `docs_mcp` | Integer | `8113` | Docs MCP server port. | | `digital_sales_api` | Integer | `8000` | Digital Sales API port. Used in digital sales demo. | | `mcp_server` | Integer | `8000` | MCP server port for tool integration. | | `petstore_api` | Integer | `8081` | Petstore demo API port. Example API for testing. | | `graph_visualization` | Integer | `8080` | Graph visualization service port. For execution flow visualization. | | `orchestrate_url` | Integer | `4321` | Orchestration service port. 
(Enterprise only) | | `trm_url` | Integer | `8080` | Task/Routing/Management URL port. (Advanced) | -| `memory` | Integer | `8888` | Memory service port. Used when memory is enabled. | +| `oak_health_api` | Integer | `8090` | `cuga-oak-health` OpenAPI port. Used by `cuga start demo_health`. | + +> **Note:** The `memory = 8888` port has been removed. Memory support for CUGA classic was deprecated in cuga-agent PR #153. ### Advanced Port Configuration @@ -280,6 +323,324 @@ For E2B cloud sandbox, configure registry exposure: --- +## Supervisor + +Configures the multi-agent supervisor when running CUGA as a server. SDK-only usage (building a `CugaSupervisor` in Python) does not require this section. + +```toml +[supervisor] +enabled = false +config_path = "src/cuga/backend/tools_env/registry/config/supervisor_demo_crm.yaml" +agent_approval = true +pass_variables_a2a = false +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `enabled` | Boolean | `false` | Enable `CugaSupervisor` in the server. See [`CugaSupervisor` SDK doc](/docs/sdk/cuga_supervisor). | +| `config_path` | String | `""` | Path to the supervisor YAML config. If empty, uses the default supervisor setup. | +| `agent_approval` | Boolean | `true` | Require user approval before delegating to any sub-agent (human-in-the-loop). | +| `pass_variables_a2a` | Boolean | `false` | When `true`, the A2A delegate tool accepts variables and sends them in request metadata (A2A protocol extension). | + +The bundled multi-agent demo can be launched with: + +```bash +cuga start demo_supervisor +``` + +--- + +## Storage + +Selects the backend for policy vectors, knowledge vectors, and knowledge metadata. 
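The mode switch boils down to choosing between a local SQLite file and a Postgres URL, applied to all three data stores at once. As a rough sketch of that selection rule (a hypothetical helper, not CUGA's actual internals; names and defaults mirror the tables below):

```python
def resolve_storage_target(mode: str = "local", local_db_path: str = "",
                           postgres_url: str = "", dbs_dir: str = "dbs") -> str:
    """Pick the backing store per [storage]: 'local' resolves to a single
    SQLite file, 'prod' requires an explicit Postgres URL."""
    if mode == "local":
        # An empty local_db_path falls back to DBS_DIR/cuga.db
        return local_db_path or f"{dbs_dir}/cuga.db"
    if mode == "prod":
        if not postgres_url:
            raise ValueError("[storage].postgres_url is required when mode = 'prod'")
        return postgres_url
    raise ValueError(f"unknown [storage].mode: {mode!r}")
```

Note that `mode` gates policy vectors, knowledge vectors, and knowledge metadata together; there is no per-store override.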
+ +```toml +[storage] +mode = "local" +local_db_path = "" +postgres_url = "" + +[storage.embedding] +provider = "local" +model = "BAAI/bge-small-en-v1.5" +dim = 384 +base_url = "" +api_key = "" +``` + +### Storage modes + +| Data | `local` | `prod` | +|------|---------|--------| +| Policy vectors | sqlite-vec at `[policy].policy_db_path` or `storage.local_db_path` (defaults to `DBS_DIR/cuga.db`) | `storage.postgres_url` (pgvector) | +| Knowledge vectors | `{knowledge.persist_dir}/knowledge_vectors.db` (vec0 tables per collection) | `storage.postgres_url` | +| Knowledge metadata | `{knowledge.persist_dir}/metadata.db` | Postgres tables `cuga_knowledge_meta_*` (uploaded files stay under `persist_dir/files/`) | + +`DBS_DIR` defaults to the package `dbs/` directory or the value of the `CUGA_DBS_DIR` env var. + +### `[storage]` options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `mode` | String | `"local"` | `"local"` (sqlite/sqlite-vec) or `"prod"` (Postgres + pgvector). | +| `local_db_path` | String | `""` | Override path for the local SQLite DB. Empty = `DBS_DIR/cuga.db`. | +| `postgres_url` | String | `""` | Postgres connection URL. **Required** when `mode = "prod"`. | + +### `[storage.embedding]` options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `provider` | String | `"local"` | `"openai"`, `"local"`, or `"auto"` (tries OpenAI, falls back to local). | +| `model` | String | `"BAAI/bge-small-en-v1.5"` | Embedding model name. | +| `dim` | Integer | `384` | Embedding dimension. `1536` for OpenAI, `384` for `BAAI/bge-small-en-v1.5`. | +| `base_url` | String | `""` | Optional custom endpoint for an OpenAI-compatible embedding service. | +| `api_key` | String | `""` | Optional API key for the embedding endpoint. Falls back to `OPENAI_API_KEY`. | + +--- + +## Policy + +Configures the [policy system](/docs/sdk/policies). 
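One detail worth calling out is where the policy database lands on disk. The fallback chain described in the Storage section above can be sketched as (hypothetical helper, not the actual CUGA code):

```python
def resolve_policy_db(policy_db_path: str = "", local_db_path: str = "",
                      dbs_dir: str = "dbs") -> str:
    # Precedence: [policy].policy_db_path, then [storage].local_db_path,
    # then the DBS_DIR/cuga.db default.
    return policy_db_path or local_db_path or f"{dbs_dir}/cuga.db"
```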
+ +```toml +[policy] +enabled = true +collection_name = "cuga_policies" +policy_db_path = "" +playbook_refine = false +filesystem_sync = true +cuga_folder = ".cuga" +auto_load_policies = true +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `enabled` | Boolean | `true` | Enable the policy system (intent guards, playbooks, tool guides, tool approvals, output formatters). | +| `collection_name` | String | `"cuga_policies"` | Vector store collection name for policies. | +| `policy_db_path` | String | `""` | Optional explicit path for the policy DB. When empty, uses `storage.local_db_path`. | +| `playbook_refine` | Boolean | `false` | Enable playbook refinement based on user progress (requires an LLM call). | +| `filesystem_sync` | Boolean | `true` | Sync policies to/from the `.cuga` folder on disk. | +| `cuga_folder` | String | `".cuga"` | Path to the `.cuga` folder used for policy files. | +| `auto_load_policies` | Boolean | `true` | Automatically load policies from the `.cuga` folder on startup. | + +--- + +## Service + +Identifies the running CUGA instance, useful for multi-tenant or Kubernetes deployments. + +```toml +[service] +instance_id = "" +tenant_id = "" +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `instance_id` | String | `""` | Override with `DYNACONF_SERVICE__INSTANCE_ID` (e.g. K8s pod name, deployment id). | +| `tenant_id` | String | `""` | Multi-tenant SaaS tenant id. Override with `DYNACONF_SERVICE__TENANT_ID`. | + +--- + +## Secrets + +Selects how CUGA resolves secrets at runtime — local environment variables or HashiCorp Vault. 
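The `force_env` flag is the key switch here. A rough sketch of the resolution order (illustrative only: the precedence between UI overrides and the environment in local mode is an assumption, and real Vault access goes through a Vault client, not a dict):

```python
import os

def resolve_secret(name, mode="local", force_env=True, ui_overrides=None, vault=None):
    """Illustrative resolution order for the [secrets] section."""
    if force_env:
        # force_env short-circuits everything and reads only the process env
        return os.environ.get(name)
    if mode == "local":
        # assumed order: UI overrides first, then the environment
        return (ui_overrides or {}).get(name) or os.environ.get(name)
    if mode == "vault":
        return (vault or {}).get(name)
    raise ValueError(f"unknown [secrets].mode: {mode!r}")
```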
+ +```toml +[secrets] +mode = "local" +force_env = true +db_encryption_key_env = "CUGA_SECRET_KEY" +vault_addr = "" +vault_token_env = "VAULT_TOKEN" +vault_auth_method = "" +vault_k8s_role = "" +vault_k8s_mount_path = "kubernetes" +vault_k8s_jwt_path = "/var/run/secrets/kubernetes.io/serviceaccount/token" +vault_cacert = "" +vault_skip_verify = false +vault_mount = "secret" +vault_kv_version = "" +vault_secret_path = "" +vault_write_enabled = false +aws_region = "" +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `mode` | String | `"local"` | `"local"` (env vars / UI overrides) or `"vault"`. | +| `force_env` | Boolean | `true` | If `true`, always resolve from `os.environ` (ignores UI overrides and Vault). | +| `db_encryption_key_env` | String | `"CUGA_SECRET_KEY"` | Environment variable holding the encryption key for stored secrets. | +| `vault_addr` | String | `""` | Vault server URL (e.g. `https://vault.example.com:8200`). | +| `vault_token_env` | String | `"VAULT_TOKEN"` | Env var name that holds the Vault token (when `vault_auth_method = "token"`). | +| `vault_auth_method` | String | `""` | `""`, `"token"`, or `"kubernetes"`. Override with `DYNACONF_SECRETS__VAULT_AUTH_METHOD`. | +| `vault_k8s_role` | String | `""` | Vault role used by Kubernetes auth. | +| `vault_k8s_mount_path` | String | `"kubernetes"` | Mount path of the Kubernetes auth backend. | +| `vault_k8s_jwt_path` | String | `/var/run/.../token` | Path to the service-account JWT inside the pod. | +| `vault_cacert` | String | `""` | Path to a PEM bundle used to verify Vault TLS (env: `VAULT_CACERT`). | +| `vault_skip_verify` | Boolean | `false` | Dev only — disable TLS verification (env: `VAULT_SKIP_VERIFY`). | +| `vault_mount` | String | `"secret"` | KV mount path within Vault. | +| `vault_kv_version` | String | `""` | `"1"` or `"2"`. Empty defaults to KV v2; use `"1"` only for KV v1 mounts. 
|
+| `vault_secret_path` | String | `""` | Base path for secrets. Override with `DYNACONF_SECRETS__VAULT_SECRET_PATH`. |
+| `vault_write_enabled` | Boolean | `false` | Allow CUGA to write secrets back to Vault (most setups should leave this off). |
+| `aws_region` | String | `""` | Reserved for AWS Secrets Manager integration. |
+
+---
+
+## Auth
+
+Optional OIDC/BFF authentication for the CUGA server.
+
+```toml
+[auth]
+enabled = false
+authorization_enabled = false
+manage_roles = ["ServiceOwner", "ServiceAdmin"]
+chat_roles = ["ServiceOwner", "ServiceAdmin", "ServiceUser"]
+session_cookie_name = "cuga_session"
+session_max_age = 3600
+jwks_cache_ttl = 3600
+require_https = false
+ssl_keyfile = ""
+ssl_certfile = ""
+iam_proxy_url = ""
+iam_proxy_skip_verify = false
+iam_proxy_ca_bundle = ""
+role_token_source = "auto"
+```
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `enabled` | Boolean | `false` | Enable OIDC/BFF authentication on the demo server. |
+| `authorization_enabled` | Boolean | `false` | Enforce role-based authorization in addition to authentication. |
+| `manage_roles` | `Array<String>` | `["ServiceOwner", "ServiceAdmin"]` | Roles allowed to manage policies, tools, and configuration. |
+| `chat_roles` | `Array<String>` | `["ServiceOwner", "ServiceAdmin", "ServiceUser"]` | Roles allowed to chat with the agent. |
+| `session_cookie_name` | String | `"cuga_session"` | Name of the BFF session cookie. |
+| `session_max_age` | Integer (seconds) | `3600` | Session lifetime. |
+| `jwks_cache_ttl` | Integer (seconds) | `3600` | How long the IdP's signing key set (JWKS) is cached. |
+| `require_https` | Boolean | `false` | Reject non-HTTPS traffic (production). |
+| `ssl_keyfile` | String | `""` | Path to TLS private key (when terminating TLS in CUGA). |
+| `ssl_certfile` | String | `""` | Path to TLS certificate. |
+| `iam_proxy_url` | String | `""` | URL of an upstream IAM proxy in front of CUGA. 
|
+| `iam_proxy_skip_verify` | Boolean | `false` | Skip TLS verification against the IAM proxy (dev only). |
+| `iam_proxy_ca_bundle` | String | `""` | PEM bundle for IAM-proxy TLS (independent of the `OIDC_CA_BUNDLE` env var). |
+| `role_token_source` | String | `"auto"` | Where roles come from: `"auto"`, `"id_token"`, `"access_token"`, `"iam_proxy"`. |
+
+OIDC client/issuer/secret values are configured via environment variables — see [Environment Variable Overrides](#environment-variable-overrides).
+
+---
+
+## UI
+
+Customize the demo UI branding.
+
+```toml
+[ui]
+hide_cuga_logo = false
+brand_name = "CUGA Agent"
+```
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `hide_cuga_logo` | Boolean | `false` | Hide the CUGA logo in the header (e.g. when white-labelling). |
+| `brand_name` | String | `"CUGA Agent"` | App name shown in the header. |
+
+---
+
+## Context Summarization
+
+Automatically summarize older parts of the conversation when the context window starts to fill up.
+
+```toml
+[context_summarization]
+enabled = false
+keep_last_n_messages = 10
+trim_tokens_to_summarize = 500
+summarization_model = "gpt-4o-mini"
+trigger_fraction = 0.75
+# trigger_tokens = 2000
+# trigger_messages = 20
+# custom_summary_prompt = "Provide a concise summary of the conversation: {messages}"
+```
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `enabled` | Boolean | `false` | Enable intelligent context summarization. |
+| `keep_last_n_messages` | Integer | `10` | Number of recent messages preserved unsummarized. |
+| `trim_tokens_to_summarize` | Integer | `500` | Target token count for generated summaries. |
+| `summarization_model` | String | `"gpt-4o-mini"` | Model used to generate summaries (kept fast and cheap by default). |
+| `trigger_fraction` | Float | `0.75` | Trigger summarization at this fraction of the model's context window. 
| +| `trigger_tokens` | Integer | _(unset)_ | Alternative trigger: total tokens above this count. | +| `trigger_messages` | Integer | _(unset)_ | Alternative trigger: number of messages since the last summary. | +| `custom_summary_prompt` | String | _(unset)_ | Optional custom prompt template (uses LangChain default if not set). | + +--- + +## Connections + +Controls TLS for outbound LLM inference connections. + +```toml +[connections] +inference_ca_cert = "" +inference_disable_ssl = false +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `inference_ca_cert` | String | `""` | Path to a CA certificate for the inference HTTP clients (OpenAI and LiteLLM). Env: `CUGA_INFERENCE_CA_CERT`. | +| `inference_disable_ssl` | Boolean | `false` | Disable SSL verification for all inference connections (overrides `inference_ca_cert`). Env: `CUGA_DISABLE_SSL`. | + +--- + +## Observability + +Optional OpenLit / OpenTelemetry observability for LLM calls. + +```toml +[observability] +openlit = false +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `openlit` | Boolean | `false` | Enable OpenLit LLM observability via OpenTelemetry (OTLP). Requires `pip install cuga[observability]`. Configure the OTLP endpoint via `OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318`. | + +A local testing stack (OTel Collector + Tempo + Prometheus + Grafana) is provided under `deployment/docker-compose/openlit/` in the cuga-agent repo. + +--- + +## Evolve + +Optional integration with [altk-evolve](https://pypi.org/project/altk-evolve/) for trajectory-based learning in CugaLite. 
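The `mode` option documented below reduces to an ordered list of connection attempts. Sketched (illustrative names only, not the actual CUGA implementation):

```python
def evolve_connection_order(mode: str, sse_url: str) -> list:
    """Candidate transports per [evolve].mode: 'auto' tries the CUGA
    registry first and falls back to the direct SSE endpoint."""
    if mode == "registry":
        return ["registry"]
    if mode == "direct":
        return [sse_url]
    if mode == "auto":
        return ["registry", sse_url]  # registry first, SSE as fallback
    raise ValueError(f"unknown [evolve].mode: {mode!r}")
```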
+ +```toml +[evolve] +enabled = true +url = "http://127.0.0.1:8201/sse" +mode = "auto" +app_name = "evolve" +lite_mode_only = true +save_on_success = true +save_on_failure = true +async_save = true +timeout = 30.0 +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `enabled` | Boolean | `true` | Master toggle for Evolve integration. | +| `url` | String | `"http://127.0.0.1:8201/sse"` | SSE endpoint of a manually-run Evolve MCP server (used when `mode = "direct"` or as a fallback in `"auto"`). | +| `mode` | String | `"auto"` | `"auto"` = registry first then direct SSE fallback; `"registry"` = registry only; `"direct"` = direct SSE only. | +| `app_name` | String | `"evolve"` | MCP app/server name when Evolve is managed by the CUGA registry. | +| `lite_mode_only` | Boolean | `true` | Only activate Evolve for CugaLite mode. | +| `save_on_success` | Boolean | `true` | Save trajectory on successful task completion. | +| `save_on_failure` | Boolean | `true` | Save trajectory on failed task completion. | +| `async_save` | Boolean | `true` | Save trajectories in the background (non-blocking). | +| `timeout` | Float (seconds) | `30.0` | Timeout for Evolve MCP calls. 
| + +--- + ## Configuration Examples ### Fast Development Setup @@ -287,14 +648,12 @@ For E2B cloud sandbox, configure registry exposure: ```toml [features] cuga_mode = "fast" -memory_provider = "mem0" [advanced_features] use_vision = true code_planner_enabled = true api_planner_hitl = false lite_mode = true -enable_memory = false mode = 'api' [server_ports] @@ -307,23 +666,38 @@ registry = 8001 ```toml [features] cuga_mode = "accurate" -memory_provider = "mem0" [advanced_features] use_vision = true code_planner_enabled = true api_planner_hitl = true # Require approval for critical actions lite_mode = true -enable_memory = true # Learn from experience langfuse_tracing = true # Full observability mode = 'api' message_window_limit = 200 max_input_length = 10000 +[auth] +enabled = true +authorization_enabled = true +require_https = true + +[secrets] +mode = "vault" +vault_addr = "https://vault.example.com:8200" +vault_auth_method = "kubernetes" +vault_k8s_role = "cuga" + +[storage] +mode = "prod" +postgres_url = "postgresql+psycopg://user:pass@db:5432/cuga" + +[observability] +openlit = true # OTLP-based tracing + [server_ports] demo = 7860 registry = 8001 -memory = 8888 ``` ### E2B Cloud Execution Setup @@ -331,7 +705,6 @@ memory = 8888 ```toml [features] cuga_mode = "balanced" -memory_provider = "mem0" [advanced_features] e2b_sandbox = true @@ -351,10 +724,8 @@ function_call_host = "https://your-ngrok-url.ngrok.io" # E2B tunnel URL ```toml [features] cuga_mode = "save_reuse_fast" -memory_provider = "mem0" [advanced_features] -enable_memory = true save_reuse_generate_html = false # Disable for performance decomposition_strategy = "flexible" lite_mode = true @@ -364,7 +735,6 @@ code_planner_enabled = true demo = 7860 registry = 8001 saved_flows = 8003 -memory = 8888 ``` ### Web/Hybrid Mode Setup @@ -375,7 +745,6 @@ start_url = "https://example.com" [features] cuga_mode = "balanced" -memory_provider = "mem0" [advanced_features] mode = 'hybrid' # or 'web' for web-only 
@@ -434,6 +803,53 @@ All environment variables that can be used to configure CUGA: | `MAC_USER_DATA_PATH` | Chrome profile path on macOS | `~/Library/Application Support/Google/Chrome/AgentS` | | `WINDOWS_USER_DATA_PATH` | Chrome profile path on Windows | `C:/Users//AppData/Local/Google/Chrome/User Data/AgentS` | +#### Secrets & Vault + +| Variable | Description | Example | +|----------|-------------|---------| +| `CUGA_SECRET_KEY` | Encryption key for secrets stored by CUGA (matches `[secrets].db_encryption_key_env`). | | +| `VAULT_TOKEN` | Vault token (when `vault_auth_method = "token"`). | | +| `VAULT_CACERT` | Path to PEM bundle for Vault TLS. | | +| `VAULT_SKIP_VERIFY` | Disable Vault TLS verification (dev only). | `true` | +| `DYNACONF_SECRETS__VAULT_AUTH_METHOD` | Override `[secrets].vault_auth_method` at runtime. | `kubernetes` | +| `DYNACONF_SECRETS__VAULT_SECRET_PATH` | Override `[secrets].vault_secret_path`. | | + +#### Authentication (OIDC / BFF) + +| Variable | Description | +|----------|-------------| +| `OIDC_ISSUER` | OIDC issuer URL. | +| `OIDC_CLIENT_ID` | OIDC client id. | +| `OIDC_CLIENT_SECRET` | OIDC client secret. | +| `OIDC_REDIRECT_URI` | Callback URL registered with the IdP. | +| `OIDC_CA_BUNDLE` | Optional CA bundle for OIDC TLS (independent of `iam_proxy_ca_bundle`). | + +#### Service Identity + +| Variable | Description | +|----------|-------------| +| `DYNACONF_SERVICE__INSTANCE_ID` | Override `[service].instance_id` (e.g. K8s pod name). | +| `DYNACONF_SERVICE__TENANT_ID` | Override `[service].tenant_id` for multi-tenant deployments. | + +#### TLS for Inference + +| Variable | Description | +|----------|-------------| +| `CUGA_INFERENCE_CA_CERT` | CA cert for OpenAI/LiteLLM HTTP clients (overrides `[connections].inference_ca_cert`). | +| `CUGA_DISABLE_SSL` | Disable TLS verification for all inference connections (overrides `inference_disable_ssl`). 
| + +#### Storage + +| Variable | Description | +|----------|-------------| +| `CUGA_DBS_DIR` | Override `DBS_DIR` (default location for the local SQLite policy DB). | + +#### Observability + +| Variable | Description | Example | +|----------|-------------|---------| +| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP collector endpoint when `[observability].openlit = true`. | `http://localhost:4318` | + #### Server Ports (Dynaconf) Use `DYNACONF_SERVER_PORTS__` to override port settings: @@ -446,7 +862,8 @@ Use `DYNACONF_SERVER_PORTS__` to override port settings: | `DYNACONF_SERVER_PORTS__EMAIL_MCP` | Email MCP server port | `8000` | | `DYNACONF_SERVER_PORTS__EMAIL_SINK` | Email SMTP sink port | `1025` | | `DYNACONF_SERVER_PORTS__FILESYSTEM_MCP` | File System MCP port | `8112` | -| `DYNACONF_SERVER_PORTS__MEMORY` | Memory service port | `8888` | +| `DYNACONF_SERVER_PORTS__DOCS_MCP` | Docs MCP port | `8113` | +| `DYNACONF_SERVER_PORTS__OAK_HEALTH_API` | Health-demo OpenAPI port | `8090` | #### Example .env File @@ -508,7 +925,7 @@ kill -9 **Optimization Strategy**: 1. Enable `lite_mode = true` for simple tasks -2. Reduce `message_window_limit` to 50 if using memory +2. Reduce `message_window_limit` to 50 to keep prompts small 3. Disable `langfuse_tracing` unless needed 4. Use `fast` mode if accuracy isn't critical 5. Enable `save_reuse_fast` mode for repetitive tasks diff --git a/content/docs/customization/storage.mdx b/content/docs/customization/storage.mdx new file mode 100644 index 0000000..61e9943 --- /dev/null +++ b/content/docs/customization/storage.mdx @@ -0,0 +1,73 @@ +--- +title: Storage Backends +description: Choose between local SQLite and production Postgres for policy and knowledge data. +--- + +import { Callout } from 'fumadocs-ui/components/callout'; + +CUGA persists three things: **policies** (vectors + metadata), **knowledge documents** (vectors + metadata + uploaded files), and **knowledge tasks/settings**. 
The `[storage].mode` setting selects a single backend stack used for all three. + +## Modes + +```toml +[storage] +mode = "local" # "local" | "prod" +local_db_path = "" # default DBS_DIR/cuga.db when empty +postgres_url = "" # required when mode = "prod" +``` + +| Data | `local` | `prod` | +|------|---------|--------| +| Policy vectors | sqlite-vec at `[policy].policy_db_path` or `storage.local_db_path` (default `DBS_DIR/cuga.db`); table named after `[policy].collection_name`. | `storage.postgres_url` (pgvector). | +| Knowledge vectors | `{knowledge.persist_dir}/knowledge_vectors.db` (vec0 tables per collection). | `storage.postgres_url` (same DB). | +| Knowledge metadata (tasks, documents, collection_config, settings) | `{knowledge.persist_dir}/metadata.db`. Default `persist_dir` is `/.cuga/knowledge/`. | Postgres tables `cuga_knowledge_meta_*` on `storage.postgres_url`. Uploaded **files** still live under `persist_dir/files/`. | + +`DBS_DIR` defaults to the package's `dbs/` directory, or to the value of `CUGA_DBS_DIR` if set. `persist_dir` can be overridden in `knowledge_settings.toml`. + +## Local mode (default) + +Best for development, single-user demos, and small deployments. + +```toml +[storage] +mode = "local" +``` + +No external services required. SQLite + sqlite-vec keeps everything in a single file, so you can ship a working agent with `git`-cloneable state. + +## Production mode + +Best for shared deployments, multi-replica services, or anywhere you need transactional guarantees and proper backups. + +```toml +[storage] +mode = "prod" +postgres_url = "postgresql+psycopg://cuga:secret@db.internal:5432/cuga" +``` + +Postgres must have the **pgvector** extension enabled: + +```sql +CREATE EXTENSION IF NOT EXISTS vector; +``` + +CUGA creates the policy and knowledge tables on first startup. 
Uploaded knowledge files (the originals, not the parsed chunks) continue to live under `persist_dir/files/` — mount that directory on persistent storage in container deployments. + +## Embeddings + +Embeddings live in `[storage.embedding]` and are independent of the backend mode: + +```toml +[storage.embedding] +provider = "local" # "openai" | "local" | "auto" +model = "BAAI/bge-small-en-v1.5" +dim = 384 # 1536 for OpenAI, 384 for the BAAI model +base_url = "" # optional OpenAI-compatible endpoint +api_key = "" # falls back to OPENAI_API_KEY +``` + +`provider = "auto"` tries OpenAI and falls back to the local model if no API key is configured — handy when the same `settings.toml` ships across dev and prod. + + +Switching from `local` to `prod` does **not** migrate existing data. If you've been running with policies or knowledge documents in SQLite, export and re-import them in the new backend. + diff --git a/content/docs/customization/ui-branding.mdx b/content/docs/customization/ui-branding.mdx new file mode 100644 index 0000000..0d1459d --- /dev/null +++ b/content/docs/customization/ui-branding.mdx @@ -0,0 +1,33 @@ +--- +title: UI Branding +description: White-label the CUGA demo UI — hide the logo and change the displayed app name. +--- + +The demo UI exposes two simple branding hooks under the `[ui]` section of `settings.toml`. Use them when CUGA is embedded in a customer-facing product, evaluation environment, or internal tool that should carry your own branding. + +## Settings + +```toml +[ui] +hide_cuga_logo = false +brand_name = "CUGA Agent" +``` + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `hide_cuga_logo` | Boolean | `false` | Hide the CUGA logo in the demo header. | +| `brand_name` | String | `"CUGA Agent"` | App name shown in the header. | + +Both options take effect on the next demo restart (`cuga start demo`, `cuga start demo_crm`, etc.). 
+ +## Example + +```toml +[ui] +hide_cuga_logo = true +brand_name = "Acme Assistant" +``` + +This produces a header with no CUGA mark and "Acme Assistant" in place of the default product name. + +For deeper UI customization (custom logos, themes, or full white-label builds), build the demo frontend from source — see the cuga-agent repo for the build instructions. diff --git a/content/docs/getting-started/index.mdx b/content/docs/getting-started/index.mdx index f568db0..54cc917 100644 --- a/content/docs/getting-started/index.mdx +++ b/content/docs/getting-started/index.mdx @@ -117,7 +117,7 @@ CUGA is still early, but already provides useful building blocks: - **Python 3.12**: Core runtime environment - **UV**: Modern Python package management - **FastAPI**: High-performance web framework - - **Selenium/Playwright**: Browser automation capabilities + - **Playwright**: Browser automation (with a Chromium-based extension for web/hybrid modes) - **OpenAI/LiteLLM**: LLM integration for intelligent decision making - **Docker**: Containerized deployment and evaluation diff --git a/content/docs/sdk/cuga_agent.mdx b/content/docs/sdk/cuga_agent.mdx index 3575a4c..733900e 100644 --- a/content/docs/sdk/cuga_agent.mdx +++ b/content/docs/sdk/cuga_agent.mdx @@ -61,7 +61,7 @@ agent = CugaAgent( @@ -269,6 +274,34 @@ await compiled_graph.ainvoke(...) Access the `PoliciesManager`. See [Policies](../policies). +### `knowledge` + +Access the `KnowledgeManager` when the agent is constructed with `enable_knowledge=True` (the default). + +```python +await agent.knowledge.ingest("/path/to/quarterly_report.pdf") +results = await agent.knowledge.search("Q4 revenue figures") +docs = await agent.knowledge.list_documents() +``` + +Both `ingest` and `search` accept a `scope` argument (`"agent"` for permanent, shared documents — the default — or `"session"` for thread-scoped documents that require a `thread_id`). 
See the [Knowledge Base guide](/docs/customization/knowledge) for the full surface, supported document types, and storage details. + +## Resource Cleanup + +### `aclose` + +Release async resources held by the agent (DB connections, background tasks, etc.). Call this at the end of long-running scripts or before process exit: + +```python +agent = CugaAgent(enable_knowledge=True) +try: + result = await agent.invoke("...") +finally: + await agent.aclose() +``` + +Short-lived scripts can rely on garbage collection, but `aclose` is recommended any time `enable_knowledge=True` is used. + ## Tool Call Tracking CUGA provides built-in tool call tracking to help with debugging, observability, and auditing. When enabled, every tool invocation is recorded with detailed metadata. diff --git a/content/docs/sdk/cuga_supervisor.mdx b/content/docs/sdk/cuga_supervisor.mdx index a5c696f..1a0bb67 100644 --- a/content/docs/sdk/cuga_supervisor.mdx +++ b/content/docs/sdk/cuga_supervisor.mdx @@ -10,6 +10,16 @@ import { TypeTable } from 'fumadocs-ui/components/type-table'; The `CugaSupervisor` class coordinates multiple agents: it receives a user task, delegates work to specialized sub-agents, and returns a final answer. You can mix local `CugaAgent` instances with remote agents via the **A2A** protocol. +## Try the Demo + +The bundled CRM + email multi-agent demo can be launched with: + +```bash +cuga start demo_supervisor +``` + +This brings up the same demo surface as `demo_crm` but with the supervisor wired to a CRM sub-agent and an email sub-agent. Use it to see delegation and variable-passing end to end before building your own configuration. + ## Quick Start