6 changes: 0 additions & 6 deletions content/docs/architecture-section/overview.mdx
Expand Up @@ -76,9 +76,3 @@ The architecture follows a modular, graph-like structure that ensures task relia
* **Human Oversight** – Critical decisions require human validation to prevent errors.
* **State Recovery** – System can resume from any point if interrupted.
* **Performance Monitoring** – Real-time metrics ensure optimal execution across web and API environments.

---

👉 Next step could be to include an **inline Mermaid diagram** inside the README, so that the architecture is rendered directly on GitHub instead of just in the SVG.

Want me to add that Mermaid diagram block so the README is fully self-contained?
86 changes: 86 additions & 0 deletions content/docs/customization/authentication.mdx
@@ -0,0 +1,86 @@
---
title: Authentication & Authorization
description: Optional OIDC/BFF authentication and role-based authorization for the CUGA server.
---

import { Callout } from 'fumadocs-ui/components/callout';

CUGA's demo server is unauthenticated by default. For shared or multi-user deployments, you can enable OpenID Connect (OIDC) authentication using a Backend-for-Frontend (BFF) session cookie, optionally combined with role-based authorization.

The full option list lives in the [Settings reference — Auth section](/docs/customization/settings-reference#auth).

## Quick enable

```toml
[auth]
enabled = true
authorization_enabled = true
manage_roles = ["ServiceOwner", "ServiceAdmin"]
chat_roles = ["ServiceOwner", "ServiceAdmin", "ServiceUser"]
session_cookie_name = "cuga_session"
session_max_age = 3600
require_https = true
```

Then provide the OIDC client details via environment variables (none of them belong in `settings.toml`):

```bash
export OIDC_ISSUER="https://issuer.example.com"
export OIDC_CLIENT_ID="cuga"
export OIDC_CLIENT_SECRET="..."
export OIDC_REDIRECT_URI="https://cuga.example.com/auth/callback"
```

## Authentication vs authorization

| Setting | Effect |
|---------|--------|
| `enabled = true` | Users must log in via the IdP. Anonymous traffic is rejected. |
| `authorization_enabled = true` | Roles in `manage_roles` / `chat_roles` are enforced for protected endpoints. |
| `enabled = true`, `authorization_enabled = false` | Authenticated users can use the agent regardless of role. |

### Where roles come from

`role_token_source` controls which token CUGA inspects for the user's roles claim:

| Value | Used when |
|-------|-----------|
| `"auto"` (default) | CUGA inspects the access token first, then falls back to the id_token, then the IAM proxy header. |
| `"id_token"` | Force roles to come from the OIDC id_token. |
| `"access_token"` | Force roles to come from the OIDC access token. |
| `"iam_proxy"` | Trust an upstream IAM proxy header (for deployments fronted by IBM Cloud / OpenShift IAM). |
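In the `auto`, `id_token`, and `access_token` modes, the roles ultimately come from a claim inside a JWT. As a rough illustration of what "inspecting a token" means, here is a minimal sketch that decodes a JWT payload and reads a `roles` claim — the claim name, and CUGA's actual validation logic (which verifies signatures rather than trusting the payload), are assumptions here:

```python
import base64
import json

def roles_from_jwt(token: str, claim: str = "roles") -> list[str]:
    """Decode a JWT's payload (no signature check!) and read a roles claim.

    Illustration only -- CUGA's real claim name and validation may differ.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload.get(claim, [])

# A throwaway unsigned token, just for demonstration:
header = base64.urlsafe_b64encode(json.dumps({"alg": "none"}).encode()).rstrip(b"=")
body = base64.urlsafe_b64encode(
    json.dumps({"sub": "alice", "roles": ["ServiceUser"]}).encode()
).rstrip(b"=")
token = b".".join([header, body, b""]).decode()

print(roles_from_jwt(token))  # ['ServiceUser']
```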

## Behind an IAM proxy

```toml
[auth]
enabled = true
authorization_enabled = true
iam_proxy_url = "https://iam-proxy.internal"
iam_proxy_skip_verify = false
iam_proxy_ca_bundle = "/etc/cuga/iam-proxy-ca.pem"
role_token_source = "iam_proxy"
```

`iam_proxy_ca_bundle` and `OIDC_CA_BUNDLE` are independent — set both if your proxy and IdP use different internal CAs.

## TLS termination

When CUGA terminates TLS itself (i.e. there's no reverse proxy):

```toml
[auth]
require_https = true
ssl_keyfile = "/etc/cuga/tls/key.pem"
ssl_certfile = "/etc/cuga/tls/cert.pem"
```

In Kubernetes / Ingress / OpenShift Route deployments leave these empty and let the platform handle TLS.

## Optional: profile-token authorization workflow

Combined with the [authorization workflow](https://github.com/cuga-project/cuga-agent) (cuga-agent PRs #60 and #92), authenticated users can opt in to attaching their own profile token to outbound tool calls. This lets the agent act _as_ the user when calling APIs that require user-level credentials, while still gating which tools are reachable via `manage_roles` / `chat_roles`.

<Callout type="warning">
Always set `require_https = true` (or terminate TLS upstream) when authentication is on — the BFF session cookie must never travel over plaintext.
</Callout>
62 changes: 62 additions & 0 deletions content/docs/customization/context-summarization.mdx
@@ -0,0 +1,62 @@
---
title: Context Summarization
description: Automatically summarize older messages when the context window fills up — for both CugaAgent and CugaSupervisor.
---

import { Callout } from 'fumadocs-ui/components/callout';

For long conversations, CUGA can roll older turns into a running summary so the LLM keeps the most useful context without overflowing the context window.

The full option list lives in the [Settings reference — Context Summarization](/docs/customization/settings-reference#context-summarization).

## Enable

```toml
[context_summarization]
enabled = true
keep_last_n_messages = 10
trim_tokens_to_summarize = 500
summarization_model = "gpt-4o-mini"
trigger_fraction = 0.75
```

With this configuration:

- Summarization fires when the prompt would exceed **75%** of the model's context window.
- The **last 10 messages** are always preserved verbatim.
- Older messages are condensed into ~**500 tokens** by `gpt-4o-mini`.

## Trigger options

You can enable any combination of the three trigger conditions.

| Trigger | Use when |
|---------|----------|
| `trigger_fraction = 0.75` | You want the trigger to track the model's actual context window — recommended for production. |
| `trigger_tokens = 2000` | You want a fixed token cap regardless of model. |
| `trigger_messages = 20` | You want to summarize after a fixed number of turns (useful for testing). |

If you set more than one, the **first** condition that becomes true triggers summarization.

## Custom prompt

By default, CUGA uses LangChain's built-in summarization prompt. To override it:

```toml
[context_summarization]
custom_summary_prompt = "Provide a concise summary of the following conversation, preserving all numeric values and named entities: {messages}"
```

The `{messages}` placeholder is the only required variable.
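Assuming the placeholder follows standard Python-style template substitution (an assumption worth verifying against the settings reference), filling it looks like this — which would also mean any literal braces in your template need escaping as `{{` / `}}`:

```python
custom_summary_prompt = (
    "Provide a concise summary of the following conversation, "
    "preserving all numeric values and named entities: {messages}"
)

# The rolled-up older turns, joined into one block of text:
transcript = "\n".join([
    "user: What was Q3 revenue?",
    "assistant: Q3 revenue was $4.2M.",
])

prompt = custom_summary_prompt.format(messages=transcript)
print("{messages}" in prompt)  # False -- the placeholder was substituted
```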

## Choice of summarization model

`summarization_model` is independent of the agent's main model. Most users keep it on a small/cheap model (`gpt-4o-mini`, `claude-haiku`, etc.) — the goal is fast, lossy compression, not high reasoning.

## Works with CugaSupervisor

Context summarization applies to both `CugaAgent` and `CugaSupervisor` runs. Each delegated sub-agent invocation gets the summarized history just like a standalone agent.

<Callout type="info">
Summarization is lossy by design. If your task depends on remembering every literal detail (e.g. exact figures from a document), prefer the [Knowledge Base](/docs/customization/knowledge) — it keeps the original document available for retrieval.
</Callout>
125 changes: 125 additions & 0 deletions content/docs/customization/evolve.mdx
@@ -0,0 +1,125 @@
---
title: Evolve Integration
description: Bring task-specific guidelines into CugaLite from altk-evolve, and save trajectories back after every run.
---

import { Callout } from 'fumadocs-ui/components/callout';
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';

[altk-evolve](https://pypi.org/project/altk-evolve/) is an Anthropic-style "tip generation" service that learns guidelines from past trajectories and surfaces them at the start of similar future tasks. CUGA can use Evolve in **CugaLite mode** to:

- Inject task-specific guidelines into the system prompt before execution.
- Save the user/assistant trajectory after the run so future tasks benefit from what worked (or failed).

The full settings list is in the [Settings reference — Evolve](/docs/customization/settings-reference#evolve).

## How Evolve runs

You have two options for how the Evolve MCP server starts:

<Tabs items={['Registry-managed (recommended)', 'Standalone SSE server']}>
<Tab value="Registry-managed (recommended)">
Let the CUGA MCP registry launch Evolve for you. In the Manager UI, add an MCP tool with:

- **Name**: `evolve`
- **Connection type**: `Command (stdio)`
- **Command**: `uvx`
- **Args**: `--from altk-evolve --with setuptools<70 evolve-mcp`

Add these env values in the same MCP tool UI:

```bash
EVOLVE_BACKEND=postgres
EVOLVE_PG_HOST=localhost
EVOLVE_PG_PORT=5432
EVOLVE_PG_USER=postgres
EVOLVE_PG_PASSWORD=postgres
EVOLVE_PG_DBNAME=evolve
EVOLVE_MODEL_NAME=Azure/gpt-4o
OPENAI_API_KEY=env://OPENAI_API_KEY
OPENAI_BASE_URL=env://OPENAI_BASE_URL
```

The `env://VAR` placeholders tell CUGA to read the actual values from its own environment at runtime.
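A minimal sketch of that indirection (illustrative; CUGA's real resolver may handle missing variables or other prefixes differently):

```python
import os

def resolve_env_placeholder(value: str) -> str:
    """Resolve an `env://VAR` placeholder against the current environment.

    Sketch of the documented behaviour; not CUGA's actual implementation.
    """
    prefix = "env://"
    if value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    # Plain values (e.g. "postgres") pass through unchanged.
    return value

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(resolve_env_placeholder("env://OPENAI_API_KEY"))  # sk-demo
print(resolve_env_placeholder("postgres"))              # postgres
```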

In `settings.toml`, leave `mode = "auto"` (or set `mode = "registry"`) and set `app_name = "evolve"`.
</Tab>
<Tab value="Standalone SSE server">
Run Evolve yourself as an SSE server (useful for debugging):

```bash
# From a checkout of altk-evolve:
uv sync --extra pgvector
evolve-mcp --transport sse --port 8201
```

In `settings.toml`:

```toml
[evolve]
enabled = true
url = "http://127.0.0.1:8201/sse"
mode = "direct"
```

`mode = "direct"` skips registry lookup entirely.
</Tab>
</Tabs>

## Enable in `settings.toml`

```toml
[advanced_features]
lite_mode = true # Evolve only runs for CugaLite

[evolve]
enabled = true
url = "http://127.0.0.1:8201/sse"
mode = "auto"
app_name = "evolve"
lite_mode_only = true
save_on_success = true
save_on_failure = true
async_save = true
timeout = 30.0
```

## Try it

```bash
cuga start demo_crm --sample-memory-data
```

Then run a CugaLite task, e.g.:

```
Identify the common cities between my cuga_workspace/cities.txt and cuga_workspace/company.txt
```

## What happens during a run

1. CUGA derives a task description from the current sub-task (or the first user message).
2. CugaLite asks Evolve for relevant guidelines.
3. Returned guidelines are appended to the system prompt under an `Evolve Guidelines` section.
4. The task executes normally.
5. The user/assistant trajectory is saved back to Evolve after completion.
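Steps 2–3 amount to appending a guidelines section to the system prompt. A sketch under the assumption that guidelines arrive as plain strings (the exact section formatting inside CUGA may differ):

```python
def build_system_prompt(base_prompt: str, guidelines: list[str]) -> str:
    """Append Evolve guidelines to the system prompt, mirroring steps 2-3 above.

    Illustrative sketch; CUGA's actual section formatting may differ.
    """
    if not guidelines:
        # Evolve returning no guidance is normal -- the prompt is left untouched.
        return base_prompt
    section = "\n".join(f"- {g}" for g in guidelines)
    return f"{base_prompt}\n\n## Evolve Guidelines\n{section}"

prompt = build_system_prompt(
    "You are CugaLite.",
    ["Prefer absolute paths when reading files.", "Verify row counts before joining."],
)
print("Evolve Guidelines" in prompt)  # True
```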

## Tuning

| Setting | Effect |
|---------|--------|
| `async_save = true` | Save trajectories in the background without blocking the response. |
| `save_on_success = true` | Persist trajectories from successful runs. |
| `save_on_failure = true` | Persist trajectories from failed runs. |
| `mode = "auto"` | Try registry first, fall back to direct SSE. |
| `mode = "registry"` | Force registry-managed Evolve. |
| `mode = "direct"` | Skip registry lookup; use `url`. |
| `lite_mode_only = true` | Disable Evolve for non-lite paths. |

<Callout type="info">
If Evolve is unavailable, times out, or returns no guidance, CUGA continues normally — Evolve never blocks task execution.
</Callout>

<Callout type="warning">
If you use Evolve's tip generation, make sure the Evolve MCP server's environment includes the required model settings (e.g. `EVOLVE_MODEL_NAME`, OpenAI/LiteLLM credentials). Otherwise `save_trajectory` may fail later with a model-access error even if the MCP connection itself works.
</Callout>