6 changes: 0 additions & 6 deletions content/docs/architecture-section/overview.mdx
Expand Up @@ -76,9 +76,3 @@ The architecture follows a modular, graph-like structure that ensures task relia
* **Human Oversight** – Critical decisions require human validation to prevent errors.
* **State Recovery** – System can resume from any point if interrupted.
* **Performance Monitoring** – Real-time metrics ensure optimal execution across web and API environments.

---

👉 Next step could be to include an **inline Mermaid diagram** inside the README, so that the architecture is rendered directly on GitHub instead of just in the SVG.

Want me to add that Mermaid diagram block so the README is fully self-contained?
86 changes: 86 additions & 0 deletions content/docs/customization/authentication.mdx
@@ -0,0 +1,86 @@
---
title: Authentication & Authorization
description: Optional OIDC/BFF authentication and role-based authorization for the CUGA server.
---

import { Callout } from 'fumadocs-ui/components/callout';

CUGA's demo server is unauthenticated by default. For shared or multi-user deployments, you can enable OpenID Connect (OIDC) authentication using a Backend-for-Frontend (BFF) session cookie, optionally combined with role-based authorization.

The full option list lives in the [Settings reference — Auth section](/docs/customization/settings-reference#auth).

## Quick enable

```toml
[auth]
enabled = true
authorization_enabled = true
manage_roles = ["ServiceOwner", "ServiceAdmin"]
chat_roles = ["ServiceOwner", "ServiceAdmin", "ServiceUser"]
session_cookie_name = "cuga_session"
session_max_age = 3600
require_https = true
```

Then provide the OIDC client details via environment variables (none of them belong in `settings.toml`):

```bash
export OIDC_ISSUER="https://issuer.example.com"
export OIDC_CLIENT_ID="cuga"
export OIDC_CLIENT_SECRET="..."
export OIDC_REDIRECT_URI="https://cuga.example.com/auth/callback"
```

## Authentication vs authorization

| Setting | Effect |
|---------|--------|
| `enabled = true` | Users must log in via the IdP. Anonymous traffic is rejected. |
| `authorization_enabled = true` | Roles in `manage_roles` / `chat_roles` are enforced for protected endpoints. |
| `enabled = true`, `authorization_enabled = false` | Authenticated users can use the agent regardless of role. |

### Where roles come from

`role_token_source` controls which token CUGA inspects for the user's roles claim:

| Value | Used when |
|-------|-----------|
| `"auto"` (default) | CUGA inspects the access token first, then falls back to the id_token, then the IAM proxy header. |
| `"id_token"` | Force roles to come from the OIDC id_token. |
| `"access_token"` | Force roles to come from the OIDC access token. |
| `"iam_proxy"` | Trust an upstream IAM proxy header (for deployments fronted by IBM Cloud / OpenShift IAM). |
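In the `auto`, `id_token`, and `access_token` modes, the roles ultimately come from a claim inside a JWT. As a rough illustration of what "inspecting a token" means, here is a minimal sketch that decodes a JWT payload and reads a `roles` claim — the claim name, and CUGA's actual validation logic (which verifies signatures rather than trusting the payload), are assumptions here:

```python
import base64
import json

def roles_from_jwt(token: str, claim: str = "roles") -> list[str]:
    """Decode a JWT's payload (no signature check!) and read a roles claim.

    Illustration only -- CUGA's real claim name and validation may differ.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload.get(claim, [])

# A throwaway unsigned token, just for demonstration:
header = base64.urlsafe_b64encode(json.dumps({"alg": "none"}).encode()).rstrip(b"=")
body = base64.urlsafe_b64encode(
    json.dumps({"sub": "alice", "roles": ["ServiceUser"]}).encode()
).rstrip(b"=")
token = b".".join([header, body, b""]).decode()

print(roles_from_jwt(token))  # ['ServiceUser']
```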

## Behind an IAM proxy

```toml
[auth]
enabled = true
authorization_enabled = true
iam_proxy_url = "https://iam-proxy.internal"
iam_proxy_skip_verify = false
iam_proxy_ca_bundle = "/etc/cuga/iam-proxy-ca.pem"
role_token_source = "iam_proxy"
```

`iam_proxy_ca_bundle` and `OIDC_CA_BUNDLE` are independent — set both if your proxy and IdP use different internal CAs.

## TLS termination

When CUGA terminates TLS itself (i.e. there's no reverse proxy):

```toml
[auth]
require_https = true
ssl_keyfile = "/etc/cuga/tls/key.pem"
ssl_certfile = "/etc/cuga/tls/cert.pem"
```

In Kubernetes / Ingress / OpenShift Route deployments leave these empty and let the platform handle TLS.

## Optional: profile-token authorization workflow

Combined with the [authorization workflow](https://github.com/cuga-project/cuga-agent) (cuga-agent PRs #60 and #92), authenticated users can opt in to attaching their own profile token to outbound tool calls. This lets the agent act _as_ the user when calling APIs that require user-level credentials, while still gating which tools are reachable via `manage_roles` / `chat_roles`.

<Callout type="warning">
Always set `require_https = true` (or terminate TLS upstream) when authentication is on — the BFF session cookie must never travel over plaintext.
</Callout>
62 changes: 62 additions & 0 deletions content/docs/customization/context-summarization.mdx
@@ -0,0 +1,62 @@
---
title: Context Summarization
description: Automatically summarize older messages when the context window fills up — for both CugaAgent and CugaSupervisor.
---

import { Callout } from 'fumadocs-ui/components/callout';

For long conversations, CUGA can roll older turns into a running summary so the LLM keeps the most useful context without overflowing the context window.

The full option list lives in the [Settings reference — Context Summarization](/docs/customization/settings-reference#context-summarization).

## Enable

```toml
[context_summarization]
enabled = true
keep_last_n_messages = 10
trim_tokens_to_summarize = 500
summarization_model = "gpt-4o-mini"
trigger_fraction = 0.75
```

With this configuration:

- Summarization fires when the prompt would exceed **75%** of the model's context window.
- The **last 10 messages** are always preserved verbatim.
- Older messages are condensed into ~**500 tokens** by `gpt-4o-mini`.

## Trigger options

You can enable any combination of the three trigger conditions.

| Trigger | Use when |
|---------|----------|
| `trigger_fraction = 0.75` | You want the trigger to track the model's actual context window — recommended for production. |
| `trigger_tokens = 2000` | You want a fixed token cap regardless of model. |
| `trigger_messages = 20` | You want to summarize after a fixed number of turns (useful for testing). |

If you set more than one, the **first** condition that becomes true triggers summarization.

## Custom prompt

By default, CUGA uses LangChain's built-in summarization prompt. To override it:

```toml
[context_summarization]
custom_summary_prompt = "Provide a concise summary of the following conversation, preserving all numeric values and named entities: {messages}"
```

The `{messages}` placeholder is the only required variable.
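Assuming the placeholder follows standard Python-style template substitution (an assumption worth verifying against the settings reference), filling it looks like this — which would also mean any literal braces in your template need escaping as `{{` / `}}`:

```python
custom_summary_prompt = (
    "Provide a concise summary of the following conversation, "
    "preserving all numeric values and named entities: {messages}"
)

# The rolled-up older turns, joined into one block of text:
transcript = "\n".join([
    "user: What was Q3 revenue?",
    "assistant: Q3 revenue was $4.2M.",
])

prompt = custom_summary_prompt.format(messages=transcript)
print("{messages}" in prompt)  # False -- the placeholder was substituted
```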

## Choice of summarization model

`summarization_model` is independent of the agent's main model. Most users keep it on a small/cheap model (`gpt-4o-mini`, `claude-haiku`, etc.) — the goal is fast, lossy compression, not high reasoning.

## Works with CugaSupervisor

Context summarization applies to both `CugaAgent` and `CugaSupervisor` runs. Each delegated sub-agent invocation gets the summarized history just like a standalone agent.

<Callout type="info">
Summarization is lossy by design. If your task depends on remembering every literal detail (e.g. exact figures from a document), prefer the [Knowledge Base](/docs/customization/knowledge) — it keeps the original document available for retrieval.
</Callout>
125 changes: 125 additions & 0 deletions content/docs/customization/evolve.mdx
@@ -0,0 +1,125 @@
---
title: Evolve Integration
description: Bring task-specific guidelines into CugaLite from altk-evolve, and save trajectories back after every run.
---

import { Callout } from 'fumadocs-ui/components/callout';
import { Tab, Tabs } from 'fumadocs-ui/components/tabs';

[altk-evolve](https://pypi.org/project/altk-evolve/) is an Anthropic-style "tip generation" service that learns guidelines from past trajectories and surfaces them at the start of similar future tasks. CUGA can use Evolve in **CugaLite mode** to:

- Inject task-specific guidelines into the system prompt before execution.
- Save the user/assistant trajectory after the run so future tasks benefit from what worked (or failed).

The full settings list is in the [Settings reference — Evolve](/docs/customization/settings-reference#evolve).

## How Evolve runs

You have two options for how the Evolve MCP server starts:

<Tabs items={['Registry-managed (recommended)', 'Standalone SSE server']}>
<Tab value="Registry-managed (recommended)">
Let the CUGA MCP registry launch Evolve for you. In the Manager UI, add an MCP tool with:

- **Name**: `evolve`
- **Connection type**: `Command (stdio)`
- **Command**: `uvx`
- **Args**: `--from altk-evolve --with setuptools<70 evolve-mcp`

Add these env values in the same MCP tool UI:

```bash
EVOLVE_BACKEND=postgres
EVOLVE_PG_HOST=localhost
EVOLVE_PG_PORT=5432
EVOLVE_PG_USER=postgres
EVOLVE_PG_PASSWORD=postgres
EVOLVE_PG_DBNAME=evolve
EVOLVE_MODEL_NAME=Azure/gpt-4o
OPENAI_API_KEY=env://OPENAI_API_KEY
OPENAI_BASE_URL=env://OPENAI_BASE_URL
```

The `env://VAR` placeholders tell CUGA to read the actual values from its own environment at runtime.
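A minimal sketch of that indirection (illustrative; CUGA's real resolver may handle missing variables or other prefixes differently):

```python
import os

def resolve_env_placeholder(value: str) -> str:
    """Resolve an `env://VAR` placeholder against the current environment.

    Sketch of the documented behaviour; not CUGA's actual implementation.
    """
    prefix = "env://"
    if value.startswith(prefix):
        return os.environ[value[len(prefix):]]
    # Plain values (e.g. "postgres") pass through unchanged.
    return value

os.environ["OPENAI_API_KEY"] = "sk-demo"
print(resolve_env_placeholder("env://OPENAI_API_KEY"))  # sk-demo
print(resolve_env_placeholder("postgres"))              # postgres
```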

In `settings.toml`, leave `mode = "auto"` (or set `mode = "registry"`) and set `app_name = "evolve"`.
</Tab>
<Tab value="Standalone SSE server">
Run Evolve yourself as an SSE server (useful for debugging):

```bash
# From a checkout of altk-evolve:
uv sync --extra pgvector
evolve-mcp --transport sse --port 8201
```

In `settings.toml`:

```toml
[evolve]
enabled = true
url = "http://127.0.0.1:8201/sse"
mode = "direct"
```

`mode = "direct"` skips registry lookup entirely.
</Tab>
</Tabs>

## Enable in `settings.toml`

```toml
[advanced_features]
lite_mode = true # Evolve only runs for CugaLite

[evolve]
enabled = true
url = "http://127.0.0.1:8201/sse"
mode = "auto"
app_name = "evolve"
lite_mode_only = true
save_on_success = true
save_on_failure = true
async_save = true
timeout = 30.0
```

## Try it

```bash
cuga start demo_crm --sample-memory-data
```

Then run a CugaLite task, e.g.:

```
Identify the common cities between my cuga_workspace/cities.txt and cuga_workspace/company.txt
```

## What happens during a run

1. CUGA derives a task description from the current sub-task (or the first user message).
2. CugaLite asks Evolve for relevant guidelines.
3. Returned guidelines are appended to the system prompt under an `Evolve Guidelines` section.
4. The task executes normally.
5. The user/assistant trajectory is saved back to Evolve after completion.
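Steps 2–3 amount to appending a guidelines section to the system prompt. A sketch under the assumption that guidelines arrive as plain strings (the exact section formatting inside CUGA may differ):

```python
def build_system_prompt(base_prompt: str, guidelines: list[str]) -> str:
    """Append Evolve guidelines to the system prompt, mirroring steps 2-3 above.

    Illustrative sketch; CUGA's actual section formatting may differ.
    """
    if not guidelines:
        # Evolve returning no guidance is normal -- the prompt is left untouched.
        return base_prompt
    section = "\n".join(f"- {g}" for g in guidelines)
    return f"{base_prompt}\n\n## Evolve Guidelines\n{section}"

prompt = build_system_prompt(
    "You are CugaLite.",
    ["Prefer absolute paths when reading files.", "Verify row counts before joining."],
)
print("Evolve Guidelines" in prompt)  # True
```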

## Tuning

| Setting | Effect |
|---------|--------|
| `async_save = true` | Save trajectories in the background without blocking the response. |
| `save_on_success = true` | Persist trajectories from successful runs. |
| `save_on_failure = true` | Persist trajectories from failed runs. |
| `mode = "auto"` | Try registry first, fall back to direct SSE. |
| `mode = "registry"` | Force registry-managed Evolve. |
| `mode = "direct"` | Skip registry lookup; use `url`. |
| `lite_mode_only = true` | Disable Evolve for non-lite paths. |

<Callout type="info">
If Evolve is unavailable, times out, or returns no guidance, CUGA continues normally — Evolve never blocks task execution.
</Callout>

<Callout type="warning">
If you use Evolve's tip generation, make sure the Evolve MCP server's environment includes the required model settings (e.g. `EVOLVE_MODEL_NAME`, OpenAI/LiteLLM credentials). Otherwise `save_trajectory` may fail later with a model-access error even if the MCP connection itself works.
</Callout>