applied around Large Language Model (LLM) inference.
Their purpose is to ensure that LLM usage is **safe, auditable, and predictable**
in enterprise environments.

Guardrails do not change how models reason or generate outputs — they control
**how models are accessed, executed, and isolated** within the platform.

In Cube AI, a *domain* represents an isolated workspace that groups users,
permissions, configuration, and available models. Guardrails are applied
**per domain**, ensuring strong isolation and access control between workspaces.

---

Cube AI does **not**:

- train models
- fine-tune models
- rewrite prompts
- alter model outputs for ethical or policy reasons

---

## What Guardrails Do

![Cube AI guardrails overview](/img/cube-ai-guardrails.png)

This diagram illustrates how Cube AI guardrails enforce security and isolation
at the platform level without interfering with application logic or model behavior.

Cube AI guardrails provide:

### Authentication & Authorization

- token-based access control (PATs and API tokens)
- domain-scoped permissions
- per-domain model visibility
- enforcement of role-based access control (RBAC)
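
For illustration, a domain-scoped, role-based permission check could be sketched as follows. The token shape, role names, and permission strings here are assumptions for the example, not Cube AI's actual schema:

```python
# Hypothetical sketch of domain-scoped RBAC. Roles, permission names,
# and the token structure are illustrative assumptions.

ROLE_PERMISSIONS = {
    "admin": {"models:list", "models:invoke", "domain:configure"},
    "member": {"models:list", "models:invoke"},
    "viewer": {"models:list"},
}

def is_authorized(token: dict, domain_id: str, permission: str) -> bool:
    """Grant access only if the token is scoped to this domain AND
    its role carries the requested permission."""
    if token.get("domain_id") != domain_id:  # domain scoping comes first
        return False
    role = token.get("role", "")
    return permission in ROLE_PERMISSIONS.get(role, set())

token = {"domain_id": "acme", "role": "member"}
print(is_authorized(token, "acme", "models:invoke"))     # True
print(is_authorized(token, "globex", "models:invoke"))   # False: wrong domain
print(is_authorized(token, "acme", "domain:configure"))  # False: role too weak
```

The key property is that the domain check runs before any role lookup: a token valid in one workspace grants nothing anywhere else.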

### Domain Isolation

- strict separation between domains
- no data or model leakage across domains
- isolated execution contexts
- domain-scoped configuration and model exposure

### Request Validation

- validation of incoming API requests
- enforcement of API contracts
- rejection of malformed or unauthorized calls
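
A contract check of this kind could look like the sketch below, using a simplified chat-completions payload (`model` plus a `messages` list); the exact fields Cube AI enforces may differ:

```python
# Hypothetical request-contract validation. The required fields mirror a
# typical chat-completions payload and are assumptions for this example.

def validate_chat_request(payload: dict) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    if not isinstance(payload.get("model"), str) or not payload["model"]:
        errors.append("'model' must be a non-empty string")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        errors.append("'messages' must be a non-empty list")
    else:
        for i, msg in enumerate(messages):
            if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                errors.append(f"messages[{i}] must have 'role' and 'content'")
    return errors

ok = {"model": "llama-3-8b", "messages": [{"role": "user", "content": "hi"}]}
validate_chat_request(ok)               # []
validate_chat_request({"messages": []}) # two violations, request rejected
```

Requests that fail validation are rejected before they ever reach a model backend.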

### Model Access Control

- control over which models are available per domain
- backend-specific model exposure (e.g., vLLM, Ollama)
- prevention of unauthorized model usage
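
Conceptually, per-domain model visibility amounts to resolving every model reference through a domain-scoped registry. The registry shape and model names below are illustrative assumptions:

```python
# Illustrative per-domain model registry. Model IDs, backends, and the
# registry structure are assumptions, not Cube AI's actual schema.

DOMAIN_MODELS = {
    "acme":   [{"id": "llama-3-8b", "backend": "vllm"}],
    "globex": [{"id": "mistral-7b", "backend": "ollama"},
               {"id": "llama-3-8b", "backend": "vllm"}],
}

def visible_models(domain_id: str) -> list[str]:
    """List only the models explicitly exposed to this domain."""
    return [m["id"] for m in DOMAIN_MODELS.get(domain_id, [])]

def resolve_backend(domain_id: str, model_id: str) -> str:
    """Reject any model not exposed to the caller's domain."""
    for m in DOMAIN_MODELS.get(domain_id, []):
        if m["id"] == model_id:
            return m["backend"]
    raise PermissionError(f"model {model_id!r} not available in {domain_id!r}")

visible_models("acme")               # ['llama-3-8b']
resolve_backend("globex", "mistral-7b")  # 'ollama'
```

A request for a model outside the domain's registry fails at resolution time, before any backend is contacted.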

### Secure Execution (TEE)

When configured, model inference can execute inside
**Trusted Execution Environments (TEEs)**.

This provides:

- confidential execution of prompts and responses
- runtime memory isolation
- attested secure connections between platform components
- verifiable execution integrity
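
The attestation step can be reduced to a simple idea: a connection is accepted only if the enclave proves it is running expected code. The sketch below checks a reported measurement against a known-good value; real attestation additionally verifies the report's signature, the hardware certificate chain, and freshness (nonces). The report format and measurement value are illustrative assumptions:

```python
import hmac

# Known-good enclave measurement (illustrative placeholder value).
EXPECTED_MEASUREMENT = "9f2ce6b1a47d3e08"

def verify_attestation(report: dict) -> bool:
    """Accept a peer only if its attestation report carries the expected
    measurement. Uses a constant-time comparison to avoid timing leaks.
    Real verification also checks signatures, cert chains, and nonces."""
    measurement = report.get("measurement", "")
    return hmac.compare_digest(measurement, EXPECTED_MEASUREMENT)

verify_attestation({"measurement": EXPECTED_MEASUREMENT})  # True
verify_attestation({"measurement": "deadbeef"})            # False
```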

### Auditing & Observability

- request metadata logging
- recording of security-relevant events
- traceability of inference calls
- integration with audit logs when enabled
- support for compliance and forensic analysis
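
An audit record for an inference call might look like the sketch below. The field names are assumptions; the important property, consistent with the guardrail model above, is that only request *metadata* is logged, never prompt or response content:

```python
import json
import time
import uuid

def audit_record(domain_id: str, model_id: str, status: str) -> str:
    """Emit a JSON audit line with metadata only -- no prompts, no outputs."""
    record = {
        "event_id": str(uuid.uuid4()),   # unique ID for traceability
        "timestamp": time.time(),
        "domain": domain_id,
        "model": model_id,
        "status": status,                # e.g. "accepted" or "rejected"
    }
    return json.dumps(record)

audit_record("acme", "llama-3-8b", "accepted")
```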

---

To avoid confusion, Cube AI guardrails do **not**:

- rewrite or sanitize prompts
- filter or censor model outputs
- implement AI ethics or content moderation policies
- perform alignment tuning
- replace application-level safety logic

Guardrails ensure **platform-level safety and isolation**, not application behavior.

---

Without guardrails, LLM deployments risk:
- unauthorized access
- data leakage between tenants
- untraceable model usage
- insecure execution environments
- unpredictable production behavior

Cube AI guardrails make LLM usage suitable for:

- enterprise deployments
- multi-tenant environments
- regulated industries
- confidential and sensitive workloads

---

## Relationship to Applications

Guardrails complement — but do not replace — application-level controls.

Applications remain responsible for:

- prompt design
- output validation
- business logic enforcement
- user-facing safety mechanisms
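
Because the platform never inspects content, output validation belongs in the application. A minimal sketch, assuming the app expects the model to return a JSON object:

```python
import json

def parse_model_json(raw: str) -> dict:
    """Application-level output validation: the guardrails above do not
    inspect content, so the app itself must verify the model's response
    matches the format it expects (here, a JSON object)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        raise ValueError(f"model output is not valid JSON: {e}") from e
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data

parse_model_json('{"answer": 42}')  # {'answer': 42}
```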

Cube AI ensures the **infrastructure layer remains secure, isolated, and auditable**.

---

## Next Steps

- Learn how Cube AI executes models using [vLLM](./vllm)
- Explore available [Models](./api/models)
- Use [Chat Completions](./api/chat-completions) with guardrails enabled
- Review [Audit Logs](./security/audit-logs) for execution traceability