diff --git a/docs/guardrails.md b/docs/guardrails.md
index 72ec6f2..84decf6 100644
--- a/docs/guardrails.md
+++ b/docs/guardrails.md
@@ -9,13 +9,12 @@ applied around Large Language Model (LLM) inference. Their purpose is to
 ensure that LLM usage is **safe, auditable, and predictable** in enterprise
 environments.
 
-Guardrails are not about changing how models think — they are about controlling
-**how models are accessed and used**.
+Guardrails do not change how models reason or generate outputs — they control
+**how models are accessed, executed, and isolated** within the platform.
 
 In Cube AI, a *domain* represents an isolated workspace that groups users,
 permissions, configuration, and available models. Guardrails are applied
-**per domain**, ensuring strong isolation and access control between different
-workspaces.
+**per domain**, ensuring strong isolation and access control between workspaces.
 
 ---
 
@@ -31,6 +30,7 @@ Cube AI does **not**:
 
 - train models
 - fine-tune models
+- rewrite prompts
 - alter model outputs for ethical or policy reasons
 
 ---
 
@@ -38,46 +38,56 @@ Cube AI does **not**:
 ## What Guardrails Do
 
 ![Cube AI guardrails overview](/img/cube-ai-guardrails.png)
-This diagram shows how Cube AI guardrails enforce security and isolation at the
-platform level without interfering with application logic or model behavior.
+
+This diagram illustrates how Cube AI guardrails enforce security and isolation
+at the platform level without interfering with application logic or model behavior.
 
 Cube AI guardrails provide:
 
 ### Authentication & Authorization
 
-- token-based access control
-- domain-scoped permissions (workspace-level access control)
+- token-based access control (PATs and API tokens)
+- domain-scoped permissions
 - per-domain model visibility
+- enforcement of role-based access control (RBAC)
 
 ### Domain Isolation
 
 - strict separation between domains
 - no data or model leakage across domains
 - isolated execution contexts
+- domain-scoped configuration and model exposure
 
 ### Request Validation
 
-- validation of incoming requests
+- validation of incoming API requests
 - enforcement of API contracts
 - rejection of malformed or unauthorized calls
 
 ### Model Access Control
 
-- control which models are available per domain
-- backend-specific model exposure
+- control over which models are available per domain
+- backend-specific model exposure (e.g., vLLM, Ollama)
 - prevention of unauthorized model usage
 
 ### Secure Execution (TEE)
 
-- all inference runs inside **Trusted Execution Environments**
-- prompts and outputs remain confidential
-- memory isolation during execution
+When configured, model inference can execute inside
+**Trusted Execution Environments (TEEs)**.
+
+This provides:
+
+- confidential execution of prompts and responses
+- runtime memory isolation
+- attested secure connections between platform components
+- verifiable execution integrity
 
-### Auditing & Observability (Optional)
+### Auditing & Observability
 
-- request metadata logging
+- recording of security-relevant events
 - traceability of inference calls
-- integration with audit pipelines when enabled
+- integration with audit logs when enabled
+- support for compliance and forensic analysis
 
 ---
 
@@ -87,11 +97,11 @@ To avoid confusion, Cube AI guardrails do **not**:
 
 - rewrite or sanitize prompts
 - filter or censor model outputs
-- implement AI ethics policies
-- perform content moderation or alignment
+- implement AI ethics or content moderation policies
+- perform alignment tuning
 - replace application-level safety logic
 
-Guardrails ensure **platform safety**, not application behavior.
+Guardrails ensure **platform-level safety and isolation**, not application behavior.
 
 ---
 
@@ -102,14 +112,15 @@ Without guardrails, LLM deployments risk:
 - unauthorized access
 - data leakage between tenants
 - untraceable model usage
-- unpredictable behavior in production
+- insecure execution environments
+- unpredictable production behavior
 
 Cube AI guardrails make LLM usage suitable for:
 
 - enterprise deployments
 - multi-tenant environments
 - regulated industries
-- confidential workloads
+- confidential and sensitive workloads
 
 ---
 
@@ -117,19 +128,20 @@ Cube AI guardrails make LLM usage suitable for:
 
 Guardrails complement — but do not replace — application-level controls.
 
-Applications are still responsible for:
+Applications remain responsible for:
 
 - prompt design
 - output validation
 - business logic enforcement
 - user-facing safety mechanisms
 
-Cube AI ensures the **infrastructure layer is secure and controlled**.
+Cube AI ensures the **infrastructure layer remains secure, isolated, and auditable**.
 
 ---
 
 ## Next Steps
 
-- Learn how Cube AI executes models using **vLLM**
-- Explore **Models** to understand backend configuration
-- Use **Chat Completions** with guardrails enabled
+- Learn how Cube AI executes models using [vLLM](./vllm)
+- Explore available [Models](./api/models)
+- Use [Chat Completions](./api/chat-completions) with guardrails enabled
+- Review [Audit Logs](./security/audit-logs) for execution traceability