This guide explains how to create the environment for running the Llama Stack Demo workshop. For full installation details, see README.md.
- OpenShift 4.20+ (tested on 4.20)
- Red Hat OpenShift AI 3.2+ (includes Llama Stack Operator, OpenShift Service Mesh, OpenShift Serverless)
You must have cluster-admin privileges. According to the Red Hat OpenShift AI 3.2 documentation, cluster administrator access is required to install Red Hat OpenShift AI, including the operators it depends on, and to manage its components.
Use workshop-setup.sh to create the full workshop environment: users (via htpasswd file), projects, group permissions, and node assignments.
1. Generates htpasswd file: Always runs setup-htpasswd-oauth.sh in dry-run mode. Writes the htpasswd file (default: htpasswd.workshop) and prints instructions for the Administrator to apply it manually to the cluster OAuth.
2. Creates projects: llama-stack-demo-user1, llama-stack-demo-user2, ... with labels modelmesh-enabled=false and opendatahub.io/dashboard=true.
3. Creates group and permissions: Creates group workshop with users user1..userN, grants each user admin and ServiceMonitor access on their project (idempotent; safe to re-run).
4. Sets up monitoring and cluster resources: Runs setup-monitoring.sh (Tempo, OTel, DSCI), setup-hardware-profile.sh (HardwareProfile in redhat-ods-applications), setup-rbac.sh (configmap-patcher ClusterRole/Role per namespace), and setup-grafana-proxy-rbac.sh (Grafana proxy RBAC per namespace). This provisions the telemetry layer (metrics/traces) only. Grafana dashboards are optional and off by default (monitoring.enable: false) since they need the community Grafana Operator; opt in per deploy with --set monitoring.enable=true. (Documented forward path: Perses via the Cluster Observability Operator, planned separately.)
5. Assigns nodes: Runs assign-nodes-to-users.sh to label one node per user (unless --no-assign is passed).
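As a quick way to see which users and projects a run would produce, the naming scheme described above can be sketched in a few lines of shell. The function name workshop_names is hypothetical and not part of the repository, and the sketch assumes CUSTOM_PROJECT is used as the project name prefix, which the defaults suggest but the script itself should confirm.

```shell
# Hypothetical helper, not part of the repository: prints "userN <project>"
# pairs following the naming scheme described above.
# Assumption: CUSTOM_PROJECT acts as the project name prefix.
workshop_names() {
  local count="$1"
  local prefix="${CUSTOM_PROJECT:-llama-stack-demo}"  # documented default
  local i=1
  while [ "$i" -le "$count" ]; do
    printf 'user%d %s-user%d\n' "$i" "$prefix" "$i"
    i=$((i + 1))
  done
}

# Example: the pairs expected for a three-user workshop.
workshop_names 3
```

Comparing this list with the output of a --dry-run pass is a cheap sanity check before creating anything on the cluster.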
    ./scripts/workshop-setup.sh [--dry-run] [--no-assign] <number_of_users> [password]

- number_of_users: Number of users (user1..userN) and projects to create
- password: Optional. If omitted, a random password is generated and shown in the output
- --dry-run: Preview all actions without making any changes
- --no-assign: Skip node assignment (useful when nodes are pre-assigned or not needed)
- CUSTOM_PROJECT: Optional env var (default: llama-stack-demo)
- HTPASSWD_OUTPUT: Optional env var for htpasswd file path (default: htpasswd.workshop in repo root)
- INSTANCE_TYPE: Optional env var for node assignment (default: g5.2xlarge); see the instance type notes below
The script always generates the htpasswd file in dry-run mode. You must apply it manually to configure OAuth:
- During Step 1 of workshop-setup.sh, the htpasswd file is written to htpasswd.workshop in the repository root (or the path in HTPASSWD_OUTPUT).
- Follow the instructions printed by the script. Either:
  - Option A (manual): Create the secret and update OAuth:

        oc create secret generic htpasswd-secret --from-file=htpasswd=htpasswd.workshop -n openshift-config --dry-run=client -o yaml | oc apply -f -
        oc edit oauth cluster   # Add HTPasswd identity provider with htpasswd.fileData.name: htpasswd-secret

  - Option B (automatic): Run setup-htpasswd-oauth.sh without dry-run, using the password from the workshop-setup output if one was generated:

        ./scripts/setup-htpasswd-oauth.sh <number_of_users> <password>
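For the manual route, the identity provider entry added through oc edit can also be expressed as a merge patch. The following is a minimal sketch, not the method used by setup-htpasswd-oauth.sh: the function name htpasswd_oauth_patch and the provider name workshop-htpasswd are assumptions made for illustration, while the field layout follows the OpenShift OAuth resource (spec.identityProviders with an htpasswd.fileData.name reference).

```shell
# Hypothetical helper: prints a JSON merge patch that points the cluster OAuth
# at an HTPasswd secret. The provider name is an assumption; pick your own.
htpasswd_oauth_patch() {
  local secret="${1:-htpasswd-secret}"
  local provider="${2:-workshop-htpasswd}"
  printf '{"spec":{"identityProviders":[{"name":"%s","mappingMethod":"claim","type":"HTPasswd","htpasswd":{"fileData":{"name":"%s"}}}]}}\n' \
    "$provider" "$secret"
}

# Print the patch for review before applying anything.
htpasswd_oauth_patch

# To apply it (commented out on purpose). A merge patch replaces the whole
# identityProviders list, so only use this when no other providers exist:
#   oc patch oauth cluster --type merge -p "$(htpasswd_oauth_patch)"
```

Because the patch overwrites any existing identity providers, oc edit remains the safer choice on clusters that already have other login methods configured.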
Use --dry-run to preview all actions without making any changes:
    ./scripts/workshop-setup.sh --dry-run 5

What dry-run does:
- Step 1: Generates the htpasswd file and prints Administrator instructions — no secret or OAuth changes
- Steps 2–5: Skipped (no projects, group, monitoring setup, or node assignment)
After dry-run:
- Review the output to confirm user count, password, and project names
- Run without --dry-run to create projects, group, and assign nodes:

      ./scripts/workshop-setup.sh 5 mypassword
- Apply the htpasswd file using the instructions from Step 1 (see above)
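Part of reviewing the dry-run output is confirming that the generated htpasswd file holds the expected users. A minimal sketch of such a check follows; check_htpasswd_users is a hypothetical helper, and it assumes only the standard htpasswd layout of one user:hash entry per line with users named user1..userN.

```shell
# Hypothetical check: verifies that an htpasswd file contains exactly the
# entries user1..userN (standard "user:hash" lines are assumed).
check_htpasswd_users() {
  local file="$1" count="$2" i=1
  while [ "$i" -le "$count" ]; do
    grep -q "^user${i}:" "$file" || { echo "missing user${i}" >&2; return 1; }
    i=$((i + 1))
  done
  # Reject files that carry more userN entries than expected.
  [ "$(grep -c '^user[0-9][0-9]*:' "$file")" -eq "$count" ] \
    || { echo "unexpected number of user entries" >&2; return 1; }
  echo "ok: ${count} users in ${file}"
}

# Demo against a throwaway file with placeholder hashes (not real credentials).
demo_file="$(mktemp)"
printf 'user1:placeholder\nuser2:placeholder\n' > "$demo_file"
check_htpasswd_users "$demo_file" 2
rm -f "$demo_file"
```

Against a real run, the call would look like check_htpasswd_users htpasswd.workshop 5, or use the path set in HTPASSWD_OUTPUT.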
If you do not need GPU node assignment (e.g. nodes are pre-configured or using a different setup):
    ./scripts/workshop-setup.sh --no-assign 5

When assigning nodes, the script filters nodes by Kubernetes instance type (e.g. the node.kubernetes.io/instance-type label). The default is g5.2xlarge (AWS GPU instance). If your cluster uses different GPU or instance types, set INSTANCE_TYPE before running:

    export INSTANCE_TYPE="g5.2xlarge" # default; AWS NVIDIA GPU
    ./scripts/workshop-setup.sh 5

Or when running assign-nodes-to-users.sh directly:

    ./scripts/assign-nodes-to-users.sh 5 g5.2xlarge

Common values: g5.2xlarge (AWS), n1-standard-4 (GCP), Standard_NC4as_T4_v3 (Azure).
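Before assigning nodes, it can help to confirm that the cluster has at least one matching node per user. The sketch below is an illustration under assumptions: the helper names are hypothetical, and it assumes the filter uses the node.kubernetes.io/instance-type label mentioned above as an example. The actual selection logic lives in assign-nodes-to-users.sh.

```shell
# Hypothetical helper: builds the label selector for the instance type filter.
# Falls back to INSTANCE_TYPE, then to the documented default g5.2xlarge.
instance_type_selector() {
  printf 'node.kubernetes.io/instance-type=%s\n' "${1:-${INSTANCE_TYPE:-g5.2xlarge}}"
}

# Hypothetical helper: succeeds when there is at least one node per user.
nodes_cover_users() {
  local available="$1" users="$2"
  [ "$available" -ge "$users" ]
}

# Print the selector that would be used with the current environment.
instance_type_selector

# Against a live cluster (commented out; requires oc and a login):
#   available="$(oc get nodes -l "$(instance_type_selector)" -o name | grep -c .)"
#   nodes_cover_users "$available" 5 || echo "not enough matching nodes" >&2
```

Passing an explicit type, for example instance_type_selector Standard_NC4as_T4_v3, overrides both the environment variable and the default.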
Steps 2–5 are idempotent. You can re-run the script (without --dry-run) and it will:
- Update labels on existing projects
- Add users to the group (no duplicates)
- Re-apply admin role bindings
- Assign nodes only for users that do not yet have one
After workshop setup:
- Each user logs in with user1..userN and the provided password.
- Each user installs the demo in their project:

      PROJECT="llama-stack-demo-user1" # replace with userN
      helm install llama-stack-demo helm/ -f helm/values.yaml --set assigned="${PROJECT}" --namespace "${PROJECT}" --timeout 20m
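For a handout, a facilitator may want the exact install command for each participant. The sketch below only prints the commands and runs nothing; helm_install_cmd is a hypothetical helper, and it assumes project names follow the default llama-stack-demo-userN pattern (or the CUSTOM_PROJECT prefix, if that is how the script names projects).

```shell
# Hypothetical helper: prints the helm install command for user number N.
# Assumption: the project is named <prefix>-userN.
helm_install_cmd() {
  local project="${CUSTOM_PROJECT:-llama-stack-demo}-user$1"
  printf 'helm install llama-stack-demo helm/ -f helm/values.yaml --set assigned="%s" --namespace %s --timeout 20m\n' \
    "$project" "$project"
}

# Print the commands for a three-user workshop.
i=1
while [ "$i" -le 3 ]; do
  helm_install_cmd "$i"
  i=$((i + 1))
done
```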
For GPU node assignment details, node labels, and other configuration, see README.md.
On GPU workers, pull the model and vLLM images:

    ./scripts/pull-image-on-assigned-gpu-nodes.sh \
      registry.redhat.io/rhelai1/modelcar-qwen3-8b-fp8-dynamic:1.5 \
      registry.redhat.io/rhaiis/vllm-cuda-rhel9@sha256:ec799bb5eeb7e25b4b25a8917ab5161da6b6f1ab830cbba61bba371cffb0c34d

On workers, pull the images used in pipelines:

    ./scripts/pull-image-on-assigned-gpu-nodes.sh quay.io/modh/odh-pipeline-runtime-pytorch-cuda-py312-ubi9@sha256:72ff2381e5cb24d6f549534cb74309ed30e92c1ca80214669adb78ad30c5ae12 --label node.kubernetes.io/instance-type=m7i.2xlarge,node-role.kubernetes.io/worker --parallel 8