Commit 20e9324

Fixes
1 parent 7f38ec7 commit 20e9324

1 file changed

Lines changed: 23 additions & 31 deletions

scenarios/AksKaito/README.md

@@ -79,10 +79,19 @@ It takes a few minutes for the registration to complete.
 az feature register --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview"
 ```
 
+## Verify the AI toolchain operator add-on registration
+
 Verify the registration using the [az feature show][az-feature-show] command.
 
 ```bash
-az feature show --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview"
+while true; do
+status=$(az feature show --namespace "Microsoft.ContainerService" --name "AIToolchainOperatorPreview" --query "properties.state" -o tsv)
+if [ "$status" == "Registered" ]; then
+break
+else
+sleep 30
+fi
+done
 ```
 
 ## Create an AKS cluster with the AI toolchain operator add-on enabled
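The `while true` loop added in the hunk above keeps polling forever if the feature never reaches `Registered` (for example, if the `az` call fails or the state stays at something else). A minimal sketch of a bounded variant follows. The helper name `wait_for_state` and its argument order are hypothetical and not part of the commit; only the `az feature show` invocation in the usage comment comes from the README.

```bash
# Hypothetical helper (not part of the commit): run a command repeatedly until
# it prints the expected value, or give up after a fixed number of attempts.
# usage: wait_for_state <expected> <max_attempts> <sleep_seconds> <command...>
wait_for_state() {
    local expected="$1" max_attempts="$2" delay="$3"
    shift 3
    local attempt=1 status=""
    while [ "$attempt" -le "$max_attempts" ]; do
        # Run the caller-supplied command; a failing command counts as an empty status.
        status="$("$@" 2>/dev/null)" || status=""
        if [ "$status" = "$expected" ]; then
            return 0
        fi
        # Sleep between attempts, but not after the final one.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    echo "Timed out waiting for '$expected' (last value: '$status')" >&2
    return 1
}

# Example (assumed limits: 40 attempts, 30 seconds apart):
# wait_for_state Registered 40 30 \
#     az feature show --namespace "Microsoft.ContainerService" \
#     --name "AIToolchainOperatorPreview" --query "properties.state" -o tsv
```

With a bounded loop, an automated run of the README fails with a clear message instead of hanging; the attempt count and delay are placeholders to tune.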
@@ -106,6 +115,8 @@ az aks create --location ${REGION} \
 --k8s-support-plan KubernetesOfficial
 ```
 
+## Enable AI toolchain operator for cluster
+
 On an existing AKS cluster, you can enable the AI toolchain operator add-on using the [az aks update][az-aks-update] command.
 
 ```bash
@@ -171,47 +182,28 @@ az identity federated-credential create --name "kaito-federated-identity" \
 
 ## Verify that your deployment is running
 
-1. Restart the KAITO GPU provisioner deployment on your pods using the `kubectl rollout restart` command:
-
-```bash
-kubectl rollout restart deployment/kaito-gpu-provisioner -n kube-system
-```
-
-2. Verify that the deployment is running using the `kubectl get` command:
+Restart the KAITO GPU provisioner deployment on your pods using the `kubectl rollout restart` command:
 
-```bash
-kubectl get deployment -n kube-system | grep kaito
-```
+```bash
+kubectl rollout restart deployment/kaito-gpu-provisioner -n kube-system
+```
 
 ## Deploy a default hosted AI model
 
-1. Deploy the Falcon 7B-instruct model from the KAITO model repository using the `kubectl apply` command.
-
-```bash
-kubectl apply -f https://raw.githubusercontent.com/Azure/kaito/main/examples/inference/kaito_workspace_falcon_7b-instruct.yaml
-```
+Deploy the Falcon 7B-instruct model from the KAITO model repository using the `kubectl apply` command.
 
-2. Track the live resource changes in your workspace using the `kubectl get` command.
-
-```bash
-kubectl get workspace workspace-falcon-7b-instruct -w
-```
-
-> [!NOTE]
-> As you track the live resource changes in your workspace, note that machine readiness can take up to 10 minutes, and workspace readiness up to 20 minutes.
-
-3. Check your service and get the service IP address using the `kubectl get svc` command.
-
-```bash
-export SERVICE_IP=$(kubectl get svc workspace-falcon-7b-instruct -o jsonpath='{.spec.clusterIP}')
-```
+```bash
+kubectl apply -f https://raw.githubusercontent.com/Azure/kaito/main/examples/inference/kaito_workspace_falcon_7b-instruct.yaml
+```
 
 ## Ask a question
 
 Run the Falcon 7B-instruct model with a sample input of your choice using the following `curl` command:
+`kubectl get workspace workspace-falcon-7b-instruct -w`. Store IP: `export SERVICE_IP=$(kubectl get svc workspace-falcon-7b-instruct -o jsonpath='{.spec.clusterIP}')`.
+Ask question: `kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X POST http://$SERVICE_IP/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"`
 
 ```bash
-kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- curl -X POST http://$SERVICE_IP/chat -H "accept: application/json" -H "Content-Type: application/json" -d "{\"prompt\":\"YOUR QUESTION HERE\"}"
+echo "See last step for details on how to ask questions to the model."
 ```
 
 ## Next steps
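
The `curl` command in the last hunk builds its request body by hand-escaping quotes (`"{\"prompt\":\"YOUR QUESTION HERE\"}"`), which breaks as soon as the question itself contains a double quote or a backslash. A minimal sketch of a safer way to build that body follows. The helper name `build_prompt_payload` is hypothetical and not part of the README; it only handles backslashes and double quotes, not newlines or other control characters.

```bash
# Hypothetical helper (not part of the commit): wrap a question in the
# {"prompt":"..."} JSON body that the README's curl command sends to /chat.
build_prompt_payload() {
    local escaped
    # Escape backslashes first, then double quotes, so the string stays valid JSON.
    # Newlines and other control characters are NOT handled in this sketch.
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"prompt":"%s"}' "$escaped"
}

# Example: pass the generated body to the curl invocation from the README.
# kubectl run -it --rm --restart=Never curl --image=curlimages/curl -- \
#     curl -X POST "http://$SERVICE_IP/chat" \
#     -H "accept: application/json" -H "Content-Type: application/json" \
#     -d "$(build_prompt_payload 'What is "KAITO"?')"
```

For arbitrary input, a JSON-aware tool such as `jq` is the more robust choice; the `sed` version is shown only because it needs nothing beyond a POSIX shell.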