3 changes: 0 additions & 3 deletions .github/workflows/manual-trigger-job.yml
@@ -15,7 +15,4 @@ jobs:
uses: azure/login@v2
with:
creds: ${{secrets.AZURE_CREDENTIALS}}
- name: Run Azure Machine Learning training job
run: az ml job create -f src/job.yml --stream --resource-group ${{vars.AZURE_RESOURCE_GROUP}} --workspace-name ${{vars.AZURE_WORKSPACE_NAME}}


8 changes: 1 addition & 7 deletions .github/workflows/train-dev.yml
@@ -1,13 +1,7 @@
name: Train model in dev (PR)
name: Train model in dev

on:
workflow_dispatch:
pull_request:
branches:
- main
paths:
- 'src/train-model-parameters.py'
- 'src/job.yml'

permissions:
contents: read
220 changes: 119 additions & 101 deletions docs/05-plan-and-prepare.md
@@ -32,34 +32,34 @@ You can manually create necessary resources and assets to work with Azure Machin
1. Check that the correct subscription is specified and that **No storage account required** is selected. Select **Apply**.
1. In the terminal, enter the following commands to clone this repo:

    ```azurecli
    rm -r mslearn-mlops -f
    git clone https://github.com/MicrosoftLearning/mslearn-mlops.git mslearn-mlops
    ```

    > Use `SHIFT + INSERT` to paste your copied code into the Cloud Shell.

1. After the repo has been cloned, enter the following commands to change to the `infra` folder and open the setup script:

    ```azurecli
    cd mslearn-mlops/infra
    code setup.sh
    ```

    > [!NOTE]
    > If the `code` command is not available, you are in the new Cloud Shell experience. Switch to Classic Cloud Shell by selecting **Switch to Classic Cloud Shell** in the toolbar and selecting **Confirm**. Then run the commands again.

1. Review the script and identify the resources that are created for your current **development** environment:
    - A resource group with a randomized suffix, for example `rg-ai300-l...`.
    - An Azure Machine Learning workspace, for example `mlw-ai300-l...`.
    - A compute instance for interactive work.
    - A compute cluster for training jobs.
    - Data assets for the diabetes training data in the `data/diabetes-data` folder.

1. Note how the script:
    - Generates a random suffix to avoid name collisions.
    - Registers the **Microsoft.MachineLearningServices** resource provider.
    - Sets default values for the resource group and workspace so subsequent `az ml` commands use them automatically.

By understanding what this script does for development, you're ready to think about what you would change or add for production.
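The suffix logic can be tried on its own in any Bash shell on Linux (an illustrative sketch; the printed name is an example, not a real resource):

```shell
# Illustrative: build an 18-character suffix the same way the script does
guid=$(cat /proc/sys/kernel/random/uuid)   # e.g. 123e4567-e89b-...
suffix=${guid//[-]/}                       # remove the dashes
suffix=${suffix:0:18}                      # keep the first 18 characters
echo "rg-ai300-l-${suffix}"
```

Because a UUID has 32 hexadecimal characters once the dashes are removed, the truncated suffix is always exactly 18 characters long.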

@@ -96,75 +96,93 @@ Next, you map your target architecture to Azure CLI commands. Instead of running

1. In the Cloud Shell editor, create a new file based on the existing script so you can experiment safely:

    ```bash
    cp setup.sh setup-prod-design.sh
    code setup-prod-design.sh
    ```

1. At the top of the new file, add variables for both environments and the shared registry. For example:

    ```bash
    # Existing random suffix
    guid=$(cat /proc/sys/kernel/random/uuid)
    suffix=${guid//[-]/}
    suffix=${suffix:0:18}

    # Dev environment
    DEV_RESOURCE_GROUP="rg-ai300-dev-${suffix}"
    DEV_WORKSPACE_NAME="mlw-ai300-dev-${suffix}"

    # Prod environment
    PROD_RESOURCE_GROUP="rg-ai300-prod-${suffix}"
    PROD_WORKSPACE_NAME="mlw-ai300-prod-${suffix}"

    # Shared registry (one per subscription/region)
    REGISTRY_RESOURCE_GROUP="rg-ai300-reg-${suffix}"
    REGISTRY_NAME="mlr-ai300-shared-${suffix}"
    ```

1. In the `infra` folder, open `registry.yml` and review the values that define the shared registry. The Azure CLI reads this YAML file literally, so the Bash script needs to inject the dynamic registry name and primary region into the file before running the create command. In this lab, use placeholders in `registry.yml` like this:

    ```yml
    name: REGISTRY_NAME_PLACEHOLDER
    tags:
      description: Shared registry for approved machine learning assets across workspaces
    location: PRIMARY_REGION_PLACEHOLDER
    replication_locations:
      - location: PRIMARY_REGION_PLACEHOLDER
    ```

1. Plan the commands that would create the **shared registry** in its own resource group. For example:

    ```azurecli
    # Create a resource group for the shared registry
    az group create --name $REGISTRY_RESOURCE_GROUP --location $RANDOM_REGION

    # Render registry.yml with the dynamic values from the script
    sed \
      -e "s|REGISTRY_NAME_PLACEHOLDER|$REGISTRY_NAME|g" \
      -e "s|PRIMARY_REGION_PLACEHOLDER|$RANDOM_REGION|g" \
      registry.yml > registry.generated.yml

    # Create an Azure Machine Learning registry from the rendered YAML file
    az ml registry create \
      --file registry.generated.yml \
      --resource-group $REGISTRY_RESOURCE_GROUP
    ```
    The primary registry region appears twice in the YAML definition: once in `location` and again in `replication_locations`. Rendering the YAML from the Bash variables keeps those values consistent.
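    To see the rendering step in isolation, here's a minimal, self-contained sketch that writes a throwaway file with the same placeholders and substitutes both values (the file path and example values are assumptions for illustration):

    ```shell
    # Illustrative: create a tiny YAML file with the same placeholders
    printf '%s\n' \
      'name: REGISTRY_NAME_PLACEHOLDER' \
      'location: PRIMARY_REGION_PLACEHOLDER' \
      'replication_locations:' \
      '  - location: PRIMARY_REGION_PLACEHOLDER' \
      > /tmp/demo-registry.yml

    # Assumed example values standing in for the script's variables
    REGISTRY_NAME="mlr-demo-123"
    RANDOM_REGION="eastus2"

    # Substitute both placeholders, exactly as the setup script would
    sed \
      -e "s|REGISTRY_NAME_PLACEHOLDER|$REGISTRY_NAME|g" \
      -e "s|PRIMARY_REGION_PLACEHOLDER|$RANDOM_REGION|g" \
      /tmp/demo-registry.yml > /tmp/demo-registry.generated.yml

    cat /tmp/demo-registry.generated.yml
    ```

    Both occurrences of the region come from the single `$RANDOM_REGION` variable, which is what keeps the primary and replication locations in sync.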

1. Plan the commands that would create the **production** resource group and workspace. They follow the same pattern as the existing dev workspace, but use the prod names:

    ```azurecli
    # Create the production resource group
    az group create --name $PROD_RESOURCE_GROUP --location $RANDOM_REGION

    # Create the production Azure Machine Learning workspace
    az ml workspace create \
      --name $PROD_WORKSPACE_NAME \
      --resource-group $PROD_RESOURCE_GROUP \
      --location $RANDOM_REGION
    ```

1. Finally, plan the data assets that keep dev and prod data separated. Use the **dev** folder for experimentation and the **production** folder for production training:

    ```azurecli
    # In the dev workspace: data asset that points to experimentation data
    az configure --defaults group=$DEV_RESOURCE_GROUP workspace=$DEV_WORKSPACE_NAME
    az ml data create \
      --type uri_folder \
      --name diabetes-dev-folder \
      --path ../data/diabetes-data

    # In the prod workspace: data asset that points to production data
    az configure --defaults group=$PROD_RESOURCE_GROUP workspace=$PROD_WORKSPACE_NAME
    az ml data create \
      --type uri_folder \
      --name diabetes-prod-folder \
      --path ../production/data
    ```

> [!IMPORTANT]
> For this lab, you **don't need** to run the new commands that create extra resource groups and workspaces. Focus on understanding how you would structure the script so that dev and prod resources are clearly separated and production data stays out of the development environment. If you do want to run the script, follow the optional steps in the next section.
@@ -177,29 +195,29 @@ If you want to see your design in action, you can validate your script against a

1. In the Cloud Shell terminal, make sure you're in the `infra` folder:

    ```bash
    cd mslearn-mlops/infra
    ```

1. The repo includes a reference script `infra/setup-mlops-envs.sh` that shows what the complete script should look like. Compare it with your own `setup-prod-design.sh` to check your work:

    ```bash
    diff setup-prod-design.sh setup-mlops-envs.sh
    ```

    Review any differences and update your script if needed.

1. Once you're satisfied with your script, make it executable and run it:

    ```bash
    chmod +x setup-prod-design.sh
    ./setup-prod-design.sh
    ```

1. When the script completes, verify the resources in the Azure portal:
    - New resource groups for dev, prod, and the shared registry.
    - Separate workspaces for dev and prod.
    - Data assets `diabetes-dev-folder` and `diabetes-prod-folder` in the respective workspaces.

1. When you're done exploring, be sure to delete any extra resource groups you created so you don't continue to incur charges.
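    For example, assuming the variable names from your design script, the cleanup could look like this (a sketch; `--no-wait` returns immediately while deletion continues in the background):

    ```azurecli
    az group delete --name $PROD_RESOURCE_GROUP --yes --no-wait
    az group delete --name $REGISTRY_RESOURCE_GROUP --yes --no-wait
    ```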

@@ -209,29 +227,29 @@ In a real MLOps project, you want a single automation entry point that can provi

1. Decide how you would pass the **target environment** into the script. For example, you could accept a parameter such as `dev` or `prod`:

    ```bash
    ENVIRONMENT=${1:-dev}
    ```

1. Based on the environment, plan how you would set the resource group and workspace variables. For example:

    ```bash
    if [ "$ENVIRONMENT" = "prod" ]; then
      RESOURCE_GROUP=$PROD_RESOURCE_GROUP
      WORKSPACE_NAME=$PROD_WORKSPACE_NAME
    else
      RESOURCE_GROUP=$DEV_RESOURCE_GROUP
      WORKSPACE_NAME=$DEV_WORKSPACE_NAME
    fi
    ```
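    To see how the `${1:-dev}` default behaves, here's a tiny runnable sketch of the same idiom (the function and names are illustrative, not part of the lab script):

    ```shell
    # Illustrative: ${1:-dev} falls back to "dev" when no argument is passed
    pick_env() {
      local ENVIRONMENT=${1:-dev}
      echo "$ENVIRONMENT"
    }

    pick_env          # prints "dev"
    pick_env prod     # prints "prod"
    ```

    This is why running the script with no arguments keeps the existing dev behavior unchanged.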

1. Think through which resources should be **shared** and which should be **isolated**:
    - The registry is shared between dev and prod, so you'd create it once and reuse it.
    - Workspaces, compute, and data assets are environment-specific so that you can apply different security and access controls.

1. Consider how this script would fit into your broader MLOps workflows:
    - In **GitHub Actions**, you could call the script with `dev` when validating pull requests and `prod` when deploying approved models.
    - In **local development**, data scientists could call the script with `dev` to recreate the experimentation environment from scratch.
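    For example, a hypothetical GitHub Actions step could pass the environment through to the script (the step wiring and `inputs.environment` name are assumptions for illustration, not part of this repo's workflows):

    ```yml
    # Hypothetical step: provision the target environment before training
    - name: Provision Azure ML environment
      run: bash infra/setup-prod-design.sh ${{ inputs.environment }}
    ```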

You now have a clear plan for how to evolve the existing script into a more flexible provisioning tool without changing how earlier labs work.
