7 changes: 7 additions & 0 deletions docs/values_inheritance_pattern.md
@@ -146,6 +146,13 @@ When ArgoCD renders applications with multi-source:
3. **External overrides** from `cluster-values/values.yaml`
4. **Runtime parameters** (domain, targetRevision) injected by bootstrap

AIM hardware-family selection is injected separately: cluster-bloom writes the
selected families as a YAML list into
`apps.aim-cluster-model-source.valuesObject.hardwareFamilies` (see
`sources/aim-cluster-model-source`). The value travels as a structured list,
not a string, so no comma parsing is involved. The base `root/values.yaml`
default is an empty list, which selects the legacy (install-all) branch.
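
The layering described above can be modeled with a small sketch. It assumes Helm-style merging, where maps merge key by key and a list in a later layer replaces the earlier list wholesale. The `merge_values` helper and the shape of the `injected` override are hypothetical illustrations, not code from cluster-bloom or ArgoCD.

```python
from copy import deepcopy


def merge_values(base: dict, override: dict) -> dict:
    """Return a copy of base with override applied.

    Maps are merged key by key; any other value (lists included)
    from the override replaces the base value outright.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Base default, mirroring root/values.yaml (other keys trimmed).
base = {
    "apps": {
        "aim-cluster-model-source": {
            "namespace": "kaiwo-system",
            "valuesObject": {"hardwareFamilies": []},
        }
    }
}

# Hypothetical override of the shape cluster-bloom is described as writing.
injected = {
    "apps": {
        "aim-cluster-model-source": {
            "valuesObject": {"hardwareFamilies": ["epyc", "instinct"]},
        }
    }
}

merged = merge_values(base, injected)
families = merged["apps"]["aim-cluster-model-source"]["valuesObject"]["hardwareFamilies"]
print(families)
```

Because the injected value is already a list, it replaces the empty default as a unit and the sibling keys of the app entry are left untouched.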

## Developer Workflow

### Local Configuration Management (Local Mode)
5 changes: 5 additions & 0 deletions root/values.yaml
@@ -31,6 +31,11 @@ apps:
    namespace: kaiwo-system
    path: aim-cluster-model-source
    syncWave: -20
    valuesObject:
      # Hardware families to install (cpu,epyc,instinct,radeon).
      # Empty = legacy (install all generic model sources). cluster-bloom
      # injects the selected families as a YAML list at deploy time.
      hardwareFamilies: []
  airm:
    repoURL: "{{ .Values.ociRegistry.dockerHub }}"
    repoVersion: "1.1.9"
8 changes: 8 additions & 0 deletions sources/aim-cluster-model-source/Chart.yaml
@@ -0,0 +1,8 @@
# Copyright © Advanced Micro Devices, Inc., or its affiliates.
#
# SPDX-License-Identifier: MIT

apiVersion: v2
name: aim-cluster-model-source
version: 0.1.0
description: AIMClusterModelSource catalog with hardware-family profile selection
59 changes: 59 additions & 0 deletions sources/aim-cluster-model-source/README.md
@@ -0,0 +1,59 @@
<!--
Copyright © Advanced Micro Devices, Inc., or its affiliates.

SPDX-License-Identifier: MIT
-->

# aim-cluster-model-source

Helm chart that installs `AIMClusterModelSource` resources. It renders one of
two mutually exclusive branches, selected by `hardwareFamilies`:

- **Legacy (default):** when `hardwareFamilies` is empty, the chart installs the
full set of generic `amd-aim-release-*` model sources (versions 0.8.5, 0.9.0,
0.10.0, 0.11.0), unchanged from the pre-chart directory app.
- **Per-hardware-family profiles:** when `hardwareFamilies` is non-empty, the
chart installs only the `AIMClusterModelSource` resources for the listed
families. The legacy generic sources are not installed.

## `hardwareFamilies`

Accepts a YAML list (the primary form) or a comma-separated string. Allowed
values: `cpu`, `epyc`, `instinct`, `radeon`. An empty value (the default)
selects the legacy branch.

```yaml
hardwareFamilies:
- epyc
- instinct
```

| Family | Source name | Registry | Notes |
|---|---|---|---|
| `instinct` | `amd-aim-instinct-0.12.0` | docker.io | works today |
| `epyc` | `amd-aim-epyc-0.11.0` | docker.io | works today |
| `cpu` | `amd-aim-cpu-0.12.0-rc1` | ghcr.io | placeholder, image pull fails until a docker.io release exists |
| `radeon` | `amd-aim-radeon-0.12.0-rc1` | ghcr.io | placeholder, image pull fails until a docker.io release exists |

`instinct` and `radeon` are GPU families; `cpu` and `epyc` are CPU inference
targets. `cpu` and `radeon` are currently available only as `-rc1` images on
`ghcr.io` and require the `ghcr-regcred` pull secret, which this cluster does
not provision. They are pinned as placeholders: their image pulls will fail
until the team publishes a docker.io release.
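
The branch selection can be summarized in a short sketch. The source names are copied from the table and the legacy list above; the function name `rendered_sources` is a hypothetical illustration that only models which resources the templates emit, not how Helm renders them.

```python
# Per-family source names, in the order the profiles template emits them.
PROFILE_SOURCES = {
    "instinct": "amd-aim-instinct-0.12.0",
    "epyc": "amd-aim-epyc-0.11.0",
    "cpu": "amd-aim-cpu-0.12.0-rc1",
    "radeon": "amd-aim-radeon-0.12.0-rc1",
}

# Generic sources installed by the legacy branch.
LEGACY_SOURCES = [
    f"amd-aim-release-{version}"
    for version in ("0.8.5", "0.9.0", "0.10.0", "0.11.0")
]


def rendered_sources(families: list[str]) -> list[str]:
    """Model which AIMClusterModelSource names a given selection yields."""
    if not families:
        # Empty selection: legacy branch, all generic sources.
        return list(LEGACY_SOURCES)
    # Non-empty selection: only the listed, known families.
    return [name for family, name in PROFILE_SOURCES.items() if family in families]


print(rendered_sources([]))
print(rendered_sources(["epyc", "instinct"]))
print(rendered_sources(["gpu"]))
```

Under this model, a non-empty list that contains only unrecognized tokens (such as `gpu`) renders nothing at all: the legacy branch is skipped because the list is non-empty, and no profile matches.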

## Installing

This chart is normally driven by cluster-bloom via the `AIM_HARDWARE_FAMILY`
install flag, which injects the selected families as a YAML list into
`apps.aim-cluster-model-source.valuesObject.hardwareFamilies` (see the
cluster-forge `root` chart). No comma parsing is involved on that path because
the value travels as a structured list.

For a manual `helm` install, prefer a values file or pass a JSON list. A
comma-separated string also works because the chart splits it, but Helm's
`--set` and `--set-string` both treat an unescaped comma as a separator between
`key=value` pairs, so a multi-value string does not reach the chart intact
unless each comma is escaped as `\,`. Use `--set-json` for the list form:

```bash
helm install ... --set-json 'hardwareFamilies=["epyc","instinct"]'
```
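
As a rough model of the normalization the chart applies to `hardwareFamilies`, the sketch below accepts either form and returns a clean list. It is a Python approximation of the template helper, and `normalize_families` is a hypothetical name; string conversion of non-string list items may differ in detail from the template's `toString`.

```python
def normalize_families(raw) -> list[str]:
    """Turn a list or comma-separated string into a list of family tokens.

    Whitespace around tokens is trimmed and empty tokens are dropped.
    Any other input type (for example None) yields an empty list.
    """
    if isinstance(raw, str):
        tokens = raw.split(",")
    elif isinstance(raw, (list, tuple)):
        tokens = [str(item) for item in raw]
    else:
        tokens = []
    return [token.strip() for token in tokens if token.strip()]


print(normalize_families(["epyc", "instinct"]))
print(normalize_families("epyc, instinct,"))
print(normalize_families(""))
```

Both the list form and the string form normalize to the same tokens, and an empty string or empty list yields an empty result, which corresponds to the legacy branch.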
32 changes: 0 additions & 32 deletions sources/aim-cluster-model-source/aim-models-0.10.0.yaml

This file was deleted.

36 changes: 0 additions & 36 deletions sources/aim-cluster-model-source/aim-models-0.11.0.yaml

This file was deleted.

22 changes: 0 additions & 22 deletions sources/aim-cluster-model-source/aim-models-0.8.5.yaml

This file was deleted.

17 changes: 0 additions & 17 deletions sources/aim-cluster-model-source/aim-models-0.9.0.yaml

This file was deleted.

22 changes: 22 additions & 0 deletions sources/aim-cluster-model-source/templates/_helpers.tpl
@@ -0,0 +1,22 @@
{{/*
Normalize .Values.hardwareFamilies into a clean list of family tokens.
Accepts a native list (the primary path, injected by cluster-bloom) or a
comma-separated string. Trims whitespace and drops empty tokens. Empty input
yields an empty list, which triggers the legacy branch.
*/}}
{{- define "aim.hardwareFamilies" -}}
{{- $raw := .Values.hardwareFamilies -}}
{{- $out := list -}}
{{- if kindIs "string" $raw -}}
  {{- range (splitList "," $raw) -}}
    {{- $t := trim . -}}
    {{- if $t -}}{{- $out = append $out $t -}}{{- end -}}
  {{- end -}}
{{- else if kindIs "slice" $raw -}}
  {{- range $raw -}}
    {{- $t := trim (toString .) -}}
    {{- if $t -}}{{- $out = append $out $t -}}{{- end -}}
  {{- end -}}
{{- end -}}
{{- $out | toJson -}}
{{- end -}}
101 changes: 101 additions & 0 deletions sources/aim-cluster-model-source/templates/legacy.yaml
@@ -0,0 +1,101 @@
{{- $families := include "aim.hardwareFamilies" . | fromJsonArray }}
{{- if not $families }}
# Copyright © Advanced Micro Devices, Inc., or its affiliates.
#
# SPDX-License-Identifier: MIT

apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-release-0.8.5
spec:
  filters:
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-405b-instruct:0.8.5
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-8b-instruct:0.8.5
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-1b-instruct:0.8.5
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-3b-instruct:0.8.5
    - image: amdenterpriseai/aim-meta-llama-llama-3-3-70b-instruct:0.8.5-preview
    - image: amdenterpriseai/aim-mistralai-mistral-small-3-2-24b-instruct-2506:0.8.5
    - image: amdenterpriseai/aim-mistralai-mixtral-8x22b-instruct-v0-1:0.8.5
    - image: amdenterpriseai/aim-mistralai-mixtral-8x7b-instruct-v0-1:0.8.5
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.8.5
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-release-0.9.0
spec:
  filters:
    - image: amdenterpriseai/aim-coherelabs-command-a-reasoning-08-2025:0.9.0
    - image: amdenterpriseai/aim-openai-gpt-oss-20b:0.9.0
    - image: amdenterpriseai/aim-openai-gpt-oss-120b:0.9.0
    - image: amdenterpriseai/aim-qwen-qwen3-235b-a22b:0.9.0
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-release-0.10.0
spec:
  filters:
    - image: amdenterpriseai/aim-coherelabs-command-a-reasoning-08-2025:0.10.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-r1-0528:0.10.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-r1:0.10.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-v3-1-terminus:0.10.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-v3-1:0.10.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-405b-instruct:0.10.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-8b-instruct:0.10.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-1b-instruct:0.10.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-3b-instruct:0.10.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-3-70b-instruct:0.10.0
    - image: amdenterpriseai/aim-mistralai-ministral-3-14b-instruct-2512:0.10.0
    - image: amdenterpriseai/aim-mistralai-mistral-large-3-675b-instruct-2512:0.10.0
    - image: amdenterpriseai/aim-mistralai-mistral-small-3-2-24b-instruct-2506:0.10.0
    - image: amdenterpriseai/aim-mistralai-mixtral-8x22b-instruct-v0-1:0.10.0
    - image: amdenterpriseai/aim-mistralai-mixtral-8x7b-instruct-v0-1:0.10.0
    - image: amdenterpriseai/aim-openai-gpt-oss-120b:0.10.0
    - image: amdenterpriseai/aim-openai-gpt-oss-20b:0.10.0
    - image: amdenterpriseai/aim-qwen-qwen3-235b-a22b:0.10.0
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.10.0
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-release-0.11.0
spec:
  filters:
    - image: amdenterpriseai/aim-coherelabs-command-a-reasoning-08-2025:0.11.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-r1:0.11.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-r1-0528:0.11.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-v3-1:0.11.0
    - image: amdenterpriseai/aim-deepseek-ai-deepseek-v3-1-terminus:0.11.0
    - image: amdenterpriseai/aim-google-gemma-3-27b-it:0.11.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-405b-instruct:0.11.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-1-8b-instruct:0.11.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-1b-instruct:0.11.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-2-3b-instruct:0.11.0
    - image: amdenterpriseai/aim-meta-llama-llama-3-3-70b-instruct:0.11.0
    - image: amdenterpriseai/aim-mistralai-ministral-3-14b-instruct-2512:0.11.0
    - image: amdenterpriseai/aim-mistralai-ministral-3-14b-reasoning-2512:0.11.0
    - image: amdenterpriseai/aim-mistralai-mistral-small-3-2-24b-instruct-2506:0.11.0
    - image: amdenterpriseai/aim-mistralai-mixtral-8x22b-instruct-v0-1:0.11.0
    - image: amdenterpriseai/aim-mistralai-mixtral-8x7b-instruct-v0-1:0.11.0
    - image: amdenterpriseai/aim-openai-gpt-oss-120b:0.11.0
    - image: amdenterpriseai/aim-openai-gpt-oss-20b:0.11.0
    - image: amdenterpriseai/aim-qwen-qwen3-235b-a22b:0.11.0
    - image: amdenterpriseai/aim-qwen-qwen3-32b:0.11.0
    - image: amdenterpriseai/aim-qwen-qwen3-coder-next:0.11.0
    - image: amdenterpriseai/aim-qwen-qwen3-vl-235b-a22b-instruct:0.11.0
    - image: amdenterpriseai/aim-qwen-qwen3-vl-235b-a22b-thinking:0.11.0
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
{{- end }}
68 changes: 68 additions & 0 deletions sources/aim-cluster-model-source/templates/profiles.yaml
@@ -0,0 +1,68 @@
{{- $families := include "aim.hardwareFamilies" . | fromJsonArray }}
{{- if has "instinct" $families }}
# Copyright © Advanced Micro Devices, Inc., or its affiliates.
#
# SPDX-License-Identifier: MIT

apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-instinct-0.12.0
spec:
  filters:
    - image: amdenterpriseai/aim-google-gemma-3-1b-it:0.12.0
    - image: amdenterpriseai/aim-minimaxai-minimax-m2-5:0.12.0
    - image: amdenterpriseai/aim-zai-org-glm-4-7:0.12.0
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
{{- end }}
{{- if has "epyc" $families }}
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-epyc-0.11.0
spec:
  filters:
    - image: amdenterpriseai/aim-epyc-meta-llama-llama-3-1-8b-instruct:0.11.0-preview
    - image: amdenterpriseai/aim-epyc-meta-llama-llama-3-2-1b-instruct:0.11.0-preview
    - image: amdenterpriseai/aim-epyc-meta-llama-llama-3-2-3b-instruct:0.11.0-preview
  maxModels: 100
  registry: docker.io
  syncInterval: 1h
{{- end }}
{{- if has "cpu" $families }}
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-cpu-0.12.0-rc1
spec:
  filters:
    - image: silogen/aim-cpu-qwen-qwen3-0-6b:0.12.0-rc1
  imagePullSecrets:
    - name: ghcr-regcred
  maxModels: 100
  registry: ghcr.io
  syncInterval: 1h
{{- end }}
{{- if has "radeon" $families }}
---
apiVersion: aim.eai.amd.com/v1alpha1
kind: AIMClusterModelSource
metadata:
  name: amd-aim-radeon-0.12.0-rc1
spec:
  filters:
    - image: silogen/aim-radeon-meta-llama-llama-3-1-8b-instruct:0.12.0-rc1
    - image: silogen/aim-radeon-openai-gpt-oss-20b:0.12.0-rc1
    - image: silogen/aim-radeon-qwen-qwen3-5-9b:0.12.0-rc1
    - image: silogen/aim-radeon-qwen-qwen3-vl-8b-instruct:0.12.0-rc1
    - image: silogen/aim-radeon-zai-org-glm-4-7-flash:0.12.0-rc1
  imagePullSecrets:
    - name: ghcr-regcred
  maxModels: 100
  registry: ghcr.io
  syncInterval: 1h
{{- end }}