diff --git a/docs/docs/how-tos/deploy.mdx b/docs/docs/how-tos/deploy.mdx
index c9db8502..39dd5477 100644
--- a/docs/docs/how-tos/deploy.mdx
+++ b/docs/docs/how-tos/deploy.mdx
@@ -41,6 +41,8 @@ After deploy completes, your cluster needs DNS records to be reachable: see [Clo
 
 To change a running cluster, edit your config and re-run `nic deploy`: see [Update a cluster](/docs/how-tos/update-cluster).
 
+To upgrade the cluster's Kubernetes version, see [Upgrade Kubernetes version](/docs/how-tos/upgrade-kubernetes) — version bumps have additional constraints beyond a normal config change.
+
 ## Destroy
 
 When you're done with the cluster, tear it down with `nic destroy`: see [Destroy a cluster](/docs/how-tos/destroy-cluster).
diff --git a/docs/docs/how-tos/providers/aws.mdx b/docs/docs/how-tos/providers/aws.mdx
index 65430d68..009e6c0a 100644
--- a/docs/docs/how-tos/providers/aws.mdx
+++ b/docs/docs/how-tos/providers/aws.mdx
@@ -181,10 +181,39 @@ To change something about a running cluster (scale a node group, add a `gpu` gro
 Changing `region`, `project_name`, or `vpc_cidr_block` triggers destructive resource recreation. Treat these as one-way decisions.
 :::
 
+## Upgrade Kubernetes version
+
+EKS requires one-minor-version increments — you cannot skip versions or downgrade. To go from 1.32 to 1.34 you must upgrade twice:
+
+```
+1.32 → 1.33 → 1.34
+```
+
+Skipping a version (for example, 1.32 → 1.34 directly) is rejected by EKS during deploy.
+
+Set `kubernetes_version` under `cluster.aws` in your config:
+
+```yaml
+cluster:
+  aws:
+    kubernetes_version: "1.33"  # was "1.32"
+```
+
+EKS accepts bare minor versions (`"1.33"`) — patch versions are not required.
+
+EKS upgrades in two phases:
+
+1. **Control plane** upgrades first. The Kubernetes API server and control-plane components are updated to the new version. This takes around 10 minutes and is handled by AWS.
+2. **Node groups** roll one at a time. EKS launches a replacement node, waits for it to become Ready, then drains and terminates the old one — one node at a time per node group. Node groups upgrade sequentially after the control plane finishes. Plan for 20–40 minutes total depending on the number of node groups.
+
+After upgrading, verify your EKS managed add-ons (kube-proxy, CoreDNS, VPC CNI) are on versions compatible with the new cluster version — check the **Add-ons** tab in the AWS EKS console and update any that are flagged.
+
+See [Upgrade Kubernetes version](/docs/how-tos/upgrade-kubernetes) for the upgrade commands and post-upgrade verification steps.
+
 ## Destroy
 
 Run the destroy commands as described in [Destroy a cluster](/docs/how-tos/destroy-cluster).
 
 `nic destroy` removes EKS, node groups, EFS, VPC components, and the state bucket. If a resource fails to delete (commonly, a leftover load balancer from the cluster's ingress), remove it manually in the AWS console before retrying with `--force`.
 
-Always confirm in the AWS console that no orphan resources remain. NAT gateways, load balancers, and EBS volumes can keep billing if they're not cleaned up.
\ No newline at end of file
+Always confirm in the AWS console that no orphan resources remain. NAT gateways, load balancers, and EBS volumes can keep billing if they're not cleaned up.
diff --git a/docs/docs/how-tos/providers/hetzner.mdx b/docs/docs/how-tos/providers/hetzner.mdx
index c256b225..6f7a3dbb 100644
--- a/docs/docs/how-tos/providers/hetzner.mdx
+++ b/docs/docs/how-tos/providers/hetzner.mdx
@@ -148,6 +148,14 @@ To change something about a running cluster (scale a node group, add a `gpu` gro
 Changing `location` or `project_name` triggers destructive resource recreation. Treat these as one-way decisions.
 :::
 
+## Upgrade Kubernetes version
+
+The `kubernetes_version` field accepts `"1.32"`, `"1.32.0"`, or a full release tag like `"v1.32.12+k3s1"`. Specifying `"1.32"` resolves to the latest stable k3s patch release for that minor version.
+
+`hetzner-k3s` handles the rolling upgrade: it upgrades the control plane node first, then each worker node in turn.
+
+See [Upgrade Kubernetes version](/docs/how-tos/upgrade-kubernetes) for the upgrade commands and post-upgrade verification steps.
+
 ## Destroy
 
 Run the destroy commands as described in [Destroy](/docs/how-tos/deploy#destroy) in the deploy lifecycle guide.
diff --git a/docs/docs/how-tos/update-cluster.mdx b/docs/docs/how-tos/update-cluster.mdx
index f185fe02..610c8a05 100644
--- a/docs/docs/how-tos/update-cluster.mdx
+++ b/docs/docs/how-tos/update-cluster.mdx
@@ -15,6 +15,10 @@ nic deploy -f --dry-run # show what would change; nothing is mo
 nic deploy -f # apply it
 ```
 
+:::note[Upgrading the Kubernetes version]
+Bumping `kubernetes_version` has additional requirements. See [Upgrade Kubernetes version](/docs/how-tos/upgrade-kubernetes).
+:::
+
 ## Immutable fields
 
 Some fields (cluster identity, network foundation) can't be changed on a running cluster. Changing one requires destroying and recreating it. The exact fields are provider-specific: see your [provider's page](/docs/how-tos/providers).
diff --git a/docs/docs/how-tos/upgrade-kubernetes.mdx b/docs/docs/how-tos/upgrade-kubernetes.mdx
new file mode 100644
index 00000000..2fb8669c
--- /dev/null
+++ b/docs/docs/how-tos/upgrade-kubernetes.mdx
@@ -0,0 +1,75 @@
+---
+title: Upgrade Kubernetes version
+slug: /how-tos/upgrade-kubernetes
+description: How to upgrade the Kubernetes version of an NKP cluster — common steps that work across all providers.
+---
+
+:::note
+This page covers NKP clusters managed by `nic`. Classic clusters are not covered here.
+:::
+
+NKP supports in-place Kubernetes version upgrades via `nic deploy`. Edit the `kubernetes_version` field in your config and re-run deploy — `nic` handles the rest.
+
+## Before you upgrade
+
+Version constraints and accepted formats vary by provider — see your [provider's page](/docs/how-tos/providers) before upgrading.
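As one illustration of a provider constraint, the AWS page in this change set states that EKS only accepts one-minor-version increments. The sketch below lists the hops needed between two versions. It assumes bare `major.minor` strings such as `1.32`, and `upgrade_path` is a hypothetical helper for planning, not a `nic` command.

```bash
# Hypothetical planning helper (not part of nic): print every version you
# would need to deploy, in order, to move from one minor version to another.
# Assumes bare "major.minor" input; patch versions are not handled.
upgrade_path() {
  from="$1"
  to="$2"
  major="${from%%.*}"        # "1" from "1.32"
  from_minor="${from#*.}"    # "32" from "1.32"
  to_minor="${to#*.}"

  # Downgrades and major-version changes are outside this sketch.
  if [ "${to%%.*}" != "$major" ] || [ "$to_minor" -lt "$from_minor" ]; then
    echo "unsupported: downgrade or major-version change" >&2
    return 1
  fi

  path="$from"
  m=$((from_minor + 1))
  while [ "$m" -le "$to_minor" ]; do
    path="$path $major.$m"
    m=$((m + 1))
  done
  echo "$path"
}

upgrade_path 1.32 1.34
```

Each version after the first in the printed list would correspond to one edit of `kubernetes_version` followed by one deploy.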
+
+The Nebari Operator and Software Packs run inside the cluster and follow its Kubernetes version. There is no separate compatibility matrix to check before upgrading.
+
+## Upgrade steps
+
+These steps are the same for all providers.
+
+1. Edit `kubernetes_version` in your config. See your [provider's page](/docs/how-tos/providers) for provider-specific requirements.
+
+2. Validate the config:
+
+   ```bash
+   nic validate -f
+   ```
+
+3. Preview what will change:
+
+   ```bash
+   nic deploy -f --dry-run
+   ```
+
+4. Apply the upgrade:
+
+   ```bash
+   nic deploy -f
+   ```
+
+## Operator and pack considerations
+
+During node rolling, pods on the node being replaced are evicted and rescheduled elsewhere:
+
+- The **Nebari Operator** may be briefly unavailable while the node it runs on is replaced.
+- **Software Packs**, ArgoCD, Keycloak, and other foundational apps may be briefly unavailable during the rollout of the node they run on.
+- Once **ArgoCD's replacement pod is ready**, any apps that fell out of sync during the rollout are restored automatically.
+
+Multi-replica workloads tolerate rolling node replacement without downtime, provided their replicas are spread across nodes. Single-replica packs will have a brief interruption while their pod is rescheduled.
+
+## Verify the upgrade
+
+After deploy completes, check that all nodes are running the new version:
+
+```bash
+kubectl get nodes -o wide
+```
+
+All nodes should show the target version in the `VERSION` column. Then verify ArgoCD apps are healthy:
+
+```bash
+kubectl get applications -n argocd
+```
+
+All applications should reach `Healthy`. Any that are briefly `Progressing` after the upgrade will self-correct within a few minutes.
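The node check above can also be scripted instead of reading the `VERSION` column by eye. This is a minimal sketch: `nodes_not_on` is a hypothetical helper, not part of `nic` or `kubectl`. It assumes kubelet versions start with `v<major>.<minor>.`, as in `v1.33.5` or the `v1.32.12+k3s1` tag format mentioned on the Hetzner page.

```bash
# Hypothetical helper: reads "<node-name> <kubelet-version>" lines on stdin
# and prints every node whose version is not on the target minor release.
# Exits non-zero if any such node is found.
nodes_not_on() {
  target="$1"
  awk -v want="v$target" '
    NF >= 2 && index($2, want ".") != 1 && $2 != want { print $1, $2; bad = 1 }
    END { exit bad + 0 }
  '
}

# Against a live cluster, the input could come from kubectl's JSONPath output:
#   kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.nodeInfo.kubeletVersion}{"\n"}{end}' \
#     | nodes_not_on 1.33

# Sample data standing in for a cluster where one node has not rolled yet.
printf 'node-a v1.33.5\nnode-b v1.32.9\n' | nodes_not_on 1.33 \
  || echo "some nodes are still on an older version"
```

With the sample input only `node-b` is listed, followed by the warning line. An empty result with exit status 0 would mean every node reports the target minor version.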
+
+## See also
+
+- [NKP architecture](/docs/explanations/nkp-architecture) — conceptual overview of how the platform layers fit together
+- [Deploy lifecycle](/docs/how-tos/deploy) — full deploy, update, and destroy reference
+- [Update a cluster](/docs/how-tos/update-cluster) — other config changes you can make to a running cluster
+- [AWS provider](/docs/how-tos/providers/aws) — EKS-specific upgrade constraints and mechanics
+- [Hetzner provider](/docs/how-tos/providers/hetzner) — k3s-specific upgrade constraints and mechanics
diff --git a/docs/sidebars.js b/docs/sidebars.js
index 6be21d5b..74c403a8 100644
--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -47,6 +47,7 @@ module.exports = {
         "how-tos/deploy-cluster",
         "how-tos/cloudflare-dns",
         "how-tos/update-cluster",
+        "how-tos/upgrade-kubernetes",
         "how-tos/destroy-cluster",
         "how-tos/keycloak-auth",
         {