Rename flyte-demo to flyte-devbox #7247

Merged
pingsutw merged 14 commits into v2 from rename-devbox
Apr 21, 2026

Conversation

pingsutw (Member) commented Apr 21, 2026

TL;DR

Rename all references of flyte-demo to flyte-devbox and demo-bundled to devbox-bundled.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

Renames across the entire codebase:

  • flyte-demo → flyte-devbox (Helm chart name, Docker image tags, container/volume names, labels, service names, config values)
  • FLYTE_DEMO → FLYTE_DEVBOX (env vars)
  • flytedemo → flytedevbox (Helm template release name)
  • demo-bundled → devbox-bundled (directory names, Go module path, CI workflow references)
  • Makefile targets demo-build/run/stop → devbox-build/run/stop
  • Directory renames: charts/flyte-demo → charts/flyte-devbox, docker/demo-bundled → docker/devbox-bundled
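
For illustration only, the mapping above can be written as an ordered list of string replacements. This is a minimal sketch, not the tooling used for this PR; the `RENAMES` table and the `apply_renames` helper are hypothetical names introduced here.

```python
# Hypothetical helper: the (old, new) pairs are copied from the list above.
RENAMES = [
    ("flyte-demo", "flyte-devbox"),
    ("FLYTE_DEMO", "FLYTE_DEVBOX"),
    ("flytedemo", "flytedevbox"),
    ("demo-bundled", "devbox-bundled"),
    ("demo-build", "devbox-build"),
    ("demo-run", "devbox-run"),
    ("demo-stop", "devbox-stop"),
]


def apply_renames(text: str) -> str:
    """Apply every rename in order and return the rewritten text."""
    for old, new in RENAMES:
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    print(apply_renames("ghcr.io/flyteorg/flyte-demo:gpu-latest"))
    print(apply_renames("docker/demo-bundled"))
```

None of the new names contains an old name as a substring, so running the helper twice gives the same result as running it once. A blind textual replace like this would still need review before being applied to a real tree.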

Tracking Issue

NA

Follow-up issue

NA

pingsutw and others added 8 commits April 20, 2026 23:53
Adds a GPU variant of the demo-bundled image so users with NVIDIA GPUs
can run `flyte start demo --image ghcr.io/flyteorg/flyte-demo:gpu-latest`
and submit tasks with `Resources(gpu=1)`.

- Dockerfile.gpu stages NVIDIA Container Toolkit v1.19.x binaries and
  their shared libs into the rancher/k3s final image. Libs are copied
  into /usr/lib/<triple>/ because the nvidia-ctk OCI hook runs without
  inheriting LD_LIBRARY_PATH. A statically-linked /sbin/ldconfig is
  also staged (rancher/k3s ships none) because the toolkit's
  update-ldcache hook bind-mounts it into workload pods.
- containerd-config.toml.tmpl sets nvidia as the default containerd
  runtime. Pods requesting nvidia.com/gpu get GPUs without needing
  runtimeClassName in their spec; non-GPU pods are unaffected
  (nvidia-container-runtime is a passthrough when no GPU is requested).
- nvidia-device-plugin.yaml installs a RuntimeClass and the NVIDIA
  k8s-device-plugin DaemonSet so nvidia.com/gpu is advertised on the
  node. Auto-applied by k3s at startup.
- Makefile gains a build-gpu target producing flyte-demo:gpu-latest.
- CI gains a build-and-push step publishing gpu-latest, gpu-nightly,
  and gpu-<sha> tags to both flyte-demo and flyte-sandbox-v2.

The GPU plumbing was verified end-to-end with a layered test image on
an A10G (torch 2.11.0+cu130 reported cuda_available=True). The full
multi-stage Dockerfile.gpu has not been built locally; the CI run here
is the first end-to-end test of the production Dockerfile and may
need fixup iterations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
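
As a rough sketch of what "without needing runtimeClassName" means for a workload, the function below builds a plain Pod manifest that only sets an `nvidia.com/gpu` limit. The function name, pod name, and image are hypothetical; the manifest is a generic Kubernetes Pod, not one taken from this PR.

```python
def gpu_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a Pod manifest that requests GPUs via resource limits only.

    No runtimeClassName is set: per the commit message above, nvidia is the
    default containerd runtime in the GPU image, so the limit is sufficient.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # The resource name advertised by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    import json

    # "gpu-smoke-test" and the image name are placeholders for illustration.
    print(json.dumps(gpu_pod_manifest("gpu-smoke-test", "example/cuda-test:latest"), indent=2))
```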
Removes the duplicated builder/bootstrap/pg-cache stages and final-stage
setup by making Dockerfile.gpu a thin layer on top of flyte-demo:latest
(parameterized via ARG BASE_IMAGE). CI now builds the CPU image first
and passes its sha-tag in as BASE_IMAGE to the GPU build.

- Dockerfile.gpu shrinks from ~165 to ~75 lines; inherits flyte-binary,
  embedded postgres, staging manifests, and k3d entrypoint from the
  base image unchanged.
- Makefile build-gpu target now depends on build (not the full prereq
  chain) and passes BASE_IMAGE=flyte-demo:latest.
- CI gates the GPU build on push/workflow_dispatch since PR builds
  don't push the CPU image to ghcr.io (nothing to pull for BASE_IMAGE).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops the `if:` gate and conditions `push:` on the same expression the
CPU build uses, so both steps always build and only push on v2-branch
pushes or workflow_dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
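
The "always build, only push on v2-branch pushes or workflow_dispatch" rule can be sketched as a small predicate. This is an assumption about the shape of the condition, not the actual workflow expression; `should_push` is a hypothetical name, and `refs/heads/v2` is how GitHub Actions reports the ref for a push to the `v2` branch.

```python
def should_push(event_name: str, ref: str) -> bool:
    """Return True when a built image should also be pushed to the registry."""
    if event_name == "workflow_dispatch":
        return True
    # Only pushes to the v2 branch publish; pull requests build without pushing.
    return event_name == "push" and ref == "refs/heads/v2"


if __name__ == "__main__":
    for event, ref in [
        ("pull_request", "refs/pull/1/merge"),
        ("push", "refs/heads/v2"),
        ("workflow_dispatch", "refs/heads/some-branch"),
    ]:
        print(event, ref, should_push(event, ref))
```

Sharing one predicate between the CPU and GPU steps keeps both steps building on every event while gating only the push.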
On pull_request events the CPU build step runs with push=false, so the
GPU build's FROM ghcr.io/.../flyte-demo:sha-<sha> fails to resolve
(image not found in the registry). Fix by producing an OCI archive of
the CPU image locally and passing it to the GPU build as a named build
context (build-contexts: base=oci-layout://...) with BASE_IMAGE=base.

Registry push happens in a separate step that only runs on push /
workflow_dispatch, so PR builds no longer need ghcr credentials for
the GPU step.
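
A minimal sketch of the GPU build invocation described above, expressed as an argument list for `docker buildx build`. The `--build-context` and `--build-arg` flags are real buildx options; the helper name, the default Dockerfile path, and the layout directory are assumptions for illustration, and depending on how the archive was produced the `oci-layout://` reference may need a tag or digest suffix.

```python
from __future__ import annotations


def gpu_build_command(
    cpu_oci_dir: str,
    dockerfile: str = "Dockerfile.gpu",
    context: str = ".",
) -> list[str]:
    """Build the argv for a GPU image build that layers on a local CPU image.

    The CPU image is supplied as a named build context called "base", and
    BASE_IMAGE=base makes the Dockerfile's FROM resolve to that context
    instead of pulling from a registry.
    """
    return [
        "docker", "buildx", "build",
        "--build-context", f"base=oci-layout://{cpu_oci_dir}",
        "--build-arg", "BASE_IMAGE=base",
        "--file", dockerfile,
        context,
    ]


if __name__ == "__main__":
    # "./cpu-oci" is a placeholder for wherever the CPU OCI layout was written.
    print(" ".join(gpu_build_command("./cpu-oci")))
```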
Add GHA cache (type=gha) to the three docker/build-push-action steps in
build-and-push-demo-bundled-image. CPU archive and CPU push share the
demo-cpu scope so the push reuses layers from the archive build; GPU
gets its own demo-gpu scope.
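
The scope sharing described above can be sketched as follows. The `type=gha` cache backend and its `scope` attribute are real BuildKit options; the step names in `STEP_SCOPES` and the helper are hypothetical, and the exact option strings in the workflow may carry additional attributes.

```python
# Hypothetical step names mapped to the cache scopes named in the commit message.
STEP_SCOPES = {
    "cpu-archive": "demo-cpu",
    "cpu-push": "demo-cpu",  # same scope, so the push reuses the archive build's layers
    "gpu": "demo-gpu",
}


def gha_cache_options(step: str) -> dict:
    """Return cache-from / cache-to values for one build step."""
    scope = STEP_SCOPES[step]
    return {
        "cache-from": f"type=gha,scope={scope}",
        "cache-to": f"type=gha,scope={scope}",
    }


if __name__ == "__main__":
    for step in STEP_SCOPES:
        print(step, gha_cache_options(step))
```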
The oci-layout:// build-context source requires Dockerfile frontend 1.5+.
CI was failing with 'unsupported context source oci-layout for base'.

Signed-off-by: Kevin Su <pingsutw@apache.org>
Rename all references of flyte-demo to flyte-devbox and demo-bundled
to devbox-bundled across Helm charts, Docker files, Makefiles, CI
workflows, Go source, and Kubernetes manifests.

Signed-off-by: Kevin Su <pingsutw@apache.org>
This was referenced Apr 21, 2026
Base automatically changed from feat/demo-bundled-gpu to v2 April 21, 2026 17:19
Resolve merge conflicts from GPU build steps added in v2, applying
the flyte-demo → flyte-devbox rename to the incoming changes.

Signed-off-by: Kevin Su <pingsutw@apache.org>
pingsutw self-assigned this Apr 21, 2026
pingsutw added this to the V2 GA milestone Apr 21, 2026
- Tag nightly on push to v2, latest only on manual workflow_dispatch
- Fix local build-gpu by using --build-context to pass CPU image into
  the docker-container BuildKit driver

Signed-off-by: Kevin Su <pingsutw@apache.org>
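
One way to read the tagging rule in the commit above is sketched below. It assumes a manual dispatch yields only `latest` and a push to `v2` yields only `nightly`; whether a dispatch also produces `nightly`, and how sha tags are handled, is not stated here. `moving_tags` and its `variant` parameter are hypothetical names.

```python
def moving_tags(event_name: str, ref: str, variant: str = "") -> list:
    """Return the floating tags to publish for a given trigger.

    variant is an optional prefix such as "gpu", giving "gpu-nightly".
    """
    prefix = f"{variant}-" if variant else ""
    if event_name == "workflow_dispatch":
        return [f"{prefix}latest"]
    if event_name == "push" and ref == "refs/heads/v2":
        return [f"{prefix}nightly"]
    # Pull requests and other events publish no floating tags.
    return []


if __name__ == "__main__":
    print(moving_tags("push", "refs/heads/v2"))
    print(moving_tags("workflow_dispatch", "refs/heads/v2", variant="gpu"))
    print(moving_tags("pull_request", "refs/pull/1/merge"))
```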
Signed-off-by: Kevin Su <pingsutw@apache.org>
pingsutw merged commit 61a555c into v2 Apr 21, 2026
20 checks passed
pingsutw deleted the rename-devbox branch April 21, 2026 18:25
2 participants