Conversation
Pull request overview
This PR adds comprehensive documentation for deploying custom and private models to Cube AI Confidential VMs (CVMs). It expands the previously minimal "private-model-upload.md" documentation to cover three distinct deployment approaches: build-time embedding via Buildroot, cloud-init provisioning for Ubuntu, and runtime upload over SSH.
Changes:
- Comprehensive documentation for deploying custom models to CVMs using Ollama and vLLM backends
- Detailed instructions for three deployment methods: Buildroot build-time embedding, cloud-init provisioning, and runtime SSH upload
- Port reference table documenting the standard CVM network port mappings (6190→SSH, 6193→Cube Agent)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| content/docs/developer-guide/private-model-upload.md | Complete rewrite expanding from 15 lines to 290 lines with detailed multi-platform deployment procedures, configuration examples, and verification steps |
| content/docs/developer-guide/index.md | Added descriptive subtitle to "Private Model Upload" menu item clarifying the three deployment methods |
> 3. Add a startup script in the overlay to register the model after Ollama starts:

```bash
mkdir -p cube/hal/buildroot/linux/board/cube/overlay/usr/libexec/ollama/
cat > cube/hal/buildroot/linux/board/cube/overlay/usr/libexec/ollama/register-custom-models.sh << 'SCRIPT'
#!/bin/sh
# Wait for Ollama to be ready
for i in $(seq 1 30); do
  if curl -s http://localhost:11434/api/version > /dev/null 2>&1; then
    break
  fi
  sleep 2
done

# Register custom models from Modelfiles
for mf in /etc/cube/modelfiles/*.Modelfile; do
  [ -f "$mf" ] || continue
  name=$(basename "$mf" .Modelfile)
  ollama create "$name" -f "$mf"
done
SCRIPT
chmod +x cube/hal/buildroot/linux/board/cube/overlay/usr/libexec/ollama/register-custom-models.sh
```
The documentation creates a startup script at `/usr/libexec/ollama/register-custom-models.sh` but does not explain how this script is executed at boot time. The script needs to be integrated into the init system (systemd or SysV init) or called by the Ollama service startup. Please add documentation on how to ensure this script runs automatically after Ollama starts, for example by adding it to a systemd service as an `ExecStartPost` command or by integrating it into the Buildroot package configuration.
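One hedged way to wire this up, assuming the Ollama Buildroot package installs a systemd unit named `ollama.service` (the unit name and drop-in path below are assumptions, not confirmed by this PR), is a drop-in in the overlay that runs the script after the daemon starts:

```ini
# Hypothetical drop-in, placed in the overlay at
# usr/lib/systemd/system/ollama.service.d/register-models.conf
# Runs the registration script each time ollama.service starts.
[Service]
ExecStartPost=/usr/libexec/ollama/register-custom-models.sh
```

On a SysV-init Buildroot image, an `S`-numbered script in `/etc/init.d/` ordered after the Ollama start script would serve the same purpose.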
```bash
# Package model files on the host
tar -czvf my-model.tar.gz /path/to/model/files
```
The `tar` command creates an archive from `/path/to/model/files`, but this path is just a placeholder. The command should clarify whether this refers to a directory or a file. If it's a directory of model files, the command should use `tar -czvf my-model.tar.gz -C /path/to/model files/` or `tar -czvf my-model.tar.gz /path/to/model/files/` (with trailing slash). If it's meant to tar multiple files, consider showing a more explicit example like `tar -czvf my-model.tar.gz weights.gguf config.json` to avoid confusion.
Suggested change:

```diff
-tar -czvf my-model.tar.gz /path/to/model/files
+tar -czvf my-model.tar.gz /path/to/model/files/
```
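To illustrate the difference, here is a minimal self-contained sketch (the directory and file names under `/tmp` are throwaway examples) showing how `-C` keeps the archive paths relative instead of embedding the absolute host path:

```shell
# Create a throwaway model directory (hypothetical file names)
mkdir -p /tmp/model-demo/files
printf 'dummy-weights' > /tmp/model-demo/files/weights.gguf
printf '{}' > /tmp/model-demo/files/config.json

# -C changes into the parent directory first, so the archive stores
# "files/..." rather than the absolute "/tmp/model-demo/files/..." path
tar -czf /tmp/model-demo/my-model.tar.gz -C /tmp/model-demo files

# List the archive contents to verify the relative layout
tar -tzf /tmp/model-demo/my-model.tar.gz
```

Relative paths matter here because the archive is later extracted inside the CVM, where the host's absolute path does not exist.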
```bash
BR2_PACKAGE_VLLM_MODEL="meta-llama/Llama-2-7b-hf"
BR2_PACKAGE_VLLM_GPU_MEMORY="0.90"
```
The Buildroot configuration variable name `BR2_PACKAGE_VLLM_GPU_MEMORY` seems inconsistent with the runtime environment variable `VLLM_GPU_MEMORY_UTILIZATION` used in line 199. If these represent the same configuration parameter, consider using consistent naming (e.g., `BR2_PACKAGE_VLLM_GPU_MEMORY_UTILIZATION`) to make the relationship clearer. Verify that the Buildroot package correctly maps this configuration to the runtime environment variable.
Suggested change:

```diff
-BR2_PACKAGE_VLLM_GPU_MEMORY="0.90"
+BR2_PACKAGE_VLLM_GPU_MEMORY_UTILIZATION="0.90"
```
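If the renamed option is adopted, the vLLM package's `.mk` could make the mapping to the runtime variable explicit. This is a hypothetical sketch only; the actual vLLM Buildroot package makefile and the env-file path are assumptions, not part of this PR:

```makefile
# Hypothetical excerpt from package/vllm/vllm.mk: write the build-time
# option into an environment file that the vLLM service sources at runtime.
define VLLM_INSTALL_RUNTIME_ENV
	mkdir -p $(TARGET_DIR)/etc/vllm
	echo "VLLM_GPU_MEMORY_UTILIZATION=$(call qstrip,$(BR2_PACKAGE_VLLM_GPU_MEMORY_UTILIZATION))" \
		>> $(TARGET_DIR)/etc/vllm/vllm.env
endef
VLLM_POST_INSTALL_TARGET_HOOKS += VLLM_INSTALL_RUNTIME_ENV
```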
```yaml
  - pip install vllm
  - mkdir -p /var/lib/vllm/models
  # Download from a private HuggingFace registry (requires token for gated models)
  - |
```
The hardcoded token `your-token-here` should include a warning or note that this is a placeholder and should not be committed to version control or exposed in production configurations. Consider adding a comment or note that recommends using environment variables or secure secret management for tokens in actual deployments.
Suggested change:

```diff
 - |
+  # NOTE: This is a placeholder token for documentation purposes only.
+  # Do NOT hardcode real tokens in scripts or configs; in production,
+  # pass HF_TOKEN via environment variables or a secure secret manager.
```
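A hedged cloud-init sketch of the recommended pattern, assuming the operator injects the real token out of band at deploy time (the file path `/etc/cube/hf-token.env` and the model name are illustrative):

```yaml
#cloud-config
write_files:
  # Token file is root-only; the real value is supplied at deploy time,
  # never committed to version control.
  - path: /etc/cube/hf-token.env
    permissions: "0600"
    content: |
      HF_TOKEN=REPLACE_AT_DEPLOY_TIME
runcmd:
  - >
    . /etc/cube/hf-token.env && export HF_TOKEN &&
    huggingface-cli download meta-llama/Llama-2-7b-hf
    --local-dir /var/lib/vllm/models
```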
> :::note
> For Ubuntu cloud-init CVMs, the default SSH user is `ultraviolet` (password: `password`). For Buildroot CVMs, the default user is `root`.
The default password "password" is documented here for development/testing purposes. While this is acceptable for documentation, consider adding a security warning that this default password should be changed in production environments. This is particularly important since the password is exposed in plaintext and the SSH port is forwarded to the host.
Suggested change:

```diff
-For Ubuntu cloud-init CVMs, the default SSH user is `ultraviolet` (password: `password`). For Buildroot CVMs, the default user is `root`.
+For Ubuntu cloud-init CVMs, the default SSH user is `ultraviolet` (password: `password`). For Buildroot CVMs, the default user is `root`. The `password` credential is intended **only** for local development and testing; on any production or network-accessible deployment, you **must** change this password immediately (or disable password login and use SSH keys), especially since the SSH port is forwarded from the host to the CVM.
```
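If the warning is adopted, the hardening step itself is easy to show as well. A minimal cloud-init sketch, assuming key-based login is acceptable for the deployment (the key string is a placeholder):

```yaml
#cloud-config
# Disable password authentication entirely and rely on SSH keys.
ssh_pwauth: false
users:
  - name: ultraviolet
    lock_passwd: true
    ssh_authorized_keys:
      - ssh-ed25519 AAAA...placeholder-public-key... operator@workstation
```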
> ### Port Reference
>
> CVM network access uses QEMU user-mode port forwarding. The following host-to-guest port mappings are configured in the QEMU launch scripts (`hal/buildroot/qemu.sh` and `hal/ubuntu/qemu.sh`):
The path references `hal/buildroot/qemu.sh` and `hal/ubuntu/qemu.sh` here, but the existing CVM Management documentation (`cvm-management.md`) consistently refers to `hal/cloud/qemu.sh`. Please verify that these paths are correct and consistent with the actual repository structure. If the directory structure has changed, the CVM Management documentation should also be updated for consistency.
Suggested change:

```diff
-CVM network access uses QEMU user-mode port forwarding. The following host-to-guest port mappings are configured in the QEMU launch scripts (`hal/buildroot/qemu.sh` and `hal/ubuntu/qemu.sh`):
+CVM network access uses QEMU user-mode port forwarding. The following host-to-guest port mappings are configured in the QEMU launch script (`hal/cloud/qemu.sh`):
```
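Whichever script path is correct, the mappings themselves come from QEMU's user-mode networking `hostfwd` rules. A hypothetical excerpt of such a launch command (the guest-side Cube Agent port is a placeholder, since this hunk only states the host-side ports 6190→SSH and 6193→Cube Agent):

```shell
# Hypothetical fragment of the QEMU launch script: forward host ports
# 6190 and 6193 into the guest over user-mode networking.
qemu-system-x86_64 \
  -netdev user,id=net0,hostfwd=tcp::6190-:22,hostfwd=tcp::6193-:<agent-port> \
  -device virtio-net-pci,netdev=net0 \
  ...
```

With that in place, `ssh -p 6190 ultraviolet@localhost` on the host reaches the guest's SSH daemon on port 22.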
Signed-off-by: WashingtonKK <washingtonkigan@gmail.com>
What type of PR is this?
This is a docs update
What does this do?
Updates docs on deploying custom models
Which issue(s) does this PR fix/relate to?
Have you included tests for your changes?
Did you document any new/modified features?
Notes