Draft
50 commits
d719660
Add terragrunt unit to create VMs for a MAAS environment
freyes Sep 2, 2025
a8d1863
Add private directory.
freyes Sep 10, 2025
ce16536
gitignore: add .terragrunt-cache
freyes Sep 10, 2025
d4bd1f8
Generate maas provider configuration.
freyes Sep 10, 2025
55790bb
virtualnodes: move output definitions to their own file
freyes Sep 10, 2025
4180db0
virtualnodes: make network definitions to autostart
freyes Sep 10, 2025
532cb5b
virtualnodes: move maas controller hostname to a variable
freyes Sep 10, 2025
de31227
virtualnodes: add description to libvirt_uri variable
freyes Sep 10, 2025
4fedc19
virtualnodes: install maas from 3.6/stable instead of edge
freyes Sep 10, 2025
2974a8b
virtualnodes: comment out maas configuration in cloud-init
freyes Sep 10, 2025
4b935f5
maas: setup maas
freyes Sep 10, 2025
4107772
Add stack.hcl to put together the units in a single stack
freyes Sep 10, 2025
5ec091f
maas: mock the MAAS API Key in a compatible format
freyes Sep 10, 2025
c34f894
maas: import fabric, vlan and subnet for the generic network
freyes Sep 10, 2025
7729c13
maas: Configure DHCP in generic subnet
freyes Sep 11, 2025
1fa8ad1
Add helper scripts
freyes Sep 11, 2025
180cb5b
Move snap testing script to a standalone file
freyes Sep 11, 2025
483e747
Add functional-test-maas to build-snap workflow
freyes Sep 11, 2025
34d7a1e
testing: Validate number of arguments passed to test-standalone.sh
freyes Oct 3, 2025
c355bab
testing: Add test-multinode-maas.sh
freyes Oct 3, 2025
362a864
Move testlinger job definition to its own file
freyes Oct 3, 2025
254a9e3
install_deps.sh: fix terragrunt permissions
freyes Oct 3, 2025
f16999d
Make decompression verbose
freyes Oct 3, 2025
2a2f98b
Add README.md for testing/
freyes Oct 17, 2025
1c50496
testing: change the CWD to where the deploy.sh is.
freyes Oct 17, 2025
6982cda
testing: Install terraform
freyes Oct 17, 2025
0965ccf
Run sosreport to collect logs
freyes Oct 17, 2025
0b46aa4
Retry `apt-get update`
freyes Oct 17, 2025
4d991f4
Generate a password-less ssh key
freyes Oct 17, 2025
26b6f29
Run deploy.sh from a new session
freyes Oct 17, 2025
d03a67b
Set TEST_* env variables before kicking off the testing script
freyes Oct 17, 2025
21b9aaa
Call the collect-logs.sh script on failure
freyes Oct 17, 2025
0b539f4
debug: keep the job in a loop on failure
freyes Oct 17, 2025
f29031a
install_deps.sh: install genisoimage
freyes Oct 17, 2025
ff389ab
Register virtualnodes in MAAS
freyes Oct 17, 2025
07f0460
Add local-testflinger.sh
freyes Oct 17, 2025
87d552f
virtualnodes: Use a dedicated libvirt pool
freyes Oct 17, 2025
4408281
virtualnodes: block until the maas api key is available
freyes Oct 17, 2025
edd4f64
testing: Fix space mapping
freyes Oct 17, 2025
e0fa1a8
Set parallelism=1 for the maas unit
freyes Oct 20, 2025
747ab90
Update README.md
freyes Oct 20, 2025
ba3e6f2
Update virtualnodes default hardware specs
freyes Dec 17, 2025
03c04c3
Drop -parallelism=1
freyes Dec 17, 2025
e12eef5
testing: Add reserved range to maas
freyes Dec 17, 2025
6e54462
virtualnodes: rewrote to make it compatible with latest provider
freyes Dec 17, 2025
353c716
testing: make artifacts dir relative to the collect-logs.sh script
freyes Dec 17, 2025
aa4b928
testing: Block until port 22 is open on the testing instance
freyes Dec 17, 2025
a83ca56
testing: bump up output timeout to 180m
freyes Dec 17, 2025
bda71a2
testing: fix formatting in netplan template
freyes Dec 18, 2025
fdb4655
testing: Tag MAAS instances
freyes Mar 3, 2026
130 changes: 44 additions & 86 deletions .github/workflows/build-snap.yml
@@ -44,92 +44,7 @@ jobs:
name: local-${{ needs.build.outputs.snap }}
- name: test
run: |
set -x
export COLUMNS=256

# Check for docker and containerd, and remove them if they exist
sudo apt remove --purge docker.io containerd runc -y
sudo rm -rf /run/containerd

# Allow lxd controller to reach to k8s controller on loadbalancer ip
# sudo nft insert rule ip filter FORWARD tcp dport 17070 accept
# sudo nft insert rule ip filter FORWARD tcp sport 17070 accept
# With above rules, got the following error:
# api.charmhub.io on 10.152.183.182:53: server misbehaving
# Accept all packets filtered for forward
sudo nft chain ip filter FORWARD '{policy accept;}'

sudo snap remove --purge lxd
sudo snap install --channel 3.6 juju

sudo snap install ${{ needs.build.outputs.snap }} --dangerous
sudo snap connect openstack:juju-bin juju:juju-bin
openstack.sunbeam prepare-node-script --bootstrap | bash -x
sudo snap connect openstack:dot-local-share-juju
sudo snap connect openstack:dot-config-openstack
sudo snap connect openstack:dot-local-share-openstack

# Even though `--topology single --database single` is not used in the
# single-node tutorial, explicitly specify it here to force the single
# mysql mode.
# The tutorial assumes ~16 GiB of memory, where Sunbeam selects the single
# mysql mode automatically. Self-hosted runners may have more than 32 GiB
# of memory, where Sunbeam selects the multi mysql mode instead. So we
# have to override Sunbeam's decision to stay closer to the tutorial
# scenario.
sg snap_daemon "openstack.sunbeam cluster bootstrap --manifest .github/assets/testing/edge.yml --accept-defaults --topology single --database single"
sg snap_daemon "openstack.sunbeam cluster list"
# Note: Moving configure before enabling caas just to ensure caas images are not downloaded
# To download caas image, require ports to open on firewall to access fedora images.
sg snap_daemon "openstack.sunbeam configure --accept-defaults --openrc demo-openrc"
sg snap_daemon "openstack.sunbeam launch --name test"
# The cloud-init process inside the VM takes ~2 minutes to bring up the
# SSH service after the VM gets ACTIVE in OpenStack
sleep 300
source demo-openrc
openstack console log show --lines 200 test
demo_floating_ip="$(openstack floating ip list -c 'Floating IP Address' -f value | head -n1)"
ssh -oStrictHostKeyChecking=no -oUserKnownHostsFile=/dev/null -i ~/snap/openstack/current/sunbeam "ubuntu@${demo_floating_ip}" true

sg snap_daemon "openstack.sunbeam enable orchestration"
sg snap_daemon "openstack.sunbeam enable loadbalancer"
sg snap_daemon "openstack.sunbeam enable dns testing.github."
# Disabled until https://github.com/canonical/mysql-router-k8s-operator/issues/452
# or corresponding juju bug is fixed
# sg snap_daemon "openstack.sunbeam disable dns"
# sg snap_daemon "openstack.sunbeam disable loadbalancer"
# sg snap_daemon "openstack.sunbeam disable orchestration"

# Vault has storage requirements > 15G
# Commenting as CI servers might not have enough disk space
# sg snap_daemon "openstack.sunbeam enable vault --dev-mode"
# sg snap_daemon "openstack.sunbeam enable secrets"
# sg snap_daemon "openstack.sunbeam disable secrets"
# sg snap_daemon "openstack.sunbeam disable vault"

# Disable caas temporarily while MySQL memory gets adjusted
# sg snap_daemon "openstack.sunbeam enable caas"
# sg snap_daemon "openstack.sunbeam enable validation"
# If the smoke tests fail, logs should be collected via the sunbeam command in "Collect logs"
# sg snap_daemon "openstack.sunbeam validation run smoke"
# sg snap_daemon "openstack.sunbeam validation run --output tempest_validation.log"
# sg snap_daemon "openstack.sunbeam disable caas"
# sg snap_daemon "openstack.sunbeam disable validation"

sg snap_daemon "openstack.sunbeam enable telemetry"
# Commenting observability as storage requirements ~6G
# sg snap_daemon "openstack.sunbeam enable observability embedded"
# Commented disabling observability due to LP#1998282
# sg snap_daemon "openstack.sunbeam disable observability embedded"
# sg snap_daemon "openstack.sunbeam disable telemetry"

# Commenting features as storage is full in CI machines
# sg snap_daemon "openstack.sunbeam enable resource-optimization"
# sg snap_daemon "openstack.sunbeam enable instance-recovery"
# Disable IR as the consul pods are stuck in getting terminated
# sg snap_daemon "openstack.sunbeam disable instance-recovery"
# sg snap_daemon "openstack.sunbeam disable resource-optimization"

./testing/test-standalone.sh local-${{ needs.build.outputs.snap }}
- name: Collect logs
if: always()
run: |
@@ -167,3 +82,46 @@ jobs:
- name: Setup tmate session
if: ${{ failure() && runner.debug }}
uses: canonical/action-tmate@main
functional-test-maas:
needs: build
name: Functional test on MAAS
runs-on: [self-hosted, self-hosted-linux-amd64-noble-private-endpoint-medium]
env:
TESTFLINGER_DIR: .github/workflows/testflinger
steps:
- name: Checkout
uses: actions/checkout@v5
with:
path: repository
- name: Download snap artifact
id: download
uses: actions/download-artifact@v5
with:
name: local-${{ needs.build.outputs.snap }}
path: repository
- name: Pack the repository
run: |
tar acf repository.tar.gz repository/
- name: Create Testflinger job
env:
OPENSTACK_SNAP_PATH: local-${{ needs.build.outputs.snap }}
run: |
# Prepare job
envsubst '$OPENSTACK_SNAP_PATH' \
< $TESTFLINGER_DIR/job.yaml.tpl \
> $TESTFLINGER_DIR/job.yaml
- name: Submit job
uses: canonical/testflinger/.github/actions/submit@main
with:
poll: true
job-path: ${{ env.TESTFLINGER_DIR }}/job.yaml
- name: Upload logs
if: always()
uses: actions/upload-artifact@v4
with:
name: sunbeam_logs
path: logs
retention-days: 30
- name: Setup tmate session
if: ${{ failure() && runner.debug }}
uses: canonical/action-tmate@main
77 changes: 77 additions & 0 deletions .github/workflows/testflinger/job.yaml.tpl
@@ -0,0 +1,77 @@
# -*- mode: yaml -*-
job_queue: openstack
provision_data:
distro: noble
global_timeout: 14400 # 4 hours
output_timeout: 10800 # 180 min
test_data:
attachments:
- local: repository.tar.gz
test_cmds: |
set -ex
scp ./attachments/test/repository.tar.gz "ubuntu@${DEVICE_IP}:"
if ssh "ubuntu@${DEVICE_IP}" '
set -ex
ssh-import-id lp:freyes
timeout_loop () {
local TIMEOUT=90
while [ "$TIMEOUT" -gt 0 ]; do
if "$@" > /dev/null 2>&1; then
echo "OK"
return 0
fi
TIMEOUT=$((TIMEOUT - 1))
sleep 1
done
echo "ERROR: $* FAILED"
ret=1
return 1
}
# http://pad.lv/2093303
sudo mv -v /etc/apt/sources.list{,.bak}
# Workaround for:
# E: Failed to fetch http://... Hash Sum mismatch
timeout_loop sudo apt-get update -q

# include ~/.local/bin in PATH
source ~/.profile
set -o pipefail
# LP: #2097451
# LP: #2102175
tar xzvf repository.tar.gz
cd repository/testing/

# generate passwordless key if needed
test -f ~/.ssh/passwordless || ssh-keygen -b 2048 -t rsa -f ~/.ssh/passwordless -q -N ""

# Allow ssh connections to the virtual nodes without having host fingerprint issues.
echo "Host 172.16.1.* 172.16.2.*" >> ~/.ssh/config
echo " UserKnownHostsFile /dev/null" >> ~/.ssh/config
echo " StrictHostKeyChecking no" >> ~/.ssh/config

# Install dependencies in the hypervisor.
./install_deps.sh

# Prepare the testing bed running terragrunt
# make the libvirt group effective in this shell, so terraform can talk to the libvirt unix socket
sudo su - ubuntu -c $(realpath ./deploy.sh)
cd ../

# Start the testing using the previously prepared test bed.
export TEST_SNAP_OPENSTACK=${OPENSTACK_SNAP_PATH}
export TEST_MAAS_API_KEY="$(cat /tmp/maas-api.key)"
export TEST_MAAS_URL="http://172.16.1.2:5240/MAAS"
./testing/test-multinode-maas.sh ${OPENSTACK_SNAP_PATH}
'; then
scp -r "ubuntu@${DEVICE_IP}:repository/artifacts/" artifacts/ || true
find artifacts/
else
echo "blocking until file /tmp/.continue shows up in ${DEVICE_IP}"
echo ssh ubuntu@${DEVICE_IP}
ssh ubuntu@${DEVICE_IP} "until test -f /tmp/.continue; do sleep 10;done"

ssh ubuntu@${DEVICE_IP} /home/ubuntu/repository/testing/collect-logs.sh
scp -r "ubuntu@${DEVICE_IP}:repository/artifacts/" artifacts/ || true
find artifacts/
exit 1
fi
2 changes: 2 additions & 0 deletions .gitignore
@@ -161,3 +161,5 @@ cython_debug/
.vscode/

.stestr/

.github/workflows/testflinger/*.yaml
5 changes: 5 additions & 0 deletions testing/.gitignore
@@ -0,0 +1,5 @@
*.tfstate
*.tfstate.backup
*.lock.hcl
.terraform/
.terragrunt-cache/
33 changes: 33 additions & 0 deletions testing/README.md
@@ -0,0 +1,33 @@
# Testflinger Testing

## Local Development/Testing

To run the tests locally, use the `./testing/local-testflinger.sh` script.

Usage example:

1. Install the testflinger-cli snap: `sudo snap install testflinger-cli`.
2. Make sure there is a copy of the openstack snap at the top level of the git
repo. Either download it with `snap download` or build it with `snapcraft pack`.

``` sh
snap download --channel 2024.1/edge openstack
```

``` sh
snapcraft pack --use-lxd
```

3. Run `./testing/local-testflinger.sh`.
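
The prerequisites above can be checked before submitting a job. The following is a minimal sketch, not part of the repository: the function name `find_openstack_snap` is hypothetical, and it only mirrors the `openstack_*.snap` glob that `local-testflinger.sh` relies on, failing early when zero or several snaps match.

```sh
#!/bin/bash
# Hypothetical helper (not in the repo): print the single openstack snap
# found in a directory, or fail when there is none or more than one.
find_openstack_snap() {
    local dir="${1:-.}"
    local snaps=()
    local f
    for f in "$dir"/openstack_*.snap; do
        # Without nullglob an unmatched pattern stays literal, so only
        # keep entries that really exist.
        if [ -e "$f" ]; then
            snaps+=("$f")
        fi
    done
    if [ "${#snaps[@]}" -ne 1 ]; then
        echo "expected exactly one openstack_*.snap in $dir, found ${#snaps[@]}" >&2
        return 1
    fi
    printf '%s\n' "${snaps[0]}"
}
```

One possible use is `find_openstack_snap . && ./testing/local-testflinger.sh`, run from the top level of the repo, so that an ambiguous or missing snap is reported before anything is packed or submitted.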

## TODO

* [ ] Expose a knob to set the log level of terragrunt/terraform.
* [ ] Redirect libvirt instances' console to a log file
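
One way the log-level knob could look is sketched below. This is an assumption, not an implemented feature: the `TEST_DEBUG` variable and the `configure_log_level` function are hypothetical names. The two exported variables are the ones `deploy.sh` currently keeps commented out.

```sh
#!/bin/bash
# Hypothetical knob: TEST_DEBUG is not defined anywhere in the repository.
# When it is "1", export the verbose log settings that deploy.sh keeps
# commented out; otherwise leave the environment untouched.
configure_log_level() {
    if [ "${TEST_DEBUG:-0}" = "1" ]; then
        export TERRAGRUNT_LOG_LEVEL=trace
        export TF_LOG=TRACE
    fi
}
```

Calling `configure_log_level` just before the `terragrunt` invocation in `deploy.sh` would keep the default output quiet while still allowing `TEST_DEBUG=1 ./testing/deploy.sh` for troubleshooting.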


## Known Issues

* When a libvirt instance PXE boots, it sometimes fails to boot and simply
times out, which makes the whole deployment time out or fail when
terraform's apply times out.
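
A possible mitigation for the issue above is to retry the apply a bounded number of times. The sketch below is untested against this setup: the `retry` helper is hypothetical, and whether a second apply actually recovers a node stuck in PXE boot is an assumption.

```sh
#!/bin/bash
# Hypothetical helper (not in the repo): run a command up to N times,
# sleeping DELAY seconds between failed attempts.
retry() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        echo "attempt $i/$attempts failed: $*" >&2
        if ((i < attempts)); then
            sleep "$delay"
        fi
    done
    return 1
}

# Example (not executed here):
#   retry 3 30 terragrunt --non-interactive run-all apply
```

Bounding the attempts keeps a real failure visible instead of looping forever, and the delay gives libvirt and MAAS time to settle between runs.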
13 changes: 13 additions & 0 deletions testing/collect-logs.sh
@@ -0,0 +1,13 @@
#!/bin/bash -ux

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
ARTIFACTS_DIR=$(realpath $SCRIPT_DIR/../artifacts)
mkdir -p $ARTIFACTS_DIR

## Collect relevant files (if possible)
sudo mkdir -p /tmp/sosreport/
sudo sosreport -a --batch --label hypervisor --all-logs --tmp-dir=/tmp/sosreport/
sudo mv /tmp/sosreport/* $ARTIFACTS_DIR
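# Make the generated report files (not just the directory) readable so the scp below can fetch them.
ssh -i ~/.ssh/passwordless ubuntu@172.16.1.2 "sudo mkdir -p /tmp/sosreport; sudo sosreport -a --batch --label maas-controller --all-logs --tmp-dir=/tmp/sosreport/; sudo chmod -R +r /tmp/sosreport/"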
scp -i ~/.ssh/passwordless ubuntu@172.16.1.2:"/tmp/sosreport/*" $ARTIFACTS_DIR
sudo chmod +r -R $ARTIFACTS_DIR
9 changes: 9 additions & 0 deletions testing/deploy.sh
@@ -0,0 +1,9 @@
#!/bin/bash -exu

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

pushd $SCRIPT_DIR

# export TERRAGRUNT_LOG_LEVEL=trace
# export TF_LOG=TRACE
terragrunt --non-interactive run-all apply
52 changes: 52 additions & 0 deletions testing/install_deps.sh
@@ -0,0 +1,52 @@
#!/bin/bash -x

# install terragrunt only if it is not already in PATH
if ! command -v terragrunt >/dev/null 2>&1; then
sudo wget -O /usr/local/bin/terragrunt https://github.com/gruntwork-io/terragrunt/releases/download/v0.87.1/terragrunt_linux_amd64
sudo chmod +x /usr/local/bin/terragrunt
fi

# install opentofu
if ! command -v tofu >/dev/null 2>&1; then
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://get.opentofu.org/opentofu.gpg | sudo tee /etc/apt/keyrings/opentofu.gpg >/dev/null
curl -fsSL https://packages.opentofu.org/opentofu/tofu/gpgkey | sudo gpg --no-tty --batch --dearmor -o /etc/apt/keyrings/opentofu-repo.gpg >/dev/null
sudo chmod a+r /etc/apt/keyrings/opentofu.gpg /etc/apt/keyrings/opentofu-repo.gpg
echo \
"deb [signed-by=/etc/apt/keyrings/opentofu.gpg,/etc/apt/keyrings/opentofu-repo.gpg] https://packages.opentofu.org/opentofu/tofu/any/ any main
deb-src [signed-by=/etc/apt/keyrings/opentofu.gpg,/etc/apt/keyrings/opentofu-repo.gpg] https://packages.opentofu.org/opentofu/tofu/any/ any main" | \
sudo tee /etc/apt/sources.list.d/opentofu.list > /dev/null

sudo chmod a+r /etc/apt/sources.list.d/opentofu.list
sudo apt-get update
sudo apt-get install -y -qq tofu
fi

# install terraform
if ! command -v terraform >/dev/null 2>&1; then
sudo apt-get update -q
sudo apt-get install -yq gnupg software-properties-common
wget -O- https://apt.releases.hashicorp.com/gpg | \
gpg --dearmor | \
sudo tee /usr/share/keyrings/hashicorp-archive-keyring.gpg > /dev/null
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(grep -oP '(?<=UBUNTU_CODENAME=).*' /etc/os-release || lsb_release -cs) main" | \
sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt-get update -q
sudo apt-get install -yq terraform
fi

# install libvirt and the helper packages only if virsh is not already in PATH
if ! command -v virsh >/dev/null 2>&1; then
sudo apt-get install -y -qq \
sosreport \
libvirt-daemon \
libvirt-daemon-driver-qemu \
libvirt-daemon-system \
libvirt-clients \
genisoimage

# allow a non-root user to use libvirt/virsh easily with no permission issues
sudo sed -i '/^security_driver/d' /etc/libvirt/qemu.conf
echo 'security_driver = "none"' | sudo tee -a /etc/libvirt/qemu.conf >/dev/null
sudo systemctl restart libvirtd
fi
21 changes: 21 additions & 0 deletions testing/local-testflinger.sh
@@ -0,0 +1,21 @@
#!/bin/bash -ex

TMP_DIR=$(mktemp -d)
trap "rm -rf $TMP_DIR; echo 'Cleaned up temporary file.'" EXIT
cp -rf ../snap-openstack/ $TMP_DIR/repository
pushd $TMP_DIR
tar --exclude=repository/.tox --exclude=repository/.github/workflows/testflinger/repository.tar.gz --exclude=repository/.git -acf repository.tar.gz repository/
ls -lh repository.tar.gz
popd
export TESTFLINGER_DIR=$(pwd)/.github/workflows/testflinger/
cp $TMP_DIR/repository.tar.gz $TESTFLINGER_DIR
export OPENSTACK_SNAP_PATH=$(ls openstack_*.snap)
JOB_FILE=$TESTFLINGER_DIR/job.yaml
envsubst '$OPENSTACK_SNAP_PATH' \
< $TESTFLINGER_DIR/job.yaml.tpl \
> $JOB_FILE

test -f $JOB_FILE
cd $TESTFLINGER_DIR
testflinger-cli -d submit --poll $JOB_FILE
rm -rf $TMP_DIR
Empty file added testing/private/.keep
1 change: 1 addition & 0 deletions testing/private/README.md
@@ -0,0 +1 @@
Directory to write temporary files that are private in nature, for example generated password-less ssh keys.
20 changes: 20 additions & 0 deletions testing/stack.hcl
@@ -0,0 +1,20 @@
locals {
units = {
virtualnodes = {
source = "./virtualnodes"
}

maas = {
source = "./maas"
dependencies = ["virtualnodes"]
}
}

# Stack-wide variables
stack_config = {
ssh_private_key_path = "~/.ssh/passwordless"
ssh_public_key_path = "~/.ssh/passwordless.pub"
libvirt_uri = get_env("LIBVIRT_DEFAULT_URI", "qemu:///system")
maas_hostname = "maas-controller"
}
}