Skip to content

Safer handling of env vars for the K8s PM#860

Open
PawelPlesniak wants to merge 10 commits intoprep-release/fddaq-v5.6.0from
PawelPlesniak/K8sPMHostMount
Open

Safer handling of env vars for the K8s PM#860
PawelPlesniak wants to merge 10 commits intoprep-release/fddaq-v5.6.0from
PawelPlesniak/K8sPMHostMount

Conversation

@PawelPlesniak
Copy link
Copy Markdown
Collaborator

@PawelPlesniak PawelPlesniak commented Mar 25, 2026

Description

Fixes issue #856

Handles the environment variables more agressively for code sanity and safety.
Updates the HugePages config that were used in HW testing on Wednesday

Type of change

  • New feature / enhancement
  • Optimization
  • Bug fix
  • Breaking change
  • Documentation

List of required branches from other repositories

N/A

Change log

The environment variable HOME was not being assigned to the pod. I am unsure of the source of the change given that there have not been many changes since the last HW tests, but I decided to make the requirements for the environment variables a lot more stringent, exiting if necessary and joining the logic of separated conditional statements.

HugePage configs were changed to match those that were seen using dpdk-hugepages.py -s.

Hostnames were being used incorrectly - in some configurations, the hostname has a suffix, e.g. np02-srv-005-1. To allow the configuration to have this flexibility, the hostname was formatted to always check for the -1 suffix and to remove it if it is present. Note, during testing, the suffix was still present with e.g.
image

However, to have a more complete k8s PM in time for this release, this shortcut was chosen. The definition of the host name does not have a standard or a check, but this is not within the scope of the run control, and as such will be deferred to a time when we can have this field standardized. Once that is complete, the hostname formatting command can be removed from this repository.

Suggested manual testing checklist

In production, I ran a BDE session successfully using the K8s PM. If this is being reviewed at a suitable time, the same test can be ran, otherwise we will have to rely on integration tests to continue the validation. This branch was used, and the tmux session that was used for it is still active - see np02-bde-k8s.

Developer checklist

Prior to marking this as "Ready for Review"

~~Tests ran on: WHAT HOSTNAME from release RELEASE_NAME ~~ Testing skipped as the only changes being made are to the K8s process manager. This is not the primary workflow for the current integraiton tests, and as integration testing has been skipped for the purposes of this PR.

Unit tests - some tests can't be ran on the CI. This is documented. If this PR checks a feature that can't be tested with CI, this has been marked appropriately.

Integration tests - the daqsystemtest_integtest_bundle requires a lot of resources, and connections to the EHN1 infrastructure. Check the cross referenced list if you can't run these. The developer needs to run at least the .

  • Unit tests (pytest --marker) passed
    • With relevant marker
    • Without marker
  • Integration tests passed
    • Only daqsystemtest_integtest_bundle.sh -k minimal_system_quick_test.py
    • Full daqsystemtest_integtest_bundle.sh
    • Testing skipped as k8s is not currently used in the integration tests and does not affect the primary workflow of the
  • Testing skipped as there are no core code changes in this PR, this only relates to documentation/CI workflows

Final checklist prior to marking this as "Ready for Review"

  • Code is clearly commented.
  • New unit tests have been added, or is documented in # ISSUE NUMBER
  • A suitable reviewer has been chosen from this list.

Reviewer checklist

  • This branch has been rebased with develop prior to testing.
  • Suggested manual tests show changes.
  • CI workflows fails documented (if present)
  • Integration tests passed
    • Only concern yourself if failures related to drunc are in the log files
    • If non-drunc failure appears:
      • Validate failure in fresh working area
      • Contact Pawel if unsure

Once the features are validated and both the unit and integration tests pass, the PRs is ready to be merged.

Prior to merging

Choose one of the following an complete all substeps
  • Changes only affect the Run Control, are in a single repository, and do not affect the end user.
    • Changes are documented in docstrings and code comments
    • Wiki has been updated if architectural or endpoint changes
  • Otherwise
    • Workflow changes demonstrated in the Change Log (if necessary)
    • Wiki has been updated (if necessary)
    • #daq-sw-librarians Slack channel notified (see below)

Once completed, the reviewer can merge the PR.

Notification message for #daq-sw-librarians Slack channel

For an single merge that changes the user workflow

The CCM WG has an isolated PR ready to merge that affects user workflows. The PR is:

_URL_

I will leave time for any comments, otherwise will merge these at the end of the work day _Insert your time zone_.

For co-ordinated merge

The CCM WG has a set of co-ordinated merges ready to merge. The PRs are:

_URL_

_URL_


I will leave time for any comments, otherwise will merge these at the end of the day.

@PawelPlesniak PawelPlesniak changed the base branch from develop to prep-release/fddaq-v5.6.0 March 27, 2026 15:34
@PawelPlesniak PawelPlesniak requested a review from wanyunSu March 27, 2026 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants