
Small configuration issues (slurm, chrony, docker image) #537

@Rknegt


Hi,

While continuing the TrinityX installation I came across a few issues that I wanted to let you know about. If you would prefer these to be filed as separate issues, let me know.

1. Multiple controllers using SlurmctldHost

Currently, in a two-controller setup, Slurm is configured from this template line:

SlurmctldHost={{ slurm_ctrl_list }}

which renders as:

SlurmctldHost=controller1,controller2

However, each controller should be configured on its own line:
https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost

SlurmctldHost=controller1
SlurmctldHost=controller2

I found this while debugging a DNS issue, where I saw queries for the literal hostname controller1,controller2.
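The expected expansion can be sketched in shell, for example to check what the rendered slurm.conf should contain. This is only an illustration: `expand_slurmctld_hosts` is a hypothetical helper, not something that exists in TrinityX, and it assumes the controller list is a plain comma-separated string.

```shell
# Hypothetical helper: print one SlurmctldHost line per controller.
# Assumes the input is a comma-separated list such as "controller1,controller2".
expand_slurmctld_hosts() {
    local list="$1" host
    local IFS=','
    # With IFS set to a comma, the unquoted expansion splits on commas.
    for host in $list; do
        printf 'SlurmctldHost=%s\n' "$host"
    done
}

expand_slurmctld_hosts "controller1,controller2"
```

In the template itself the same effect would presumably come from looping over the controllers instead of joining them into a single value.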

2. chrony.conf.j2 word swap

While configuring the allowed networks for chrony I came across another bug in chrony.conf.j2: the loop variable is named server, but the loop body references {{ net }}. I think one of the two names should be changed so that they match.

{% if chrony_allow_networks %}
{% for server in chrony_allow_networks %}
allow {{ net }}
{% endfor %}

3. Docker images with slash (rockylinux/rockylinux:9.7)

To install any recent Rocky Linux release (9.3+) via docker, we need to use the rockylinux/rockylinux images. The slash in the image name breaks the Ansible code: the export path then points into a folder that doesn't exist and needs to be created first.

After creating /trinity/downloads/docker-image-rockylinux, the Rocky 9.7 installation works, apart from one small issue: it tries to install the selinux-policy-minimum rpm, which is not available anymore on RHEL/Rocky 9+.

shell: "docker export {{ docker_container_id.stdout }} > {{ trix_root }}/downloads/docker-image-{{ image_download_distribution }}.tar"
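One possible way around the missing folder is to flatten the image name before building the archive path. The sketch below is an assumption about how that could look, not the project's actual fix; `image_tar_path` is a hypothetical helper, and its two arguments stand in for `trix_root` and `image_download_distribution`.

```shell
# Hypothetical helper: build the export path with slashes in the image name
# replaced, so that no intermediate folder is needed.
image_tar_path() {
    local trix_root="$1" image="$2" safe
    # "rockylinux/rockylinux" becomes "rockylinux-rockylinux".
    safe="$(printf '%s' "$image" | tr '/' '-')"
    printf '%s/downloads/docker-image-%s.tar\n' "$trix_root" "$safe"
}

image_tar_path /trinity rockylinux/rockylinux
```

The alternative is to keep the name as it is and create the parent folder with `mkdir -p` before the export, which matches the manual workaround described above.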

4. Cannot change tuned profile using tuned-adm in lchroot

I found that setting the compute performance tuned profile fails when building the default compute images, at least for Rocky 9.3:

TASK [trinity/tunables : Setting compute performance tuned profile] **************************************************************************************************************************
2026-04-22 16:24:57,022 p=1192670 u=root n=ansible | fatal: [compute.osimages.luna]: FAILED! => {"changed": true, "cmd": "tuned-adm profile hpc-compute", "delta": "0:00:00.159230", "end": "2026-04-22 16:24:56.979824", "msg": "non-zero return code", "rc": 1, "start": "2026-04-22 16:24:56.820594", "stderr": "Cannot talk to TuneD daemon via DBus. Is TuneD daemon running?\nTuneD (re)start failed, check TuneD logs for details.\nUnable to switch profile.", "stderr_lines": ["Cannot talk to TuneD daemon via DBus. Is TuneD daemon running?", "TuneD (re)start failed, check TuneD logs for details.", "Unable to switch profile."], "stdout": "Trying to (re)start tuned...", "stdout_lines": ["Trying to (re)start tuned..."]}

This could probably be solved by writing the profile name to the file directly, since no TuneD daemon is running inside the chroot:

echo "hpc-compute" > /etc/tuned/active_profile
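A slightly fuller sketch of that idea, written so it can target an image root instead of the live system. `set_tuned_profile_offline` is a hypothetical helper. Writing `manual` to `/etc/tuned/profile_mode` is my assumption about what `tuned-adm profile` records alongside the profile name; I have not verified this against the TrinityX images.

```shell
# Hypothetical helper: select a TuneD profile without talking to the daemon.
# $1 is the image root (pass an empty string when already inside the chroot),
# $2 is the profile name.
set_tuned_profile_offline() {
    local image_root="$1" profile="$2"
    mkdir -p "$image_root/etc/tuned" || return 1
    # Record the profile name in the file from the workaround above.
    printf '%s\n' "$profile" > "$image_root/etc/tuned/active_profile" || return 1
    # Assumption: mark the profile as manually selected, as tuned-adm would.
    printf 'manual\n' > "$image_root/etc/tuned/profile_mode"
}
```

Inside lchroot this would be called as `set_tuned_profile_offline "" hpc-compute`. The profile should then take effect when tuned starts on the booted node, provided the hpc-compute profile is installed in the image.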

Thanks,
Rick
