Hi,
While continuing the installation of TrinityX I came across some issues that I wanted to let you know about. If you would prefer these to be filed as separate issues, let me know.
1. Multiple controllers using SlurmctldHost
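Currently, in a two-controller setup, Slurm is configured like this (trinityX/site/roles/trinity/slurm/templates/slurm.conf.j2, line 4 in dd23d18):
SlurmctldHost={{ slurm_ctrl_list }}
which ends up in slurm.conf as:
SlurmctldHost=controller1,controller2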
However, each controller should be configured on its own line:
https://slurm.schedmd.com/slurm.conf.html#OPT_SlurmctldHost
SlurmctldHost=controller1
SlurmctldHost=controller2
I found this while debugging a DNS issue, when I saw queries being made for the literal hostname controller1,controller2.
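To illustrate the expected result, here is a minimal shell sketch that turns a comma-separated controller list into one SlurmctldHost line per controller. The function name render_slurmctld_hosts is hypothetical and only for illustration; the real fix would presumably be an equivalent loop in the slurm.conf.j2 template.

```shell
#!/bin/sh
# Hypothetical helper: print one SlurmctldHost line per controller.
# $1 is a comma-separated list, e.g. the value slurm_ctrl_list currently holds.
render_slurmctld_hosts() {
    # Split on commas, then prefix every resulting line with the option name.
    printf '%s\n' "$1" | tr ',' '\n' | sed 's/^/SlurmctldHost=/'
}

# Example with the two controllers from this report.
render_slurmctld_hosts "controller1,controller2"
```

The order of the list is preserved, so the first controller stays the first SlurmctldHost entry.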
2. chrony.conf.j2 word swap
When configuring allowed networks for chrony I came across another bug. The loop variable is named server, but the loop body references {{ net }}, so I think one of the two should be renamed to match the other.
trinityX/site/roles/trinity/chrony/templates/chrony.conf.j2, line 25 in dd23d18:
{% if chrony_allow_networks %}
{% for server in chrony_allow_networks %}
allow {{ net }}
{% endfor %}
3. Docker images with slash (rockylinux/rockylinux:9.7)
To install any recent Rocky Linux release (9.3+) via Docker, we need to use the rockylinux/rockylinux images. The issue is that the slash in the image name breaks the Ansible code: the name is used as part of the output path, so the intermediate folder does not exist and needs to be created first.
After manually creating /trinity/downloads/docker-image-rockylinux, Rocky 9.7 does work, apart from one small issue: installing the selinux-policy-minimum RPM, which is no longer available on RHEL/Rocky 9+.
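The relevant task is in trinityX/site/roles/trinity/image-download/tasks/main.yml (line 59 in dd23d18):
shell: "docker export {{ docker_container_id.stdout }} > {{ trix_root }}/downloads/docker-image-{{ image_download_distribution }}.tar"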
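One possible way around this is to sanitize the image name before using it in the file name. The sketch below is only an illustration under assumptions: the function name image_tarball_path is hypothetical, and I am assuming image_download_distribution holds the value rockylinux/rockylinux in this case.

```shell
#!/bin/sh
# Hypothetical helper: build the tarball path for an exported image.
# $1 = root directory (trix_root), $2 = image name, which may contain slashes.
image_tarball_path() {
    root="$1"
    image="$2"
    # Replace every slash so the image name cannot introduce a subdirectory.
    safe="$(printf '%s' "$image" | tr '/' '-')"
    printf '%s/downloads/docker-image-%s.tar\n' "$root" "$safe"
}

# Example with the values from this report (the image name is an assumption).
image_tarball_path "/trinity" "rockylinux/rockylinux"
```

An alternative that keeps the current path layout would be to create the parent directory of the target file (for example with mkdir -p) before running docker export.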
4. Cannot change tuned profile using tuned-adm in lchroot
I found that setting the compute performance tuned profile fails while building the default compute image, at least for Rocky 9.3:
TASK [trinity/tunables : Setting compute performance tuned profile] **************************************************************************************************************************
2026-04-22 16:24:57,022 p=1192670 u=root n=ansible | fatal: [compute.osimages.luna]: FAILED! => {"changed": true, "cmd": "tuned-adm profile hpc-compute", "delta": "0:00:00.159230", "end": "2026-04-22 16:24:56.979824", "msg": "non-zero return code", "rc": 1, "start": "2026-04-22 16:24:56.820594", "stderr": "Cannot talk to TuneD daemon via DBus. Is TuneD daemon running?\nTuneD (re)start failed, check TuneD logs for details.\nUnable to switch profile.", "stderr_lines": ["Cannot talk to TuneD daemon via DBus. Is TuneD daemon running?", "TuneD (re)start failed, check TuneD logs for details.", "Unable to switch profile."], "stdout": "Trying to (re)start tuned...", "stdout_lines": ["Trying to (re)start tuned..."]}
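This could probably be solved by writing the profile name directly to the active profile file instead of calling tuned-adm, since no TuneD daemon is running inside the chroot: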
echo "hpc-compute" > /etc/tuned/active_profile
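A slightly more complete sketch of the same idea, as a function that can be pointed at any configuration directory. The function name set_tuned_profile_offline is hypothetical. Writing profile_mode is an assumption on my side: as far as I know, TuneD keeps the manual/auto selection mode in that file next to active_profile, but this should be checked against the TuneD version in the image.

```shell
#!/bin/sh
# Hypothetical helper: select a TuneD profile without talking to the daemon.
# $1 = profile name, $2 = TuneD config directory (defaults to /etc/tuned).
set_tuned_profile_offline() {
    profile="$1"
    tuned_dir="${2:-/etc/tuned}"
    mkdir -p "$tuned_dir"
    # Same effect as the echo above: record the profile to apply at start-up.
    printf '%s\n' "$profile" > "$tuned_dir/active_profile"
    # Assumption: mark the profile as manually chosen so it is not auto-selected.
    printf 'manual\n' > "$tuned_dir/profile_mode"
}

# Usage inside the image chroot would be:
#   set_tuned_profile_offline hpc-compute
```

The profile would then be applied when the TuneD service starts on the booted node, rather than during the image build.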
Thanks,
Rick