unixPB: Ensure en_US.UTF-8 locale is generated on CentOS for build images #4275

Open
sxa wants to merge 1 commit into adoptium:master from sxa:local_enUS

@sxa sxa commented Mar 2, 2026

This is a potential fix for #3576 and resolves a difference between the x64 image and the aarch64 one. Note that with this change the configure error is suppressed when LC_ALL=en_US.utf8 is in the environment (although a warning is generated when you set that variable):

bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8): No such file or directory
bash: warning: setlocale: LC_ALL: cannot change locale (en_US.utf8): No such file or directory

…ages

Signed-off-by: Stewart X Addison <sxa@ibm.com>
# Skipping linting as locale_gen module isn't usable on CentOS6/7
# https://github.com/ansible/ansible/issues/44708
- name: Create US UTF-8 locale
  shell: localedef -i en_US -c -f UTF-8 en_US.UTF-8
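
For anyone wanting to try the same step by hand, here is a minimal sketch of the `localedef` call wrapped in an existence check so that re-running it is a no-op. It is not part of this PR: the function names (`normalize_locale`, `locale_in_list`, `ensure_en_us_utf8`) are hypothetical, and it assumes `locale -a` may print the codeset as `utf8` rather than `UTF-8`, so both sides are normalised before comparing.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the playbooks.

# Lower-case a locale name and drop dashes so that "en_US.UTF-8" and
# "en_US.utf8" compare as equal.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Succeeds if the locale named in $1 appears in the list read from stdin
# (one locale per line, in the style of `locale -a`).
locale_in_list() {
    want=$(normalize_locale "$1")
    while IFS= read -r have; do
        [ "$(normalize_locale "$have")" = "$want" ] && return 0
    done
    return 1
}

# Only call localedef when en_US.UTF-8 is missing. Needs root to write the
# locale archive; the localedef arguments are the ones used in this PR.
ensure_en_us_utf8() {
    if locale -a 2>/dev/null | locale_in_list en_US.UTF-8; then
        echo "en_US.UTF-8 already present"
    else
        localedef -i en_US -c -f UTF-8 en_US.UTF-8
    fi
}
```

Calling `ensure_en_us_utf8` as root on the build image would then create the locale only on the first run.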

@andrew-m-leonard andrew-m-leonard Mar 2, 2026

I note that the OpenJDK make first checks for C.utf8 (ref: https://github.com/adoptium/jdk25u/blob/30a962c1ab3ef19159c3ea2179602289e7e6f3ac/make/autoconf/basic.m4#L139)
and then falls back to en_US.UTF-8
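
To make that preference order concrete, here is a rough sketch of what it means for a given machine. This is not the logic from `basic.m4`: `pick_build_locale` is a hypothetical name, and the final fallback to plain `C` is an assumption of the sketch rather than something stated in this thread.

```shell
#!/bin/sh
# Illustrative only: reads a `locale -a` style list on stdin and prints the
# first preferred locale that is present (C.UTF-8, then en_US.UTF-8).
pick_build_locale() {
    # Normalise the available names: lower-case, no dashes.
    available=$(tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    for candidate in C.UTF-8 en_US.UTF-8; do
        key=$(printf '%s\n' "$candidate" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
        if printf '%s\n' "$available" | grep -qxF "$key"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Assumed last resort for this sketch.
    printf 'C\n'
}
```

Running `locale -a | pick_build_locale` on each build image would show which of the two locales that image can actually offer.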

Should we alias to "C.utf8"?

Member Author

Should we alias to "C.utf8"?

If that's possible then sure :-)

Also bear in mind that C.utf8 is not in the locales for the Linux/x64 image and we should probably strive for compatibility one way or another, and so we should be careful with testing if we're going to change such a default everywhere.

Contributor

jdk-25+ will need C.UTF-8, so I think we need to discuss upgrading to CentOS 8|9, at least for jdk-25+?
@steelhead31 fyi

@sxa sxa changed the title from "unixPB: Ensure en_US.UTF-8 locale is generated on CentOS for build ages" to "unixPB: Ensure en_US.UTF-8 locale is generated on CentOS for build images" Mar 2, 2026
@andrew-m-leonard

Reproducibility across OS distributions may be a consideration here? See #4284

andrew-m-leonard commented Mar 6, 2026

From the reproducible build tests I ran this week, I've confirmed that jdk-23+ builds need to use the OpenJDK-recommended locale, C.UTF-8. Because it is "language neutral", it should reproduce Japanese resource files and the like identically across OS distributions.

So I suggest we ensure the playbooks generate both C.UTF-8 and en_US.UTF-8.
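
One way to verify that on a finished image is a small check that lists which of the required locales are missing. The sketch below is not from the playbooks: `missing_locales` is a hypothetical name, and it only reports gaps. It deliberately does not try to generate C.UTF-8, since whether that is possible on a given CentOS release is the open question in this thread.

```shell
#!/bin/sh
# Hypothetical check: print each required locale (given as arguments) that
# is absent from the `locale -a` style list supplied on stdin.
missing_locales() {
    # Normalise the available names: lower-case, no dashes.
    available=$(tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    for required in "$@"; do
        key=$(printf '%s\n' "$required" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
        printf '%s\n' "$available" | grep -qxF "$key" || printf '%s\n' "$required"
    done
}
```

For example, `locale -a | missing_locales C.UTF-8 en_US.UTF-8` prints nothing when both locales are present, so a non-empty result could be used to fail an image build.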

I think ultimately this probably means we need to upgrade to CentOS 8|9?

sxa commented Mar 6, 2026

I think ultimately this probably means we need to upgrade to CentOS 8|9?

As I've said elsewhere, I'm generally in favour of this irrespective of this particular problem (the only negative being that we would have to maintain multiple build images), but I would choose at least 9 and preferably 10.

We could also consider using the later GCC and devkits with earlier Temurin releases in order to standardise on a newer base image (proposed in adoptium/temurin-build#4209). As per some of the comments there, jdk8u puts up a bit of a fight, but we might still be able to make it work.

We could also use this as an opportunity to standardise on always pulling the Linux images from ghcr.io instead of dockerhub.
