
YARN-11954. ContainerLaunch should use system credentials for updated delegation tokens #8377

Open
JHSUYU wants to merge 1 commit into apache:trunk from JHSUYU:fix-container-launch-stale-credentials

JHSUYU commented Mar 25, 2026

Jira: YARN-11954

Description

When the RM's DelegationTokenRenewer replaces a delegation token that has reached its max lifetime, the new token is pushed to NodeManagers through the heartbeat response as systemCredentials.

On the NM side, ResourceLocalizationService.writeCredentials() correctly checks NMContext.getSystemCredentialsForApps() and uses the updated token when writing the localizer's credential file.
However, ContainerLaunch does not: it writes the container's token file using only container.getCredentials(), which still holds the original (now invalid) token.

This means that for long-running applications (running beyond delegation.token.max-lifetime, default 7 days), newly launched or re-initialized containers may receive stale delegation tokens and fail with authentication errors when accessing external services such as HDFS.

Fix

Add getEffectiveCredentials() in ContainerLaunch that merges system credentials into the container credentials before writing the token file, mirroring the localizer's behavior.
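
The merge described above can be sketched roughly as follows. This is a simplified, self-contained model and not the code in the patch: credentials are modelled as plain alias-to-token maps instead of Hadoop's Credentials class, the names `EffectiveCredentialsSketch` and `effectiveCredentials` are hypothetical, and letting system credentials win on an alias collision is an assumption drawn from the description (a renewed token has to replace the stale one).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the merge step; NOT the actual ContainerLaunch code.
public class EffectiveCredentialsSketch {

  /**
   * Returns a new map holding the container's credentials, overlaid with the
   * application's system credentials when the NodeManager has received any.
   * Neither input map is modified.
   */
  public static Map<String, String> effectiveCredentials(
      Map<String, String> containerCredentials,
      Map<String, String> systemCredentialsForApp) {
    // Start from a copy so the container's original credentials stay untouched.
    Map<String, String> effective = new HashMap<>(containerCredentials);
    if (systemCredentialsForApp != null) {
      // Assumption: tokens pushed by the RM override same-alias container tokens.
      effective.putAll(systemCredentialsForApp);
    }
    return effective;
  }

  public static void main(String[] args) {
    Map<String, String> container = new HashMap<>();
    container.put("hdfs-delegation", "token-v1"); // original, now expired
    container.put("app-secret", "s3cr3t");        // unrelated entry, must survive

    Map<String, String> system = new HashMap<>();
    system.put("hdfs-delegation", "token-v2");    // replacement pushed via heartbeat

    Map<String, String> effective = effectiveCredentials(container, system);
    System.out.println("hdfs-delegation = " + effective.get("hdfs-delegation"));
    System.out.println("app-secret      = " + effective.get("app-secret"));
  }
}
```

Working on a copy keeps the container's stored credentials unchanged, so in this sketch the overlay is recomputed on every launch or re-initialization and picks up whatever system credentials are current at that moment.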

How was this patch tested?

  • Added testContainerLaunchUsesSystemCredentials in TestContainerLaunch

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 19m 55s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
| | | | | _ trunk Compile Tests _ |
| +1 💚 | mvninstall | 42m 39s | | trunk passed |
| +1 💚 | compile | 1m 38s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | compile | 1m 38s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | checkstyle | 1m 8s | | trunk passed |
| +1 💚 | mvnsite | 1m 14s | | trunk passed |
| +1 💚 | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 1m 8s | | trunk passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 1m 58s | | trunk passed |
| +1 💚 | shadedclient | 29m 3s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 1m 9s | | the patch passed |
| +1 💚 | compile | 1m 6s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javac | 1m 6s | | the patch passed |
| +1 💚 | compile | 1m 8s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | javac | 1m 8s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 35s | | the patch passed |
| +1 💚 | mvnsite | 0m 41s | | the patch passed |
| +1 💚 | javadoc | 0m 35s | | the patch passed with JDK Ubuntu-21.0.10+7-Ubuntu-124.04 |
| +1 💚 | javadoc | 0m 34s | | the patch passed with JDK Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| +1 💚 | spotbugs | 1m 38s | | the patch passed |
| +1 💚 | shadedclient | 28m 10s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 26m 54s | | hadoop-yarn-server-nodemanager in the patch passed. |
| +1 💚 | asflicense | 0m 38s | | The patch does not generate ASF License warnings. |
| | | 165m 52s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.54 ServerAPI=1.54 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8377/1/artifact/out/Dockerfile |
| GITHUB PR | #8377 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux badcd596bab1 5.15.0-173-generic #183-Ubuntu SMP Fri Mar 6 13:29:34 UTC 2026 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / c17d464 |
| Default Java | Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Multi-JDK versions | /usr/lib/jvm/java-21-openjdk-amd64:Ubuntu-21.0.10+7-Ubuntu-124.04 /usr/lib/jvm/java-17-openjdk-amd64:Ubuntu-17.0.18+8-Ubuntu-124.04.1 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8377/1/testReport/ |
| Max. process+thread count | 612 (vs. ulimit of 10000) |
| modules | C: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager U: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-8377/1/console |
| versions | git=2.43.0 maven=3.9.11 spotbugs=4.9.7 |
| Powered by | Apache Yetus 0.14.1 https://yetus.apache.org |

This message was automatically generated.
