HDFS-17899. Handle InvalidEncryptionKeyException in Balancer Dispatcher, SPS BlockDispatcher and DataNode DataTransfer #8383

Open
JHSUYU wants to merge 1 commit into apache:trunk from JHSUYU:HDFS-16136-group1-dispatcher

JHSUYU (Contributor) commented Mar 26, 2026

JIRA: HDFS-17899

Summary

This is a follow-up to HDFS-17897.

HDFS-17897 fixed InvalidEncryptionKeyException handling in
DFSClient read/write and striped file checksum paths. However, three other
code paths that establish SASL-encrypted connections still lack this handling:

  1. Dispatcher.PendingMove.dispatch() — Balancer block moves
  2. BlockDispatcher.moveBlock() — SPS block moves
  3. DataNode.DataTransfer.run() — DataNode block replication

When dfs.encrypt.data.transfer=true and block keys rotate, these paths fail
with InvalidEncryptionKeyException and the stale key stays cached, causing
all subsequent transfers to fail until process restart.

Fix: Add the same retry pattern to all three paths — catch the exception,
clear the cached encryption key via a new clearDataEncryptionKey() default
method on DataEncryptionKeyFactory, and retry once with a fresh key.
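
For reviewers, the retry pattern described above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code in this patch: `StaleKeyException`, `KeyFactory`, `Transfer`, `CachingKeyFactory` and `RetryOnStaleKey` are hypothetical stand-ins, keys are plain strings, and the real `DataEncryptionKeyFactory` and `InvalidEncryptionKeyException` types differ in detail.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical stand-in for InvalidEncryptionKeyException. */
class StaleKeyException extends IOException {
  StaleKeyException(String msg) { super(msg); }
}

/** Hypothetical, simplified stand-in for DataEncryptionKeyFactory. */
interface KeyFactory {
  String newDataEncryptionKey() throws IOException;

  /** Default no-op, so implementations that do not cache need no change. */
  default void clearDataEncryptionKey() { }
}

/** One transfer attempt; throws StaleKeyException if the peer rejects the key. */
interface Transfer {
  void run(String key) throws IOException;
}

/** Caches the key until it is explicitly cleared (assumed behaviour for this sketch). */
class CachingKeyFactory implements KeyFactory {
  private String cached;
  private int generation = 0;

  @Override
  public synchronized String newDataEncryptionKey() {
    if (cached == null) {
      cached = "key-" + (++generation);
    }
    return cached;
  }

  @Override
  public synchronized void clearDataEncryptionKey() {
    cached = null;
  }
}

class RetryOnStaleKey {
  /** Runs the transfer, retrying at most once after clearing a stale key. Returns attempts made. */
  static int transferWithRetry(KeyFactory keys, Transfer transfer) throws IOException {
    int attempts = 0;
    boolean retried = false;
    while (true) {
      attempts++;
      try {
        transfer.run(keys.newDataEncryptionKey());
        return attempts;
      } catch (StaleKeyException e) {
        if (retried) {
          throw e; // already retried once with a fresh key; give up
        }
        retried = true;
        keys.clearDataEncryptionKey(); // drop the stale key so the next attempt fetches a new one
      }
    }
  }

  /** Simulates a key rotation: the peer only accepts the second-generation key. */
  static int demo() {
    CachingKeyFactory keys = new CachingKeyFactory();
    try {
      keys.newDataEncryptionKey(); // warm the cache with the soon-to-be-stale key
      return transferWithRetry(keys, key -> {
        if (!key.equals("key-2")) {
          throw new StaleKeyException("rejected " + key);
        }
      });
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("attempts=" + demo());
  }
}
```

Without the `clearDataEncryptionKey()` call in the catch block, every later attempt in this sketch would reuse the same cached key and fail the same way, which mirrors the failure mode described above.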

Test

  • TestKeyManager#testClearDataEncryptionKey — verifies
    KeyManager.clearDataEncryptionKey() clears cached key
  • TestDispatcherEncryptionKey#testClearEncryptionKeyOnRetry — verifies
    Balancer Dispatcher retry
  • TestBlockDispatcher#testClearEncryptionKeyOnRetry — verifies SPS
    BlockDispatcher retry
  • TestDataTransferEncryptionKey#testClearEncryptionKeyOnRetry — verifies
    DataNode DataTransfer retry
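
As a rough illustration of what the first test checks, the sketch below shows the shape of a "clear forces a new key" assertion. `SketchKeyManager` and `SketchKeyManagerCheck` are hypothetical classes written only for this example; the actual `TestKeyManager` in the patch runs against the real `KeyManager` and may be structured quite differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a key manager that caches its data encryption key. */
class SketchKeyManager {
  private final AtomicInteger generated = new AtomicInteger();
  private String cachedKey;

  synchronized String newDataEncryptionKey() {
    if (cachedKey == null) {
      cachedKey = "key-" + generated.incrementAndGet();
    }
    return cachedKey;
  }

  synchronized void clearDataEncryptionKey() {
    cachedKey = null;
  }

  int keysGenerated() {
    return generated.get();
  }
}

class SketchKeyManagerCheck {
  /** True if the key is reused while cached and regenerated after a clear. */
  static boolean clearForcesNewKey() {
    SketchKeyManager km = new SketchKeyManager();
    String first = km.newDataEncryptionKey();
    boolean reusedWhileCached = first.equals(km.newDataEncryptionKey());
    km.clearDataEncryptionKey();
    String second = km.newDataEncryptionKey();
    return reusedWhileCached && !first.equals(second) && km.keysGenerated() == 2;
  }

  public static void main(String[] args) {
    System.out.println(clearForcesNewKey());
  }
}
```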

…er, SPS BlockDispatcher and DataNode DataTransfer
@JHSUYU JHSUYU marked this pull request as draft March 27, 2026 19:08
@JHSUYU JHSUYU marked this pull request as ready for review March 27, 2026 19:09
JHSUYU (Contributor, Author) commented Mar 30, 2026

Hi @jojochuang, I have 7 places to fix in total, and I've grouped 3 of them into this PR. Please let me know if these fixes make sense to you.
