Encryption Key Rotation Support#2350
Encryption Key Rotation Support#2350alstanchev wants to merge 4 commits intoeclipse-ditto:masterfrom
Conversation
Implement dual-key configuration and fallback decryption to enable safe encryption key rotation without downtime or data loss. Signed-off-by: Aleksandar Stanchev <aleksandar.stanchev@bosch.com>
Signed-off-by: Aleksandar Stanchev <aleksandar.stanchev@bosch.com>
Signed-off-by: Aleksandar Stanchev <aleksandar.stanchev@bosch.com>
Signed-off-by: Aleksandar Stanchev <aleksandar.stanchev@bosch.com>
5abcf4d to
110624e
Compare
thjaeckle
left a comment
There was a problem hiding this comment.
PR Review: Encryption Key Rotation Support
Note: This review was supported with the help of an LLM (Claude Code).
+5522 / -48 | 33 files changed
Nice feature — the architecture is solid: ClusterSingleton migration actor, stream-based processing with throttling, progress persistence for resume, and abort support. The separation into DocumentProcessor, MigrationStreamFactory, MigrationProgressTracker, and MigrationContext is clean.
Below are my findings, ordered by severity.
Critical Issues
1. Actor state mutation from non-actor thread
In EncryptionMigrationActor.handleMigration(), the resume path mutates actor state directly inside a thenCompose callback, which runs on a CompletionStage thread, not the actor thread:
migrationResult = progressTracker.loadProgress().thenCompose(optProgress -> {
if (optProgress.isEmpty() || PHASE_COMPLETED.equals(optProgress.get().phase)) {
// ...
migrationInProgress = false; // ⚠️ UNSAFE: not on actor thread
currentProgress = completed; // ⚠️ UNSAFE: not on actor thread
sender.tell(...);
return CompletableFuture.completedFuture(completed);
}
// ...
});This violates the core Pekko actor concurrency rule. These state changes must be piped back to the actor via self.tell(), similar to how MigrationCompleted is already used for the normal completion path.
Design Concerns
2. Missing Helm chart updates
New HOCON configuration requires corresponding Helm chart updates. The following new config values have no Helm equivalents:
old-symmetrical-key/CONNECTIVITY_CONNECTION_OLD_ENCRYPTION_KEYmigration.batch-size/CONNECTIVITY_ENCRYPTION_MIGRATION_BATCH_SIZEmigration.max-documents-per-minute/CONNECTIVITY_ENCRYPTION_MIGRATION_MAX_DOCS_PER_MINUTE
3. Own MongoClient instance
EncryptionMigrationActor creates its own MongoClientWrapper rather than reusing the existing one from the service. This means an additional MongoDB connection pool per cluster node. Consider accepting a MongoClientWrapper as a constructor parameter instead.
4. Batch size default mismatch
FieldsEncryptionConfig.ConfigValue.MIGRATION_BATCH_SIZE defaults to 100 in code, but connectivity.conf overrides it to 1. The effective default is 1 document per batch, which is extremely conservative. The grouped(batchSize) in the stream means each MongoDB bulk write contains only 1 document — is this intentional?
5. MigrationProgress public fields
MigrationProgress exposes all fields as public final, which is atypical for the Ditto codebase. Using private fields with getters would be more consistent, or making it a record would be more idiomatic.
Behavioral Changes to Call Out
6. /credentials/cert added to encryption pointers
The diff adds "/credentials/cert" to the default json-pointers list in connectivity.conf. This is an independent behavioral change not mentioned in the PR description — existing connections with credentials/cert will now have that field encrypted on next snapshot write. Should this be in a separate commit or at least mentioned in the description?
This PR implements encryption key rotation for Ditto's connectivity service using a dual-key approach. The feature enables secure rotation
of AES-256-GCM encryption keys for sensitive connection data (credentials, URIs) without service interruption.
Key Features:
decryption
batches
Implements #2340