From f11b22d0a3a01d6c9d6c2a9991046e55dfca4452 Mon Sep 17 00:00:00 2001 From: strantalis Date: Wed, 18 Feb 2026 13:16:14 -0500 Subject: [PATCH 1/2] docs(spec): clarify segment integrity, GMAC tagging, and nonce derivation for AES-GCM --- concepts/security.md | 11 +++++++---- schema/OpenTDF/integrity_information.md | 10 ++++++---- schema/OpenTDF/json-schema/schema.json | 8 ++++++-- schema/OpenTDF/method.md | 5 ++++- 4 files changed, 23 insertions(+), 11 deletions(-) diff --git a/concepts/security.md b/concepts/security.md index 10f5720..8ab2a05 100644 --- a/concepts/security.md +++ b/concepts/security.md @@ -13,9 +13,12 @@ While encryption protects confidentiality, it doesn't inherently prevent undetec * **Purpose:** To allow recipients to verify that the encrypted payload has not been altered since its creation. This is especially critical for streamed data. * **Mechanism:** 1. **Segmentation:** The plaintext payload is processed in chunks (segments). - 2. **Segment Hashing/Tagging:** As each segment is encrypted (using AES-GCM, for example), a cryptographic integrity tag (like a GMAC) is generated for that encrypted segment using the *payload encryption key*. This tag is stored (as `hash`) in the corresponding [Segment Object](../schema/OpenTDF/integrity_information.md#encryptioninformationintegrityinformationsegment). - 3. **Root Signature:** All the individual segment tags/hashes are concatenated in order. A final HMAC (e.g., HMAC-SHA256) is calculated over this concatenated string of hashes, again using the *payload encryption key*. This result is stored as the `rootSignature.sig`. -* **Result:** Any modification to even a single bit of the encrypted payload will invalidate the integrity tag of the affected segment *and* consequently invalidate the final `rootSignature`. During decryption, the receiving client MUST verify the integrity tag of each segment and the overall `rootSignature`. Failure indicates tampering. + 2. 
**Segment Tagging (GMAC for AES-GCM):** For `method.algorithm = AES-256-GCM`, the segment `hash` is the AEAD authentication tag (GMAC) produced during encryption of that segment. It is computed with the payload key, the per-segment nonce, and any AAD, and stored as Base64 in the Segment Object. + 3. **Root Signature (HS256):** Concatenate the raw bytes of each segment tag in order (Base64-decode each `segments[i].hash` and concatenate). Compute `HMAC-SHA256` over that byte stream using the payload key, then Base64-encode the result as `rootSignature.sig`. + 4. **Nonce Requirement:** For streamable AES-GCM, each segment MUST use a unique nonce derived from `method.iv` as described in the Method Object. Reuse is catastrophic. +* **Result:** Any modification to even a single bit of the encrypted payload will invalidate the integrity tag of the affected segment *and* consequently invalidate the final `rootSignature`. During decryption, the receiving client MUST verify each segment's AEAD tag and the overall `rootSignature`. Failure indicates tampering. + +**Note on plaintext payloads:** `payload.isEncrypted=false` is reserved for future use; integrity rules for plaintext payloads are out of scope. ## 3. Policy Binding @@ -53,4 +56,4 @@ These mechanisms work together: * **Policy Binding** ensures the access policy cannot be decoupled from the key access grant for a specific KAS. * **Key Splitting** enforces multi-party authorization, preventing single points of failure or compromise for key access. -This layered approach provides robust, data-centric security and tamper evidence for data protected by OpenTDF. \ No newline at end of file +This layered approach provides robust, data-centric security and tamper evidence for data protected by OpenTDF. 
diff --git a/schema/OpenTDF/integrity_information.md b/schema/OpenTDF/integrity_information.md index e7cfb13..cf65969 100644 --- a/schema/OpenTDF/integrity_information.md +++ b/schema/OpenTDF/integrity_information.md @@ -23,8 +23,8 @@ The `integrityInformation` object, nested within [`encryptionInformation`](./enc | --------------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | | rootSignature | Object | Contains a cryptographic signature or HMAC over the combined integrity hashes of all segments, providing overall payload integrity. | Yes | | rootSignature.alg | String | Algorithm used for the rootSignature.sig. HS256 (HMAC-SHA256 using the payload key) is commonly used. | Yes | -| rootSignature.sig | String | The Base64 encoded signature or HMAC value. Calculated over the concatenation of all segment hashes/tags in order. E.g., Base64(HMAC-SHA256(PayloadKey, Concat(SegmentHash1, SegmentHash2, ...))). | Yes | -| segmentHashAlg | String | The algorithm used to generate the hash for each segment in the segments array. GMAC (using the AES-GCM payload key) is commonly used when method.algorithm is AES-256-GCM. | Yes | +| rootSignature.sig | String | The Base64 encoded signature or HMAC value. Calculated over the concatenation of all segment hash **bytes** in order (Base64-decode each `segments[i].hash`, concatenate, then HMAC). E.g., `Base64(HMAC-SHA256(PayloadKey, Concat(bytes(Hash1), bytes(Hash2), ...)))`. | Yes | +| segmentHashAlg | String | The algorithm used to generate the hash for each segment in the segments array. For `AES-256-GCM`, `GMAC` (the AEAD tag) is used. | Yes | | segments | Array | An array of [Segment Objects](#encryptionInformation.integrityInformation.segment), one for each chunk of the payload if method.isStreamable is true. Order MUST match payload order. 
| Yes | | segmentSizeDefault | Number | The default size (in bytes) of the plaintext payload segments. Allows omitting segmentSize in individual segment objects if they match this default. | Yes | | encryptedSegmentSizeDefault | Number | The default size (in bytes) of the encrypted payload segments (including any authentication tag overhead, like from AES-GCM). Allows omitting encryptedSegmentSize in segments. | | @@ -43,6 +43,8 @@ Object containing integrity information about a segment of the payload, includin |Parameter|Type|Description| |---|---|---| -|`hash`|String|A hash generated using the specified `segmentHashAlg`.

`Base64.encode(HMAC(segment, payloadKey))`| +|`hash`|String|A Base64-encoded authentication tag generated using the specified `segmentHashAlg`. For `GMAC`, this is the AES-GCM tag produced during encryption of the segment with the payload key, the per-segment nonce, and any AAD.| + +**Nonce derivation (AES-GCM, streamable):** Each segment must use a unique nonce derived from `method.iv` as specified in the Method Object. |`segmentSize`|Number|The size of the segment. This field is optional. The size of the segment is inferred from 'segmentSizeDefault' defined above, but in the event that a segment were modified and re-encrypted, the segment size would change.| -|`encryptedSegmentSize`|Number|The size of the segment (in bytes) after the payload segment has been encrypted.| \ No newline at end of file +|`encryptedSegmentSize`|Number|The size of the segment (in bytes) after the payload segment has been encrypted.| diff --git a/schema/OpenTDF/json-schema/schema.json b/schema/OpenTDF/json-schema/schema.json index b2929ba..8e8e894 100644 --- a/schema/OpenTDF/json-schema/schema.json +++ b/schema/OpenTDF/json-schema/schema.json @@ -108,9 +108,13 @@ "isStreamable": { "description": "Designates whether or not the payload is streamable.", "type": "boolean" + }, + "iv": { + "description": "Base64-encoded IV/nonce for the payload encryption algorithm. For AES-GCM, 12 bytes. 
Used as the base nonce for streamable encryption.", + "type": "string" } }, - "required": ["algorithm", "isStreamable"] + "required": ["algorithm", "isStreamable", "iv"] }, "integrityInformation": { "type": "object", @@ -133,7 +137,7 @@ "type": "number" }, "segmentHashAlg": { - "description": "Algorithm used to generate segment hashes", + "description": "Algorithm used to generate segment hashes (e.g., GMAC for AES-GCM AEAD tags)", "type": "string" }, "segments": { diff --git a/schema/OpenTDF/method.md b/schema/OpenTDF/method.md index 61737ca..c807dbd 100644 --- a/schema/OpenTDF/method.md +++ b/schema/OpenTDF/method.md @@ -18,4 +18,7 @@ The `method` object, nested within [`encryptionInformation`](./encryption_inform | ------------ | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | | algorithm | String | The symmetric encryption algorithm used. AES-256-GCM is the recommended and commonly implemented algorithm. | Yes | | isStreamable | Boolean | Indicates if the payload was encrypted in segments suitable for streaming decryption. If true, [integrityInformation](./integrity_information.md) MUST contain segment details. | Yes | -| iv | String | The Base64 encoded Initialization Vector (IV) used with the symmetric algorithm. MUST be unique for each TDF encrypted with the same key. For AES-GCM, typically 12 bytes (96 bits). | Yes | +| iv | String | The Base64 encoded Initialization Vector (IV) used with the symmetric algorithm. MUST be unique for each TDF encrypted with the same key. For AES-GCM, this MUST be 12 bytes (96 bits). | Yes | + +**Streamable AES-GCM nonce derivation:** When `isStreamable` is true and `algorithm` is `AES-256-GCM`, `iv` is the base 96-bit nonce for segment 0. 
For segment index `i` (starting at 0), derive the nonce as: +`nonce = iv[0..7] || uint32_be(iv[8..11] + i)`. Implementations MUST reject if the 32-bit counter overflows or wraps. From d147165bd4c7f7b17b9ba23dd41abb0402e59a1e Mon Sep 17 00:00:00 2001 From: strantalis Date: Wed, 18 Feb 2026 16:28:13 -0500 Subject: [PATCH 2/2] docs(spec): decouple nonce derivation from method.iv and clarify integrity terminology --- concepts/security.md | 2 +- schema/OpenTDF/integrity_information.md | 6 +++--- schema/OpenTDF/json-schema/schema.json | 6 +----- schema/OpenTDF/method.md | 5 +---- 4 files changed, 6 insertions(+), 13 deletions(-) diff --git a/concepts/security.md b/concepts/security.md index 8ab2a05..6cdb81e 100644 --- a/concepts/security.md +++ b/concepts/security.md @@ -15,7 +15,7 @@ While encryption protects confidentiality, it doesn't inherently prevent undetec 1. **Segmentation:** The plaintext payload is processed in chunks (segments). 2. **Segment Tagging (GMAC for AES-GCM):** For `method.algorithm = AES-256-GCM`, the segment `hash` is the AEAD authentication tag (GMAC) produced during encryption of that segment. It is computed with the payload key, the per-segment nonce, and any AAD, and stored as Base64 in the Segment Object. 3. **Root Signature (HS256):** Concatenate the raw bytes of each segment tag in order (Base64-decode each `segments[i].hash` and concatenate). Compute `HMAC-SHA256` over that byte stream using the payload key, then Base64-encode the result as `rootSignature.sig`. - 4. **Nonce Requirement:** For streamable AES-GCM, each segment MUST use a unique nonce derived from `method.iv` as described in the Method Object. Reuse is catastrophic. + 4. **Nonce Requirement:** For streamable AES-GCM, each segment MUST use a unique nonce. The derivation and encoding MUST be specified by the encryption method. Reuse is catastrophic. 
* **Result:** Any modification to even a single bit of the encrypted payload will invalidate the integrity tag of the affected segment *and* consequently invalidate the final `rootSignature`. During decryption, the receiving client MUST verify each segment's AEAD tag and the overall `rootSignature`. Failure indicates tampering. **Note on plaintext payloads:** `payload.isEncrypted=false` is reserved for future use; integrity rules for plaintext payloads are out of scope. diff --git a/schema/OpenTDF/integrity_information.md b/schema/OpenTDF/integrity_information.md index cf65969..0a576e9 100644 --- a/schema/OpenTDF/integrity_information.md +++ b/schema/OpenTDF/integrity_information.md @@ -21,9 +21,9 @@ The `integrityInformation` object, nested within [`encryptionInformation`](./enc | Parameter | Type | Description | Required? | | --------------------------- | ------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | -| rootSignature | Object | Contains a cryptographic signature or HMAC over the combined integrity hashes of all segments, providing overall payload integrity. | Yes | +| rootSignature | Object | Contains a cryptographic integrity value over the combined segment hashes/tags, providing overall payload integrity. For `alg=HS256`, this is an HMAC. | Yes | | rootSignature.alg | String | Algorithm used for the rootSignature.sig. HS256 (HMAC-SHA256 using the payload key) is commonly used. | Yes | -| rootSignature.sig | String | The Base64 encoded signature or HMAC value. Calculated over the concatenation of all segment hash **bytes** in order (Base64-decode each `segments[i].hash`, concatenate, then HMAC). E.g., `Base64(HMAC-SHA256(PayloadKey, Concat(bytes(Hash1), bytes(Hash2), ...)))`. | Yes | +| rootSignature.sig | String | The Base64-encoded integrity value. 
For `alg=HS256`, this is `Base64(HMAC-SHA256(PayloadKey, Concat(bytes(Hash1), bytes(Hash2), ...)))`, where `bytes(HashN)` are the Base64-decoded segment hashes in order. | Yes | | segmentHashAlg | String | The algorithm used to generate the hash for each segment in the segments array. For `AES-256-GCM`, `GMAC` (the AEAD tag) is used. | Yes | | segments | Array | An array of [Segment Objects](#encryptionInformation.integrityInformation.segment), one for each chunk of the payload if method.isStreamable is true. Order MUST match payload order. | Yes | | segmentSizeDefault | Number | The default size (in bytes) of the plaintext payload segments. Allows omitting segmentSize in individual segment objects if they match this default. | Yes | @@ -45,6 +45,6 @@ Object containing integrity information about a segment of the payload, includin |---|---|---| |`hash`|String|A Base64-encoded authentication tag generated using the specified `segmentHashAlg`. For `GMAC`, this is the AES-GCM tag produced during encryption of the segment with the payload key, the per-segment nonce, and any AAD.| -**Nonce derivation (AES-GCM, streamable):** Each segment must use a unique nonce derived from `method.iv` as specified in the Method Object. +**Nonce derivation (AES-GCM, streamable):** Each segment must use a unique nonce; the derivation and encoding MUST be specified by the encryption method. Without a defined nonce scheme, GMAC verification is undefined. |`segmentSize`|Number|The size of the segment. This field is optional. 
The size of the segment is inferred from 'segmentSizeDefault' defined above, but in the event that a segment were modified and re-encrypted, the segment size would change.| |`encryptedSegmentSize`|Number|The size of the segment (in bytes) after the payload segment has been encrypted.| diff --git a/schema/OpenTDF/json-schema/schema.json b/schema/OpenTDF/json-schema/schema.json index 8e8e894..441462b 100644 --- a/schema/OpenTDF/json-schema/schema.json +++ b/schema/OpenTDF/json-schema/schema.json @@ -108,13 +108,9 @@ "isStreamable": { "description": "Designates whether or not the payload is streamable.", "type": "boolean" - }, - "iv": { - "description": "Base64-encoded IV/nonce for the payload encryption algorithm. For AES-GCM, 12 bytes. Used as the base nonce for streamable encryption.", - "type": "string" } }, - "required": ["algorithm", "isStreamable", "iv"] + "required": ["algorithm", "isStreamable"] }, "integrityInformation": { "type": "object", diff --git a/schema/OpenTDF/method.md b/schema/OpenTDF/method.md index c807dbd..c52abf7 100644 --- a/schema/OpenTDF/method.md +++ b/schema/OpenTDF/method.md @@ -18,7 +18,4 @@ The `method` object, nested within [`encryptionInformation`](./encryption_inform | ------------ | ------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | | algorithm | String | The symmetric encryption algorithm used. AES-256-GCM is the recommended and commonly implemented algorithm. | Yes | | isStreamable | Boolean | Indicates if the payload was encrypted in segments suitable for streaming decryption. If true, [integrityInformation](./integrity_information.md) MUST contain segment details. | Yes | -| iv | String | The Base64 encoded Initialization Vector (IV) used with the symmetric algorithm. MUST be unique for each TDF encrypted with the same key. 
For AES-GCM, this MUST be 12 bytes (96 bits). | Yes | - -**Streamable AES-GCM nonce derivation:** When `isStreamable` is true and `algorithm` is `AES-256-GCM`, `iv` is the base 96-bit nonce for segment 0. For segment index `i` (starting at 0), derive the nonce as: -`nonce = iv[0..7] || uint32_be(iv[8..11] + i)`. Implementations MUST reject if the 32-bit counter overflows or wraps. +| iv | String | The Base64 encoded Initialization Vector (IV) used with the symmetric algorithm. MUST be unique for each TDF encrypted with the same key. For AES-GCM, MUST be 12 bytes (96 bits). | Yes |
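
Reviewer note (not part of the patch): the `rootSignature.sig` rule for `alg=HS256`, as worded in PATCH 2/2, can be sketched as follows. This is a minimal illustration under stated assumptions, not the OpenTDF reference implementation. The function names `compute_root_signature` and `verify_root_signature` are hypothetical, and the sketch assumes the segment hashes are already available as Base64 strings in payload order. It does not cover per-segment AEAD tag verification or nonce derivation, which the series leaves to the encryption method.

```python
import base64
import hashlib
import hmac


def compute_root_signature(payload_key: bytes, segment_hashes_b64: list[str]) -> str:
    """Return Base64(HMAC-SHA256(payload_key, bytes(Hash1) || bytes(Hash2) || ...))."""
    # Base64-decode each segments[i].hash to raw bytes and concatenate them
    # in payload order, as the rootSignature.sig description requires.
    raw_tags = b"".join(
        base64.b64decode(h, validate=True) for h in segment_hashes_b64
    )
    digest = hmac.new(payload_key, raw_tags, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_root_signature(
    payload_key: bytes, segment_hashes_b64: list[str], expected_sig_b64: str
) -> bool:
    """Recompute the root signature and compare it in constant time."""
    actual = compute_root_signature(payload_key, segment_hashes_b64)
    return hmac.compare_digest(actual.encode("ascii"), expected_sig_b64.encode("ascii"))
```

Hashing the decoded bytes rather than the Base64 text is the point the patch clarifies: two implementations that disagree on this produce different signatures for the same segments.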