fix(C02,C03,C07,C11,C13): add implementability conditioners and scope…#847

Open
vtknightmare wants to merge 1 commit into OWASP:main from vtknightmare:vtknightmare/fix/implementability-conditioners-and-scope-notes

@vtknightmare
Collaborator

… notes

Preface rule: 'Sufficient implementation guidance or tooling must exist to allow both implementation and effective verification.'

  • C02 2.1.3: add conditioner 'where the use case supports character set restriction'; without this the control fails for general-purpose AI where the valid character set is all of Unicode
  • C07 7.2.1: broaden accepted mechanisms; most LLM APIs (GPT-5, Claude, Gemini) do not expose calibrated confidence scores; retrieval-based verification and tool-based fact-checking are valid alternatives
  • C03 3.1.2: clarify applies to self-hosted artifacts only; for hosted models where the org does not control signing, see C3.5
  • C03 C3.6 header: add scope note that this section applies only to orgs running their own fine-tuning or RLHF pipelines; most adopters using hosted models are not in scope
  • C11 11.3.2: add conditioner that DP-SGD applies where data sensitivity justifies the utility trade-off; DP-SGD imposes significant accuracy cost and is not appropriate for all workloads
  • C13 13.6.1: replace vague 'security-validated' with explicit C9.7.1 cross-reference (the specific mechanism that satisfies this control)
  • C13 13.6.2: replace 'threat landscape assessment' with three specific evaluable signals (authorization scope, anomaly flags, action class)

@RicoKomenda
Collaborator

The diff has 5 of the 7 changes listed in the description. C02 2.1.3 and C03 3.1.2 aren't here, did those not get pushed?

On the changes that are present: the C3.6 scope note and the two C13 rewrites are good, and the C9.7.1 / C14.2.1 references resolve correctly. Two small things:

  • 11.3.2: "where the sensitivity justifies the utility trade-off" is open-ended and hard for an auditor to test. Could it be tied to a defined data-sensitivity classification instead?
  • 13.6.1 now reads almost identically to C9.7.1, but it sits in C13 with no logging angle. Worth either adding the logging dimension or just deferring to C9.7.1.

@vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from 1b84ba1 to f664ebd on June 2, 2026 16:03
@vtknightmare
Collaborator Author

> The diff has 5 of the 7 changes listed in the description. C02 2.1.3 and C03 3.1.2 aren't here, did those not get pushed?
>
> On the changes that are present: the C3.6 scope note and the two C13 rewrites are good, and the C9.7.1 / C14.2.1 references resolve correctly. Two small things:
>
> * 11.3.2: "where the sensitivity justifies the utility trade-off" is open-ended and hard for an auditor to test. Could it be tied to a defined data-sensitivity classification instead?
>
> * 13.6.1 now reads almost identically to C9.7.1, but it sits in C13 with no logging angle. Worth either adding the logging dimension or just deferring to C9.7.1.

C02 2.1.3 and C03 3.1.2 were missing for the same reason as #846, now included.
On 11.3.2: tied it to a defined classification so an auditor has something concrete to test against:
Verify that training on datasets classified as high-sensitivity under the organization's data classification policy employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon).
On 13.6.1: reframed as a logging control so it has a reason to exist in C13 rather than restating the C9.7.1 gate:
Verify that proactive agent behaviors, and the outcome of their pre-execution policy evaluation performed per C9.7.1, are recorded in the monitoring log with sufficient context to reconstruct the proposed action, the evaluation decision, and the basis for that decision.
Force-pushed.
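
To make the reworded 13.6.1 concrete, here is a minimal sketch of a monitoring-log entry that keeps the proposed action, the evaluation decision, and the basis for that decision together. The field names, the `PolicyEvaluation` class, the `record_proactive_action` helper, and the JSON-lines format are all hypothetical; none of them come from the standard or from C9.7.1.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class PolicyEvaluation:
    """Outcome of the pre-execution policy check (hypothetical shape)."""
    decision: str  # e.g. "allow" or "deny"
    basis: str     # the rule or reason that produced the decision


def record_proactive_action(agent_id: str, proposed_action: dict,
                            evaluation: PolicyEvaluation) -> str:
    """Serialize one monitoring-log entry as a single JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "proposed_action": proposed_action,
        "evaluation": asdict(evaluation),
    }
    return json.dumps(entry, sort_keys=True)


# Example: a denied proactive action is still logged with full context.
line = record_proactive_action(
    "agent-42",
    {"tool": "send_email", "args": {"to": "ops@example.com"}},
    PolicyEvaluation(decision="deny", basis="recipient outside allow-list"),
)
```

Logging denied actions as well as allowed ones is a design choice in this sketch; it is what lets a reviewer reconstruct why an agent did not act.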

@RicoKomenda
Collaborator

Both missing changes are in now. The C03 3.1.2 self-hosted scoping (with the C3.5 pointer) is a clean clarification, and the C02 2.1.3 conditioner reads as legitimate scoping here since charset allow-listing can't apply to general Unicode input. No new concerns beyond the 11.3.2 and 13.6.1 notes from my earlier comment. Looks good.
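
A minimal sketch of why the C02 2.1.3 conditioner matters, assuming a narrow use case (an order-ID field) where a character allow-list is meaningful. The pattern and the `is_allowed_order_id` helper are hypothetical illustrations, not wording from the control.

```python
import re

# Hypothetical allow-list for a narrow field: uppercase letters, digits, hyphen.
ORDER_ID_PATTERN = re.compile(r"[A-Z0-9-]{1,32}")


def is_allowed_order_id(value: str) -> bool:
    """Return True only if the whole value consists of allow-listed characters."""
    return ORDER_ID_PATTERN.fullmatch(value) is not None


# A structured field can be restricted this way...
ok = is_allowed_order_id("AB-1234")
# ...but the same check rejects ordinary multilingual text outright.
rejected = is_allowed_order_id("注文-1234")
```

A general-purpose chat prompt has no comparable pattern to match against, which is the case the conditioner carves out.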

@vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from f664ebd to 2bbc2fe on June 2, 2026 19:44
Comment thread 1.0/en/0x10-C07-Model-Behavior.md Outdated
  | # | Description | Level |
  | :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
- | **7.2.1** | **Verify that** the system assesses the reliability of generated answers using a confidence or uncertainty estimation method (e.g., confidence scoring, retrieval-based verification, or model uncertainty estimation). | 1 |
+ | **7.2.1** | **Verify that** the system assesses the reliability of generated answers using any available mechanism appropriate to the deployment (e.g., confidence scoring where the API exposes it, retrieval-based verification against authoritative sources, model uncertainty estimation, or tool-invocation-based fact-checking). | 1 |
Collaborator

I would phrase the change slightly differently, from:

... any available mechanism appropriate to the deployment ...

to:

... a mechanism appropriate to the deployment ...
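
For deployments where the API exposes no confidence score, one of the mechanisms 7.2.1 lists is retrieval-based verification. The sketch below is a deliberately crude stand-in: it scores an answer by lexical overlap with passages assumed to come from an authoritative source. The `support_score` function and the tokenization are hypothetical, and a real check would likely use entailment or claim-level matching rather than word overlap.

```python
import re


def _tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word tokens; intentionally simplistic."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support_score(answer: str, sources: list[str]) -> float:
    """Fraction of answer tokens that appear in at least one source passage."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    # Union of tokens across all source passages (empty set if no sources).
    source_tokens = set().union(*(_tokens(s) for s in sources))
    return len(answer_tokens & source_tokens) / len(answer_tokens)


passages = ["The capital of France is Paris."]
supported = support_score("Paris is the capital of France", passages)
unsupported = support_score("Berlin", passages)
```

A deployment would compare the score against a threshold of its own choosing; no threshold is implied by the requirement text.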

  | :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
  | **11.3.1** | **Verify that** model outputs are calibrated (e.g., via temperature scaling or output perturbation) to reduce overconfident predictions that facilitate membership-inference attacks. | 2 |
- | **11.3.2** | **Verify that** training on sensitive datasets employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon). | 2 |
+ | **11.3.2** | **Verify that** training on datasets classified as high-sensitivity under the organization's data classification policy employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon). | 2 |
Collaborator

@ottosulin ottosulin Jun 2, 2026

The original wording is ambiguous, but this change does not necessarily fix that: the organization can still define "sensitive" however it wants. In other words, both the original and the revised version leave the definition to the organization.

On the other hand, the change now requires a technical auditor to interpret a data classification policy, which moves into the GRC territory we want to stay away from.

If we want something less ambiguous, we should consider revising it to something like:

... training on sensitive (PII, trade secrets, medical information or similar) ...

If not, I would not change this.
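
For readers less familiar with what 11.3.2 asks an implementer to do, here is a minimal pure-Python sketch of the core DP-SGD gradient step: clip each example's gradient, sum, add Gaussian noise scaled to the clipping norm, and average. The function names and parameters are illustrative assumptions. The sketch does not compute epsilon; the documented privacy budget comes from a privacy accountant run over all training steps, which a production pipeline would normally take from a maintained DP library.

```python
import math
import random


def clip(grad: list[float], max_norm: float) -> list[float]:
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_gradient(per_example_grads: list[list[float]], max_norm: float,
                    noise_multiplier: float, rng: random.Random) -> list[float]:
    """Clip each gradient, sum, add Gaussian noise, then average over the batch."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise std scales with the clip norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Example batch of two per-example gradients; the first exceeds the clip norm.
batch = [[3.0, 4.0], [0.3, 0.4]]
noisy_grad = dp_sgd_gradient(batch, max_norm=1.0, noise_multiplier=1.1,
                             rng=random.Random(0))
```

The added noise is the source of the accuracy cost mentioned in the PR description, which is why the requirement is scoped to sensitive data rather than all training.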

- | **13.6.1** | **Verify that** proactive agent behaviors are security-validated before execution with risk assessment integration. | 1 |
- | **13.6.2** | **Verify that** autonomous initiative triggers include security context evaluation and threat landscape assessment. | 2 |
+ | **13.6.1** | **Verify that** proactive agent behaviors, and the outcome of their pre-execution policy evaluation performed per C9.7.1, are recorded in the monitoring log with sufficient context to reconstruct the proposed action, the evaluation decision, and the basis for that decision. | 1 |
+ | **13.6.2** | **Verify that** autonomous initiative triggers include security context evaluation covering the agent's current authorization scope, any active anomaly flags or incident status, and the action classification per the human oversight policy (C14.2.1). | 2 |
Collaborator

I think this is a very good addition; the original wording is very ambiguous and not yet actionable. However, I would leave out "any active anomaly flags or incident status".

I don't think this is realistic to implement in a normal enterprise environment, because it would require two-way integration with one or more security monitoring systems.

Preface rule: sufficient guidance/tooling must exist for both implementation
and verification.

- C02 2.1.3: conditioner 'where the use case supports character set
  restriction' (general-purpose AI accepts all of Unicode).
- C07 7.2.1: broaden to 'a mechanism appropriate to the deployment'; 'any
  available' removed (review feedback); examples cover APIs that do/don't
  expose confidence scores.
- C03 3.1.2: scope to self-hosted artifacts; for hosted models see C3.5.
- C03 C3.6: scope note (section applies only to orgs running their own
  fine-tuning / RLHF pipelines).
- C11 11.3.2 (review): enumerate sensitive data types directly (PII, trade
  secrets, medical information) instead of referencing a data classification
  policy; avoids requiring auditors to interpret GRC documents.
- C13 13.6.1 (review): reframe with a real logging dimension that records the
  proactive action and its C9.7.1 policy-evaluation outcome in the monitoring
  log, rather than restating the C9.7.1 gate.
- C13 13.6.2 (review): replace 'threat landscape assessment' with two concrete
  evaluable signals (authorization scope, action class per C14.2.1); 'anomaly
  flags / incident status' dropped as it requires two-way SIEM integration
  that is unrealistic in most enterprise environments.
@vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from 2bbc2fe to 037a5a3 on June 2, 2026 20:00
@vtknightmare
Collaborator Author

Updated and force-pushed.

7.2.1 now says “a mechanism appropriate to the deployment” instead of “any available mechanism.”

11.3.2 now names the data types directly: PII, trade secrets, medical information, or similar. This avoids pushing the ambiguity into GRC interpretation and makes the requirement easier to audit.

13.6.2 no longer requires checking active anomaly flags or incident status. That expectation is not realistic without two-way SIEM integration. The requirement now stays focused on authorization scope and action classification.
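
As a rough illustration of the two signals 13.6.2 now keeps, the sketch below evaluates an autonomous trigger against the agent's authorization scope and the action's classification. The `TriggerContext` fields, the class names in `REQUIRES_HUMAN_APPROVAL`, and the three-way result are hypothetical; the actual classes would come from the organization's human oversight policy (C14.2.1).

```python
from dataclasses import dataclass

# Hypothetical action classes that an oversight policy might gate on a human.
REQUIRES_HUMAN_APPROVAL = {"irreversible", "external_communication"}


@dataclass(frozen=True)
class TriggerContext:
    granted_scopes: frozenset[str]  # the agent's current authorization scope
    required_scope: str             # scope the proposed action needs
    action_class: str               # classification of the proposed action


def evaluate_trigger(ctx: TriggerContext) -> str:
    """Return "deny", "escalate", or "allow" for an autonomous trigger."""
    if ctx.required_scope not in ctx.granted_scopes:
        return "deny"  # outside the agent's authorization scope
    if ctx.action_class in REQUIRES_HUMAN_APPROVAL:
        return "escalate"  # in scope, but the policy requires a human
    return "allow"


decision = evaluate_trigger(TriggerContext(
    granted_scopes=frozenset({"tickets:read", "tickets:write"}),
    required_scope="tickets:write",
    action_class="routine",
))
```

Both inputs are available inside the agent runtime itself, which is what makes them evaluable without the two-way SIEM integration the dropped signal would have needed.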

3 participants