fix(C02,C03,C07,C11,C13): add implementability conditioners and scope… by vtknightmare · Pull Request #847 · OWASP/AISVS

vtknightmare · 2026-06-02T13:45:38Z

… notes

Preface rule: 'Sufficient implementation guidance or tooling must exist to allow both implementation and effective verification.'

C02 2.1.3: add conditioner 'where the use case supports character set restriction'; without this the control fails for general-purpose AI where the valid character set is all of Unicode
C07 7.2.1: broaden accepted mechanisms; most LLM APIs (GPT-5, Claude, Gemini) do not expose calibrated confidence scores; retrieval-based verification and tool-based fact-checking are valid alternatives
C03 3.1.2: clarify applies to self-hosted artifacts only; for hosted models where the org does not control signing, see C3.5
C03 C3.6 header: add scope note that this section applies only to orgs running their own fine-tuning or RLHF pipelines; most adopters using hosted models are not in scope
C11 11.3.2: add conditioner that DP-SGD applies where data sensitivity justifies the utility trade-off; DP-SGD imposes significant accuracy cost and is not appropriate for all workloads
C13 13.6.1: replace vague 'security-validated' with explicit C9.7.1 cross-reference (the specific mechanism that satisfies this control)
C13 13.6.2: replace 'threat landscape assessment' with three specific evaluable signals (authorization scope, anomaly flags, action class)

RicoKomenda · 2026-06-02T14:12:55Z

The diff has 5 of the 7 changes listed in the description. C02 2.1.3 and C03 3.1.2 aren't here, did those not get pushed?

On the changes that are present: the C3.6 scope note and the two C13 rewrites are good, and the C9.7.1 / C14.2.1 references resolve correctly. Two small things:

* 11.3.2: "where the sensitivity justifies the utility trade-off" is open-ended and hard for an auditor to test. Could it be tied to a defined data-sensitivity classification instead?
* 13.6.1 now reads almost identically to C9.7.1, but it sits in C13 with no logging angle. Worth either adding the logging dimension or just deferring to C9.7.1.

vtknightmare · 2026-06-02T16:08:29Z

> The diff has 5 of the 7 changes listed in the description. C02 2.1.3 and C03 3.1.2 aren't here, did those not get pushed?
>
> On the changes that are present: the C3.6 scope note and the two C13 rewrites are good, and the C9.7.1 / C14.2.1 references resolve correctly. Two small things:
>
> * 11.3.2: "where the sensitivity justifies the utility trade-off" is open-ended and hard for an auditor to test. Could it be tied to a defined data-sensitivity classification instead?
> * 13.6.1 now reads almost identically to C9.7.1, but it sits in C13 with no logging angle. Worth either adding the logging dimension or just deferring to C9.7.1.

C02 2.1.3 and C03 3.1.2 were missing for the same reason as #846, now included.
On 11.3.2: tied it to a defined classification so an auditor has something concrete to test against:
Verify that training on datasets classified as high-sensitivity under the organization's data classification policy employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon).
On 13.6.1: reframed as a logging control so it has a reason to exist in C13 rather than restating the C9.7.1 gate:
Verify that proactive agent behaviors, and the outcome of their pre-execution policy evaluation performed per C9.7.1, are recorded in the monitoring log with sufficient context to reconstruct the proposed action, the evaluation decision, and the basis for that decision.
Force-pushed.
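
To illustrate what the reworded 13.6.1 asks a verifier to look for, here is a minimal sketch of a monitoring-log entry that captures the proposed action, the evaluation decision, and the basis for that decision. The schema, field names, and rule identifier are assumptions made for this example; AISVS does not prescribe a log format.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class ProactiveActionRecord:
    """One monitoring-log entry for a proactive agent behavior (hypothetical schema)."""
    agent_id: str
    proposed_action: dict   # what the agent intended to do, including parameters
    decision: str           # outcome of the pre-execution policy evaluation, e.g. "allow" or "deny"
    decision_basis: list    # identifiers of the policy rules that produced the decision
    timestamp: str = ""

    def __post_init__(self):
        # Stamp the record in UTC if the caller did not supply a time.
        if not self.timestamp:
            self.timestamp = datetime.now(timezone.utc).isoformat()


def to_log_line(record: ProactiveActionRecord) -> str:
    """Serialize the record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = ProactiveActionRecord(
    agent_id="agent-7",
    proposed_action={"tool": "send_email", "to": "ops@example.com"},
    decision="deny",
    decision_basis=["rule:external-email-requires-approval"],
)
line = to_log_line(record)
```

An auditor can then check that each logged entry alone is enough to reconstruct all three elements the requirement names.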

RicoKomenda · 2026-06-02T18:36:42Z

Both missing changes are in now. The C03 3.1.2 self-hosted scoping (with the C3.5 pointer) is a clean clarification, and the C02 2.1.3 conditioner reads as legitimate scoping here since charset allow-listing can't apply to general Unicode input. No new concerns beyond the 11.3.2 and 13.6.1 notes from my earlier comment. Looks good.
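
As a concrete reading of the C02 2.1.3 conditioner, the sketch below shows charset allow-listing for a narrow field where restriction is feasible. The allow-list, the length limit, and the order-ID use case are assumptions for illustration; a general-purpose chat input that must accept all of Unicode would fall outside this check, which is the point of the conditioner.

```python
# Hypothetical allow-list for a narrow use case such as an order-ID field.
ALLOWED_CHARS = frozenset("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-")


def is_allowed(text: str, max_len: int = 32) -> bool:
    """Return True only if text is non-empty, short enough, and uses allowed characters only."""
    if not text or len(text) > max_len:
        return False
    return all(ch in ALLOWED_CHARS for ch in text)


print(is_allowed("ORD-2024-0017"))      # plain ASCII identifier
print(is_allowed("ORD-2024\u202e"))     # contains a right-to-left override character
```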

ottosulin · 2026-06-02T19:35:00Z

 | # | Description | Level |
 | :--------: | --------------------------------------------------------------------------------------------------------------------- | :---: |
-| **7.2.1** | **Verify that** the system assesses the reliability of generated answers using a confidence or uncertainty estimation method (e.g., confidence scoring, retrieval-based verification, or model uncertainty estimation). | 1 |
+| **7.2.1** | **Verify that** the system assesses the reliability of generated answers using any available mechanism appropriate to the deployment (e.g., confidence scoring where the API exposes it, retrieval-based verification against authoritative sources, model uncertainty estimation, or tool-invocation-based fact-checking). | 1 |


I would phrase the change slightly differently, changing

... any available mechanism appropriate to the deployment ...

to

... a mechanism appropriate to the deployment ...
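
One way to read "a mechanism appropriate to the deployment" is as a fallback chain: use a confidence score when the API exposes one, otherwise check the answer against retrieved authoritative sources. The sketch below assumes exactly that ordering. The function name, the 0.8 threshold, and the caller-supplied `supports` predicate are hypothetical, and the substring check in the usage lines is only a stand-in for a real verification step.

```python
from typing import Callable, Optional, Sequence


def assess_reliability(
    answer: str,
    confidence: Optional[float],
    sources: Sequence[str],
    supports: Callable[[str, str], bool],
    min_confidence: float = 0.8,  # arbitrary threshold chosen for the example
) -> dict:
    """Pick a reliability mechanism based on what the deployment makes available."""
    if confidence is not None:
        # The API exposes a score, so use it directly.
        return {"mechanism": "confidence_score", "reliable": confidence >= min_confidence}
    if sources:
        # No score available: fall back to checking the answer against sources.
        supported = any(supports(answer, src) for src in sources)
        return {"mechanism": "retrieval_verification", "reliable": supported}
    # Nothing to assess with; report that explicitly rather than guessing.
    return {"mechanism": "none", "reliable": False}


def naive_supports(answer: str, source: str) -> bool:
    """Placeholder predicate: treats a source as supporting if it contains the answer text."""
    return answer.lower() in source.lower()


with_score = assess_reliability("Paris", 0.93, [], naive_supports)
without_score = assess_reliability("Paris", None, ["The capital of France is Paris."], naive_supports)
```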

ottosulin · 2026-06-02T19:40:54Z

 | :--------: | ------------------------------------------------------------------------------------------------------------------- | :---: |
 | **11.3.1** | **Verify that** model outputs are calibrated (e.g., via temperature scaling or output perturbation) to reduce overconfident predictions that facilitate membership-inference attacks. | 2 |
-| **11.3.2** | **Verify that** training on sensitive datasets employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon). | 2 |
+| **11.3.2** | **Verify that** training on datasets classified as high-sensitivity under the organization's data classification policy employs differentially-private optimization (e.g., DP-SGD) with a documented privacy budget (epsilon). | 2 |


The original wording is ambiguous, but this change won't necessarily fix that: it still lets the organization decide what counts as "sensitive". In other words, both the original and the revised versions leave that definition to the organization.

On the other hand, the change now requires a technical auditor to interpret a data classification policy, which moves into the GRC territory we want to stay away from.

If we want something less ambiguous, we should consider revising that with something like:

... training on sensitive (PII, trade secrets, medical information or similar) ...

If not, I would not change this.
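
For readers less familiar with what 11.3.2 demands in practice, here is a minimal pure-Python sketch of the gradient step at the core of DP-SGD: clip each per-example gradient, sum, add Gaussian noise scaled to the clipping norm, then average. The function and parameter names are assumptions for this example. The sketch does not compute epsilon; the documented privacy budget would come from a separate privacy accountant, which is not shown.

```python
import math
import random


def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """Return the noisy average gradient for one batch.

    Each per-example gradient is clipped to L2 norm <= clip_norm, the clipped
    gradients are summed, Gaussian noise with standard deviation
    noise_multiplier * clip_norm is added per coordinate, and the result is
    divided by the batch size.
    """
    rng = rng or random.Random()
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down only when the gradient exceeds the clipping norm.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


grads = [[3.0, 4.0], [0.3, 0.4]]
# With the noise multiplier at zero, only the clipping effect is visible.
clipped_only = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=0.0)
```

The accuracy cost mentioned in the PR description comes from both steps: clipping biases large gradients, and the added noise grows with the noise multiplier needed to reach a smaller epsilon.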

ottosulin · 2026-06-02T19:44:55Z

-| **13.6.1** | **Verify that** proactive agent behaviors are security-validated before execution with risk assessment integration. | 1 |
-| **13.6.2** | **Verify that** autonomous initiative triggers include security context evaluation and threat landscape assessment. | 2 |
+| **13.6.1** | **Verify that** proactive agent behaviors, and the outcome of their pre-execution policy evaluation performed per C9.7.1, are recorded in the monitoring log with sufficient context to reconstruct the proposed action, the evaluation decision, and the basis for that decision. | 1 |
+| **13.6.2** | **Verify that** autonomous initiative triggers include security context evaluation covering the agent's current authorization scope, any active anomaly flags or incident status, and the action classification per the human oversight policy (C14.2.1). | 2 |


I think this is a very good addition; the original wording is very ambiguous and not yet actionable. However, I would leave out "any active anomaly flags or incident status".

I don't think this is realistic to implement, because in a normal enterprise environment it would require two-way integration with one or more security monitoring systems.

Preface rule: sufficient guidance/tooling must exist for both implementation and verification.

- C02 2.1.3: conditioner 'where the use case supports character set restriction' (general-purpose AI accepts all of Unicode).
- C07 7.2.1: broaden to 'a mechanism appropriate to the deployment'; 'any available' removed (review feedback); examples cover APIs that do/don't expose confidence scores.
- C03 3.1.2: scope to self-hosted artifacts; for hosted models see C3.5.
- C03 C3.6: scope note (section applies only to orgs running their own fine-tuning / RLHF pipelines).
- C11 11.3.2 (review): enumerate sensitive data types directly (PII, trade secrets, medical information) instead of referencing a data classification policy; avoids requiring auditors to interpret GRC documents.
- C13 13.6.1 (review): reframe with a real logging dimension that records the proactive action and its C9.7.1 policy-evaluation outcome in the monitoring log, rather than restating the C9.7.1 gate.
- C13 13.6.2 (review): replace 'threat landscape assessment' with two concrete evaluable signals (authorization scope, action class per C14.2.1); 'anomaly flags / incident status' dropped as it requires two-way SIEM integration that is unrealistic in most enterprise environments.

vtknightmare · 2026-06-02T20:02:21Z

Updated and force-pushed.

7.2.1 now says “a mechanism appropriate to the deployment” instead of “any available mechanism.”

11.3.2 now names the data types directly: PII, trade secrets, medical information, or similar. This avoids pushing the ambiguity into GRC interpretation and makes the requirement easier to audit.

13.6.2 no longer requires checking active anomaly flags or incident status. That expectation is not realistic without two-way SIEM integration. The requirement now stays focused on authorization scope and action classification.
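
With the anomaly-flag signal removed, the 13.6.2 check reduces to two inputs that the agent runtime already holds. The sketch below shows one possible shape for that evaluation. The three action classes, the scope strings, and the decision labels are assumptions for this example; the actual classification would come from the human oversight policy referenced in C14.2.1.

```python
from dataclasses import dataclass
from enum import Enum


class ActionClass(Enum):
    """Hypothetical action classes from a human oversight policy."""
    AUTONOMOUS = "autonomous"              # may run without a human
    REQUIRES_APPROVAL = "requires_approval"  # needs human sign-off first
    PROHIBITED = "prohibited"              # never allowed as a proactive action


@dataclass(frozen=True)
class TriggerContext:
    required_scope: str        # scope the proposed action needs
    granted_scopes: frozenset  # agent's current authorization scope
    action_class: ActionClass  # classification of the proposed action


def evaluate_trigger(ctx: TriggerContext) -> tuple:
    """Return (decision, reason) using only authorization scope and action class."""
    if ctx.required_scope not in ctx.granted_scopes:
        return ("deny", "scope-not-granted")
    if ctx.action_class is ActionClass.PROHIBITED:
        return ("deny", "action-prohibited")
    if ctx.action_class is ActionClass.REQUIRES_APPROVAL:
        return ("escalate", "human-approval-required")
    return ("allow", "within-scope-and-autonomous")


ctx = TriggerContext(
    required_scope="calendar:write",
    granted_scopes=frozenset({"calendar:read", "calendar:write"}),
    action_class=ActionClass.REQUIRES_APPROVAL,
)
decision = evaluate_trigger(ctx)
```

Both inputs are local to the agent, so no integration with an external monitoring system is needed to evaluate them.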

vtknightmare requested review from RicoKomenda and ottosulin June 2, 2026 13:55

vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from 1b84ba1 to f664ebd Compare June 2, 2026 16:03

vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from f664ebd to 2bbc2fe Compare June 2, 2026 19:44

ottosulin requested changes Jun 2, 2026

vtknightmare force-pushed the vtknightmare/fix/implementability-conditioners-and-scope-notes branch from 2bbc2fe to 037a5a3 Compare June 2, 2026 20:00

vtknightmare wants to merge 1 commit into OWASP:main from vtknightmare:vtknightmare/fix/implementability-conditioners-and-scope-notes

3 participants
