2 changes: 1 addition & 1 deletion .well-known/agents-shipgate.json
@@ -40,7 +40,7 @@
"trust_model": "static_by_default",
"schemas": {
"manifest": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/manifest-v0.1.json",
"report": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/report-schema.v0.20.json",
"report": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/report-schema.v0.21.json",
"packet": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/packet-schema.v0.6.json",
"checks_catalog": "https://raw.githubusercontent.com/ThreeMoonsLab/agents-shipgate/main/docs/checks.json"
},
10 changes: 6 additions & 4 deletions AGENTS.md
@@ -259,8 +259,9 @@ Other stable top-level fields:
- `release_decision.contribution_rules[]` (v0.17+, per-finding audit of how each finding contributed to the decision; one row per `report.findings` entry, with `category` ∈ `{blocker, review_item, excluded}` and `rule` ∈ `{policy_block_new, severity_block_new, policy_baseline_accepted, severity_baseline_accepted, review_required, sub_threshold, suppressed}`)
- `policy_audit.severity_overrides_applied[]` (v0.17+, top-of-report audit envelope listing every manifest-driven severity override with `{check_id, default_severity, applied_severity, manifest_path, reason, tier_crossed, direction, expires}`)
- `privacy_audit` (v0.18+, top-level audit proving default redaction ran before public artifacts were written; `redacted_paths[]` contains counts and structural paths only, never raw values or raw hashes)
- `heuristics_filter` (v0.21+, top-level audit envelope describing the `--no-heuristics` CLI filter pass; `enabled` is `False` and counts are zero when the flag is unset, so the field shape is stable across runs. When enabled, findings whose `provenance_kind` is `keyword_heuristic` or `regex_heuristic` are marked `suppressed=True` with `suppression_reason="filtered by --no-heuristics"` before the release decision is built — they remain in `findings[]` for transparency but do not gate release.)
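
A consumer-side sketch of the `heuristics_filter` behaviour described above: it lists the findings that the `--no-heuristics` pass suppressed. The field names come from the text; the `id` key and the trimmed sample report are hypothetical and exist only for illustration.

```python
from typing import Any

# Reason string quoted in the field description above; copied here, not imported.
FILTER_REASON = "filtered by --no-heuristics"


def heuristic_filtered_findings(report: dict[str, Any]) -> list[dict[str, Any]]:
    """Return findings suppressed by --no-heuristics, or [] when the flag was unset."""
    envelope = report.get("heuristics_filter", {})
    if not envelope.get("enabled", False):
        return []
    return [
        finding
        for finding in report.get("findings", [])
        if finding.get("suppressed")
        and finding.get("suppression_reason") == FILTER_REASON
    ]


# Hypothetical, heavily trimmed report (real reports carry many more fields).
sample = {
    "heuristics_filter": {"enabled": True, "filtered_finding_count": 1},
    "findings": [
        {"id": "a", "provenance_kind": "keyword_heuristic", "suppressed": True,
         "suppression_reason": "filtered by --no-heuristics"},
        {"id": "b", "provenance_kind": "static_declaration", "suppressed": False},
    ],
}
print([f["id"] for f in heuristic_filtered_findings(sample)])  # ['a']
```

Because filtered findings stay in `findings[]`, a consumer can show them for transparency while still treating them as non-gating.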

The full schema is at [`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json) (current; emitted reports carry `report_schema_version: "0.20"`, adding the top-level `reviewer_summary` block — a deterministic projection of reviewer-lens surfaces and audit envelopes). v0.19 (frozen at [`docs/report-schema.v0.19.json`](docs/report-schema.v0.19.json)) adds `Finding.policy_evidence_source` and `ReleaseDecisionItem.{source, policy_evidence_source}` for reviewer-grade dual-source provenance on top of v0.18's `privacy_audit`, v0.17's `policy_audit`, and `release_decision.contribution_rules[]` audit fields. What's-stable is documented in [STABILITY.md](STABILITY.md).
The full schema is at [`docs/report-schema.v0.21.json`](docs/report-schema.v0.21.json) (current; emitted reports carry `report_schema_version: "0.21"`, adding the top-level `heuristics_filter` audit envelope alongside v0.20's `reviewer_summary` block). v0.20 (frozen at [`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json)) added the `reviewer_summary` deterministic projection of reviewer-lens surfaces and audit envelopes. v0.19 added `Finding.policy_evidence_source` and `ReleaseDecisionItem.{source, policy_evidence_source}` for reviewer-grade dual-source provenance on top of v0.18's `privacy_audit`, v0.17's `policy_audit`, and `release_decision.contribution_rules[]` audit fields. What's-stable is documented in [STABILITY.md](STABILITY.md).

**Release gating signal**: prefer `release_decision.decision` (`"blocked" | "review_required" | "insufficient_evidence" | "passed"`) over `summary.status`. The new field is **baseline-aware** — a baseline-matched critical surfaces in `release_decision.review_items` (accepted debt), not `release_decision.blockers`. `summary.status` stays baseline-blind for v0.7 compatibility, so a baseline-matched-only critical produces both `summary.status = "release_blockers_detected"` AND `release_decision.decision = "review_required"` (intentional divergence — see [STABILITY.md](STABILITY.md#release_decisiondecision-vs-summarystatus)). `insufficient_evidence` (added v0.14) signals that the scan saw too many low-confidence tools or source-loader warnings to be trustworthy; consumers that switch on the enum must fall back to `review_required` for unknown future values.
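
The gating guidance above can be sketched as a small consumer. This is an illustration under assumptions, not the project's own code: `load_report` and `gating_decision` are hypothetical helper names, and treating a missing `release_decision` as `review_required` is an extra assumption beyond the documented fallback for unknown enum values.

```python
import json
from pathlib import Path

# The four documented enum values of release_decision.decision.
KNOWN_DECISIONS = {"blocked", "review_required", "insufficient_evidence", "passed"}


def load_report(path: str = "agents-shipgate-reports/report.json") -> dict:
    """Load the JSON report from the output location named in the docs."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def gating_decision(report: dict) -> str:
    """Prefer release_decision.decision; unknown values fall back to review_required."""
    decision = report.get("release_decision", {}).get("decision")
    if isinstance(decision, str) and decision in KNOWN_DECISIONS:
        return decision
    # Documented fallback for unknown future values; also applied here to a
    # missing field (assumption).
    return "review_required"


# Baseline-matched-only critical: the two signals intentionally diverge.
baseline_only = {
    "summary": {"status": "release_blockers_detected"},
    "release_decision": {"decision": "review_required"},
}
print(gating_decision(baseline_only))  # review_required
print(gating_decision({"release_decision": {"decision": "some_future_value"}}))  # review_required
```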

@@ -326,7 +327,7 @@ validation and [`docs/manifest-v0.1.md`](docs/manifest-v0.1.md) for prose.
### Where is the report schema?

Parse `agents-shipgate-reports/report.json` and validate against
[`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json) (current).
[`docs/report-schema.v0.21.json`](docs/report-schema.v0.21.json) (current).
Older reports (`report_schema_version: "0.10"`) validate against the
frozen [`docs/report-schema.v0.10.json`](docs/report-schema.v0.10.json).
Do not scrape Markdown when JSON is available.
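
One way to pick the matching schema file is to derive its path from `report_schema_version`. The sketch below assumes every frozen schema follows the `docs/report-schema.v<version>.json` naming seen above, which may not hold for every historical version. `schema_path_for` is a hypothetical helper, and the validation step itself (for example with a third-party JSON Schema validator) is left out.

```python
import re


def schema_path_for(report: dict) -> str:
    """Map a report's report_schema_version to the schema file it should validate against."""
    version = str(report.get("report_schema_version", ""))
    # Accept only versions shaped like "0.21" so the value cannot escape docs/.
    if not re.fullmatch(r"0\.\d+", version):
        raise ValueError(f"unexpected report_schema_version: {version!r}")
    return f"docs/report-schema.v{version}.json"


print(schema_path_for({"report_schema_version": "0.21"}))  # docs/report-schema.v0.21.json
print(schema_path_for({"report_schema_version": "0.10"}))  # docs/report-schema.v0.10.json
```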
@@ -364,7 +365,8 @@ For the short, current statement of "which fields to read", see [`docs/agent-con
| What | Path | Stable |
|---|---|---|
| Manifest schema | [`docs/manifest-v0.1.json`](docs/manifest-v0.1.json) | `0.1` |
| Report schema (current) | [`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json) | `0.20` |
| Report schema (current) | [`docs/report-schema.v0.21.json`](docs/report-schema.v0.21.json) | `0.21` |
| Report schema (v0.20 frozen reference) | [`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json) | `0.20` |
| Report schema (v0.19 frozen reference) | [`docs/report-schema.v0.19.json`](docs/report-schema.v0.19.json) | `0.19` |
| Report schema (v0.18 frozen reference) | [`docs/report-schema.v0.18.json`](docs/report-schema.v0.18.json) | `0.18` |
| Report schema (v0.17 frozen reference) | [`docs/report-schema.v0.17.json`](docs/report-schema.v0.17.json) | `0.17` |
@@ -399,7 +401,7 @@ Promised to not break in `0.x` minor versions. See [STABILITY.md](STABILITY.md)

| Command | Stable flags |
|---|---|
| `agents-shipgate scan` | `-c`, `--out`, `--format`, `--ci-mode`, `--fail-on`, `--baseline`, `--diff-from`, `--no-plugins`, `--verbose`, `--packet`/`--no-packet`, `--packet-format` |
| `agents-shipgate scan` | `-c`, `--out`, `--format`, `--ci-mode`, `--fail-on`, `--baseline`, `--diff-from`, `--no-plugins`, `--no-heuristics`, `--verbose`, `--packet`/`--no-packet`, `--packet-format` |
| `agents-shipgate evidence-packet` | `--from`, `--out`, `--format`, `--json` |
| `agents-shipgate init` | `--workspace`, `--write`, `--json` |
| `agents-shipgate doctor` | `-c`, `--workspace`, `--json`, `--verbose` |
54 changes: 54 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,60 @@

## Unreleased

- **v0.21 — `--no-heuristics` CLI flag closes the round-3 / round-4 E5
carryover.** `Finding.provenance_kind` has shipped on every report since
v0.15 as required+non-nullable wire metadata but had no consumer for
four review cycles. v0.21 lands the consumer the field was always
designed for: a security/GRC-friendly filter that excludes findings
whose provenance is `keyword_heuristic` or `regex_heuristic` from the
active release-gating set.
- New `--no-heuristics` flag on `agents-shipgate scan` (stable in
0.x). When set, findings whose `provenance_kind` is in
`NO_HEURISTICS_EXCLUDED_PROVENANCE_KINDS` (today: `keyword_heuristic`
and `regex_heuristic`) are marked `suppressed=True` with
`suppression_reason="filtered by --no-heuristics"` BEFORE the
release decision is built. Filtered findings remain in `findings[]`
for transparency; they no longer gate release. The KEEP list is
`static_declaration`, `ast_extraction`, and `policy_pack` —
declared/parsed-shape findings and explicit external rules stay in
scope.
- New top-level `report.heuristics_filter` audit envelope. Required +
always present on emitted scans regardless of whether the flag was
set (parallel to `privacy_audit` shape). Fields: `enabled`,
`excluded_provenance_kinds: list[str]`, `filtered_finding_count`,
`filtered_by_kind: dict[str, int]`. Earns the contract weight of
`Finding.provenance_kind` by giving it a first-class consumer.
- Manifest-driven suppression wins on overlap: a finding the user
explicitly suppressed via `checks.ignore` keeps the user's reason
text even when its provenance_kind would have triggered the
filter. The audit envelope still counts the overlap so reviewers
see the filter's effective scope.
- `ReviewerSummary` lens/audit counts already reflect the post-filter
active set (the filter runs before `build_reviewer_summary`); no
new field added to `ReviewerSummary` — the dedicated envelope is
the right audit home.
- Schema bump: `report_schema_version: "0.20"` → `"0.21"`. v0.20 moves
to frozen-reference; existing v0.20 consumers ignore the new field.
- Contract-stamp pin in `docs/architecture.md` bumped to date
`2026-05-23`, report `v0.21`, packet `v0.6` (unchanged). The
`test_architecture_doc_contract_stamp_matches_runtime` regression
test moves in lockstep.
- 11 new tests in `tests/test_no_heuristics.py` covering: pure-
function filter semantics (KEEP / FILTER classifications per
provenance_kind), envelope shape parity across enabled=True/False,
manifest-suppression preservation, contract-list completeness
(every value in `NO_HEURISTICS_EXCLUDED_PROVENANCE_KINDS` is a
real `ProvenanceKind`; KEEP+EXCLUDE partition is exact), end-to-
end `run_scan(no_heuristics=True)`, CLI subprocess smoke test,
monotone-non-increasing reviewer-summary lens counts under
filtering.
- **Decision recorded.** Round-4 review's E5 carryover offered ship-
or-retire on `provenance_kind`. We ship. Retiring would have forced
a deprecation cycle on a stable-contract field used by every
report since v0.15; shipping the flag earns the weight and serves
a real audience (security/GRC reviewers triaging declared-only
findings before promotion).

- **v0.21 — CI coverage gate raised from 75% → 85% (E7 from round-4 review).**
Both `.github/workflows/ci.yml` and `.github/workflows/release.yml` now
pass `--cov-fail-under=85`. Aggregate coverage on `main` at the time of
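
The `--no-heuristics` changelog entry above describes a filter pass that runs before the release decision is built. A minimal sketch of those semantics follows, assuming a simplified `Finding` dataclass and a hypothetical `apply_no_heuristics` function; neither is the package's real type or API. The value of `excluded_provenance_kinds` when the flag is unset is not stated in the changelog, so the empty list used here is a guess.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Mirrors the two kinds the changelog lists for NO_HEURISTICS_EXCLUDED_PROVENANCE_KINDS.
EXCLUDED_KINDS = ("keyword_heuristic", "regex_heuristic")
FILTER_REASON = "filtered by --no-heuristics"


@dataclass
class Finding:  # simplified stand-in, not the real model
    check_id: str
    provenance_kind: str
    suppressed: bool = False
    suppression_reason: Optional[str] = None


def apply_no_heuristics(findings: list[Finding], enabled: bool) -> dict:
    """Mark heuristic findings suppressed in place and return the audit envelope."""
    by_kind = Counter()
    if enabled:
        for finding in findings:
            if finding.provenance_kind not in EXCLUDED_KINDS:
                continue  # KEEP list: static_declaration, ast_extraction, policy_pack
            by_kind[finding.provenance_kind] += 1  # overlaps are still counted
            if not finding.suppressed:  # manifest suppression keeps its own reason
                finding.suppressed = True
                finding.suppression_reason = FILTER_REASON
    return {
        "enabled": enabled,
        "excluded_provenance_kinds": list(EXCLUDED_KINDS) if enabled else [],  # guess when disabled
        "filtered_finding_count": sum(by_kind.values()),
        "filtered_by_kind": dict(by_kind),
    }


# Hypothetical check ids and reason text, for illustration only.
findings = [
    Finding("check-1", "keyword_heuristic"),
    Finding("check-2", "regex_heuristic", True, "accepted via checks.ignore"),
    Finding("check-3", "static_declaration"),
]
envelope = apply_no_heuristics(findings, enabled=True)
print(envelope["filtered_finding_count"])  # 2
print(findings[1].suppression_reason)      # accepted via checks.ignore
```

The envelope has the same keys whether or not the flag is set, which matches the stable-shape requirement stated in the entry.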
8 changes: 4 additions & 4 deletions README.md
@@ -220,7 +220,7 @@ Set `pr_comment: "true"` to post a compact PR summary:

## What it produces

- **Tool-Use Readiness Report** — `agents-shipgate-reports/report.{md,json,sarif}`. Markdown for human release review, JSON for tools and coding agents (current schema [v0.20](docs/report-schema.v0.20.json); gating signal is `release_decision.decision`; v0.20 adds the top-level `reviewer_summary` block, a deterministic projection of reviewer lens + audit surfaces parallel to `agent_summary`; v0.19 added reviewer-grade dual-source provenance on top of v0.18's privacy audit), SARIF for GitHub code-scanning workflows.
- **Tool-Use Readiness Report** — `agents-shipgate-reports/report.{md,json,sarif}`. Markdown for human release review, JSON for tools and coding agents (current schema [v0.21](docs/report-schema.v0.21.json); gating signal is `release_decision.decision`; v0.21 adds the top-level `heuristics_filter` envelope, the audit pass for the new `--no-heuristics` CLI flag, alongside v0.20's `reviewer_summary` block; v0.19 added reviewer-grade dual-source provenance on top of v0.18's privacy audit), SARIF for GitHub code-scanning workflows.
- **Release Evidence Packet** — `agents-shipgate-reports/packet.{md,json,html}` (and `packet.pdf` with the `[pdf]` extras). Reviewer-shaped synthesis with fixed sections, including the compact evidence matrix plus tool-surface and action-surface diffs when available. Packet outputs are locally redacted by default and governed by [packet schema v0.6](docs/packet-schema.v0.6.json) — see [STABILITY.md §Release Evidence Packet](STABILITY.md#release-evidence-packet-v06).

## Exit codes
@@ -256,7 +256,7 @@ Agents Shipgate is designed to be agent-friendly. If you're a coding agent (Clau
- **[`prompts/`](prompts/)** — reusable prompts for common workflows
- **[`skills/agents-shipgate/`](skills/agents-shipgate/)** + **[`.claude/commands/shipgate.md`](.claude/commands/shipgate.md)** — self-contained Claude Code skill (bundled prompts and CI recipe) and `/shipgate` slash command. See [`docs/agents/use-with-claude-code.md`](docs/agents/use-with-claude-code.md) to install in your own project.
- **[`docs/ai-search-summary.md`](docs/ai-search-summary.md)** — human-readable summary for AI search, answer engines, and coding agents
- **[`docs/manifest-v0.1.json`](docs/manifest-v0.1.json)** + **[`docs/report-schema.v0.20.json`](docs/report-schema.v0.20.json)** — JSON Schemas for live editor validation (current; emitted reports carry `report_schema_version: "0.20"`). v0.20 adds the top-level `reviewer_summary` block (lens + audit activity counts plus a `first_recommended_surface` pointer, parallel to `agent_summary` for the reviewer side); v0.19 added `Finding.policy_evidence_source` and `ReleaseDecisionItem.{source, policy_evidence_source}` for reviewer-grade dual-source provenance; v0.18 added `privacy_audit`; v0.17 added `policy_audit` and `release_decision.contribution_rules[]`. Read `release_decision.decision` for release gating in new consumers; read `agent_summary.first_recommended_action` for a deterministic next agent step and `reviewer_summary.first_recommended_surface` for the recommended human-review entry point.
- **[`docs/manifest-v0.1.json`](docs/manifest-v0.1.json)** + **[`docs/report-schema.v0.21.json`](docs/report-schema.v0.21.json)** — JSON Schemas for live editor validation (current; emitted reports carry `report_schema_version: "0.21"`). v0.21 adds the top-level `heuristics_filter` envelope — the audit pass for the new `--no-heuristics` CLI flag — alongside v0.20's `reviewer_summary` block (lens + audit activity counts plus a `first_recommended_surface` pointer, parallel to `agent_summary` for the reviewer side); v0.19 added `Finding.policy_evidence_source` and `ReleaseDecisionItem.{source, policy_evidence_source}` for reviewer-grade dual-source provenance; v0.18 added `privacy_audit`; v0.17 added `policy_audit` and `release_decision.contribution_rules[]`. Read `release_decision.decision` for release gating in new consumers; read `agent_summary.first_recommended_action` for a deterministic next agent step and `reviewer_summary.first_recommended_surface` for the recommended human-review entry point.
- **[`docs/checks.json`](docs/checks.json)** — machine-readable check catalog
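
The "which fields to read" advice in the schema bullet above can be sketched as a small accessor. `next_steps` is a hypothetical helper, and the placeholder values in the sample are invented; the real shape of `first_recommended_action` and `first_recommended_surface` is defined by the report schema, not by this sketch.

```python
from typing import Any


def next_steps(report: dict[str, Any]) -> dict[str, Any]:
    """Collect the gating signal and the two 'first recommended' pointers, if present."""
    return {
        "gate": report.get("release_decision", {}).get("decision"),
        "agent_action": report.get("agent_summary", {}).get("first_recommended_action"),
        "reviewer_surface": report.get("reviewer_summary", {}).get("first_recommended_surface"),
    }


# Placeholder values only; consult the schema for the real shapes.
sample = {
    "release_decision": {"decision": "passed"},
    "agent_summary": {"first_recommended_action": "<action placeholder>"},
    "reviewer_summary": {"first_recommended_surface": "<surface placeholder>"},
}
print(next_steps(sample)["gate"])  # passed
```

Missing blocks yield `None` rather than raising, so the accessor also works on reports older than v0.20 that lack `reviewer_summary`.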

Every command has a `--json` form. Errors emit a structured `next_action` line on stderr when `AGENTS_SHIPGATE_AGENT_MODE=1`.
@@ -444,7 +444,7 @@ Agents Shipgate is a static, manifest-first scanner. It is intentionally narrow:
- It does not verify runtime behavior, latency, prompt quality, or routing decisions.
- It does not replace dynamic security testing or human security review of the underlying systems.
- It only inspects what is declared in `shipgate.yaml`, local OpenAPI specs, MCP exports, Anthropic/OpenAI API artifacts, optional SDK AST metadata, static Google ADK/LangChain/CrewAI/n8n inputs, and static Codex plugin package metadata; tools that are not declared or statically discoverable are not scanned.
- The manifest remains `version: "0.1"` so existing configs keep working. Current reports carry `report_schema_version: "0.20"` (additive over v0.19's dual-source provenance, adding the `reviewer_summary` top-level block — a deterministic projection of reviewer lens activity and audit envelope counts plus a `first_recommended_surface` pointer) while preserving the stable payload contract documented in the report schema.
- The manifest remains `version: "0.1"` so existing configs keep working. Current reports carry `report_schema_version: "0.21"` (additive over v0.20's `reviewer_summary`, adding the `heuristics_filter` top-level audit envelope — the deterministic projection of the `--no-heuristics` CLI filter pass) while preserving the stable payload contract documented in the report schema.

See [ROADMAP.md](ROADMAP.md) for what is planned next.

@@ -521,7 +521,7 @@ readers and AI search ingest.
- [Check catalog](docs/checks.md)
- [Policy packs](docs/policy-packs.md)
- [Baseline workflow](docs/baseline.md)
- [JSON report schema v0.20](docs/report-schema.v0.20.json)
- [JSON report schema v0.21](docs/report-schema.v0.21.json)
- [Privacy and redaction](docs/privacy.md)
- [Trust model](docs/trust-model.md)
- [AI search summary](docs/ai-search-summary.md)