Merged
51 changes: 51 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,57 @@

## Unreleased

- **v0.20 — third-party adapter entry-point discovery (E4 from round-3 review).**
Opens the same extension surface for adapters (input loaders) that M5
already opened for check plugins. Discovery is gated by the existing
`AGENTS_SHIPGATE_ENABLE_PLUGINS=1` env var and `--no-plugins` CLI flag.
- New entry-point group: `agents_shipgate.adapters`. A third-party
package declares an adapter class (or instance) in its
`pyproject.toml` under
`[project.entry-points."agents_shipgate.adapters"]`; the class must
satisfy the `ToolSourceAdapter` Protocol — `source_type` ClassVar,
`scope` ClassVar (`per_source` or `per_scan`), `artifact_class`
ClassVar, and a `load(source, base_dir, manifest)` method.
- New module `src/agents_shipgate/inputs/adapter_validation.py` with
four load-time gates: `load_failed`, `bad_protocol`, `bad_scope`,
and **`source_type_collision`** — the load-bearing trust rule
rejecting any third-party adapter whose `source_type` shadows a
built-in or another already-registered third-party adapter.
- New top-level function `discover_third_party_adapters(registry, *,
plugins_enabled, loaded_adapters)` in `inputs/protocol.py` walks
`entry_points(group="agents_shipgate.adapters")`, validates each entry,
and registers the valid ones on the supplied registry. Both
valid and invalid records surface in
`report.loaded_adapters[]` so reviewers can see what was skipped.
- New report field `loaded_adapters: list[dict[str, Any]]` parallel
to `loaded_plugins[]`. Items carry `name`, `value`, `distribution`,
`version`, `source_type`, `validation_status`,
`validation_errors[]`, `runtime_errors[]`. Required + present on
every emitted scan (empty list when `--no-plugins` or no
third-party adapters are installed). The schema generator marks
each item's eight fields as required.
- `--strict-plugins` (v0.17+) extended to cover adapter failures.
Any non-`valid` `loaded_adapters[]` row OR non-empty
`loaded_adapters[].runtime_errors` now elevates the scan to exit
code 4 alongside the existing plugin failures.
- `--no-plugins` flag help text updated to mention that third-party
adapter discovery is also disabled.
- `run_validated_adapter` (in `adapter_validation.py`) provides a
runtime safety wrapper for callers that want to capture
exceptions into `loaded_adapters[].runtime_errors` instead of
propagating them. The dispatcher's existing `_absorb` artifact-
class check already fires `TypeError` for artifact smuggling;
runtime wrapping is opt-in for future adapter-execution paths.
- 21 new tests in `tests/test_adapter_entry_point_discovery.py`:
each of the four gates + valid-class + valid-instance + env-var
gating + `--no-plugins` overrides + collision-with-each-builtin
parametrize + collision-between-third-parties + `--strict-plugins`
end-to-end + runtime safety net (exception capture, wrong return
type, artifact smuggling).
- STABILITY.md gains a new "Third-party adapter discovery (v0.20+)"
subsection under "Trust-model invariants" documenting the four
gates + the `source_type_collision` load-bearing rule.

- **v0.20 — top-level `reviewer_summary` block.** Adds a deterministic
projection of the reviewer lens surfaces (`tool_surface_diff`,
capability/intent diff, `action_surface_diff`, evidence matrix) and
29 changes: 28 additions & 1 deletion STABILITY.md
@@ -361,7 +361,7 @@ If a contributor introduces a real need for one of the forbidden surfaces,
update this section in the same PR. The intent is not "we tried to forbid X"
— it is that X is *structurally absent* from the scanner's parsing path.

Plugins are off by default. `AGENTS_SHIPGATE_ENABLE_PLUGINS=1` enables loading; `--no-plugins` overrides at the CLI level. When loaded, every plugin is enumerated in `report.loaded_plugins`.
Plugins are off by default. `AGENTS_SHIPGATE_ENABLE_PLUGINS=1` enables loading; `--no-plugins` overrides at the CLI level. When loaded, every plugin is enumerated in `report.loaded_plugins`, and every third-party adapter (v0.20+) is enumerated in `report.loaded_adapters`.

Plugin validation (v0.17+ / M5). Every entry point is checked against five load-time gates before it can run:

@@ -373,6 +373,33 @@ Plugin validation (v0.17+ / M5). Every entry point is checked against five load-

Plugins that pass every gate run with the same trust as built-ins. Runtime validation additionally drops findings whose `Finding.check_id` does not match the plugin's declared `id`/`check_id`, drops non-`Finding` items, and captures any exception raised during the plugin call into `loaded_plugins[].runtime_errors`. The scan continues regardless; `--strict-plugins` elevates any non-`valid` plugin or non-empty `runtime_errors` to exit code 4.

#### Third-party adapter discovery (v0.20+)

Third-party adapters register through the `agents_shipgate.adapters` Python entry-point group and provide a class (or instance) satisfying the `ToolSourceAdapter` Protocol — a `source_type: str` ClassVar, a `scope: Literal["per_source", "per_scan"]` ClassVar, an `artifact_class: type | None` ClassVar, and a `load(source, base_dir, manifest)` method returning `LoadedAdapterResult`. Discovery is gated by the same `AGENTS_SHIPGATE_ENABLE_PLUGINS=1` env var as plugin checks; `--no-plugins` forces it off.
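
As a rough illustration of the Protocol shape described above, a third-party adapter might look like the following sketch. The package name `acme_shipgate`, the class `AcmeToolsAdapter`, the `StandInResult` dataclass (a placeholder for `LoadedAdapterResult`, whose real fields are not shown in this document), and the assumption that `source` is a mapping with a `path` key are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, ClassVar, Literal

# Hypothetical packaging snippet (pyproject.toml of the third-party package);
# the package and module names are made up for illustration:
#
#   [project.entry-points."agents_shipgate.adapters"]
#   acme_tools = "acme_shipgate.adapter:AcmeToolsAdapter"


@dataclass
class StandInResult:
    """Stand-in for LoadedAdapterResult, whose real fields are not shown here."""

    tools: list[dict[str, Any]] = field(default_factory=list)


class AcmeToolsAdapter:
    """Illustrative per-source adapter exposing the four Protocol members."""

    source_type: ClassVar[str] = "acme_tools"
    scope: ClassVar[Literal["per_source", "per_scan"]] = "per_source"
    artifact_class: ClassVar[type | None] = None

    def load(self, source: Any, base_dir: Path, manifest: Any) -> StandInResult:
        # Assumption: `source` is a mapping with a "path" key. This sketch only
        # records the resolved location and does not read the file.
        resolved = Path(base_dir) / source["path"]
        return StandInResult(tools=[{"name": "example_tool", "origin": str(resolved)}])


adapter = AcmeToolsAdapter()
result = adapter.load({"path": "tools.json"}, Path("configs"), manifest={})
```

The class carries no base class on purpose: the text describes a structural Protocol, so only the attribute names and the `load` call shape matter.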

Every discovered entry point is checked against four load-time gates before it can register on the scan's adapter registry:

1. **load** — `entry_point.load()` must not raise. Captured as `validation_status="load_failed"`.
2. **bad_protocol** — the loaded value (a class is instantiated with no args; an instance is used directly) must have all three ClassVars (`source_type` non-empty string, `scope`, `artifact_class`) and a callable `load` method that accepts the three positional arguments `(source, base_dir, manifest)`: at least three positional slots (or `*args`), no more than three required positional parameters, and no required keyword-only parameters. Captured as `validation_status="bad_protocol"`.
3. **bad_scope** — `scope` must be exactly `"per_source"` or `"per_scan"`. Out-of-range values would be silently skipped by the dispatcher. Captured as `validation_status="bad_scope"`.
4. **source_type_collision** — the adapter's `source_type` must not shadow a built-in (`mcp`, `openapi`, `langchain`, etc.) or another third-party adapter discovered earlier in the same scan. **This is the load-bearing trust rule** — without it, a malicious plugin could displace a built-in adapter and intercept every scan targeting that source type. Captured as `validation_status="source_type_collision"`.
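
The four gates above can be approximated with a short classifier. This is a minimal sketch, not the code in `adapter_validation.py`: the names `classify` and `accepts_three_positionals` are hypothetical, and treating a class that fails to instantiate as `bad_protocol` is an assumption the text does not confirm.

```python
import inspect
from typing import Any, Callable, Collection

VALID_SCOPES = ("per_source", "per_scan")


def accepts_three_positionals(func: Callable[..., Any]) -> bool:
    """Check that func(source, base_dir, manifest) is a legal call shape."""
    try:
        params = list(inspect.signature(func).parameters.values())
    except (TypeError, ValueError):
        return False
    P = inspect.Parameter
    positional = [p for p in params if p.kind in (P.POSITIONAL_ONLY, P.POSITIONAL_OR_KEYWORD)]
    has_var_positional = any(p.kind is P.VAR_POSITIONAL for p in params)
    required_positional = [p for p in positional if p.default is P.empty]
    required_kwonly = [p for p in params if p.kind is P.KEYWORD_ONLY and p.default is P.empty]
    enough_slots = len(positional) >= 3 or has_var_positional
    return enough_slots and len(required_positional) <= 3 and not required_kwonly


def classify(loader: Callable[[], Any], taken_source_types: Collection[str]) -> str:
    """Return a validation_status string for one entry point (sketch)."""
    try:
        value = loader()  # gate 1: stands in for entry_point.load()
    except Exception:
        return "load_failed"
    try:
        adapter = value() if inspect.isclass(value) else value
    except Exception:
        return "bad_protocol"  # assumption: instantiation failure lands here
    source_type = getattr(adapter, "source_type", None)
    load = getattr(adapter, "load", None)
    protocol_ok = (
        isinstance(source_type, str)
        and bool(source_type)
        and hasattr(adapter, "scope")
        and hasattr(adapter, "artifact_class")
        and callable(load)
        and accepts_three_positionals(load)
    )
    if not protocol_ok:
        return "bad_protocol"  # gate 2
    if adapter.scope not in VALID_SCOPES:
        return "bad_scope"  # gate 3
    if source_type in taken_source_types:
        return "source_type_collision"  # gate 4
    return "valid"


class GoodAdapter:
    source_type = "acme_tools"
    scope = "per_source"
    artifact_class = None

    def load(self, source, base_dir, manifest):
        return None


class WeeklyAdapter(GoodAdapter):
    scope = "weekly"  # not one of the two allowed scopes


class ShadowAdapter(GoodAdapter):
    source_type = "mcp"  # shadows a built-in source type


class TwoArgAdapter(GoodAdapter):
    def load(self, source, base_dir):  # one positional slot short
        return None


def broken_loader():
    raise ImportError("module not found")


taken = {"mcp", "openapi", "langchain"}
statuses = {
    "good": classify(lambda: GoodAdapter, taken),
    "instance": classify(lambda: GoodAdapter(), taken),
    "weekly": classify(lambda: WeeklyAdapter, taken),
    "shadow": classify(lambda: ShadowAdapter, taken),
    "two_arg": classify(lambda: TwoArgAdapter, taken),
    "broken": classify(broken_loader, taken),
}
```

Ordering matters in this sketch: the collision check runs last, so a malformed adapter is reported as `bad_protocol` or `bad_scope` before its `source_type` is ever compared against the registry.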

**Per-scan registry contract.** Adapters that pass every gate register on a **per-scan clone** of the global `REGISTRY` (built at the start of each `run_scan` / `inspect_sources` via `AdapterRegistry.clone()`), NOT on the global itself. The global stays builtin-only across the lifetime of the process. This guarantees two trust invariants:

- **`--no-plugins` is per-scan honest.** A later in-process scan with `plugins_enabled=False` sees a fresh builtin-only clone — no third-party adapters carried over from a prior enabled scan.
- **Collision detection is per-scan honest.** The collision set is the clone's builtins-only state, so two consecutive scans of the same valid third-party adapter both classify as `validation_status="valid"`, never as `source_type_collision` against the adapter's own previous registration.

The dispatcher walks the per-scan registry in the same pass-1 (per-source, in `tool_sources[]` declared order) / pass-2 (per-scan, in canonical registry order) loops it uses for built-ins. Two trust mechanisms protect the dispatch path:

- **Artifact-class smuggling prevention.** The dispatcher's `_absorb` step fires `TypeError` if any adapter (built-in or third-party) declares one `artifact_class` but returns an artifact of another type. This is the structural counterpart to the `Finding.check_id` smuggling rule for plugin checks.
- **Runtime-error capture for third-party adapters.** Third-party adapters that raise at runtime do NOT abort the scan. The dispatcher routes their `load()` call through `run_validated_adapter` (from `inputs/adapter_validation.py`), which catches every exception, captures it into `loaded_adapters[].runtime_errors` on the matching row, and signals the dispatcher to skip absorbing the (None) result. Built-in adapters keep the direct call shape — a built-in raising means the scanner itself is broken and must abort loudly.
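
The two dispatch-path protections above could be sketched as follows. The helper names `run_adapter_safely` and `check_artifact`, the error-string format, and the rule that a `None` artifact skips the type check are all assumptions; the real `run_validated_adapter` signature is not shown in this document.

```python
from typing import Any


def run_adapter_safely(adapter: Any, row: dict[str, Any], source: Any,
                       base_dir: Any, manifest: Any) -> Any:
    """Call adapter.load and record any exception on the provenance row."""
    try:
        return adapter.load(source, base_dir, manifest)
    except Exception as exc:  # BaseException (e.g. KeyboardInterrupt) still propagates
        row["runtime_errors"].append(f"{type(exc).__name__}: {exc}")
        return None  # signals the caller to skip absorbing a result


def check_artifact(adapter: Any, artifact: Any) -> None:
    """Raise TypeError when the artifact does not match the declared class."""
    expected = adapter.artifact_class
    if expected is not None and artifact is not None and not isinstance(artifact, expected):
        raise TypeError(
            f"{adapter.source_type}: expected {expected.__name__}, "
            f"got {type(artifact).__name__}"
        )


class FlakyAdapter:
    source_type = "acme_tools"
    scope = "per_source"
    artifact_class = None

    def load(self, source, base_dir, manifest):
        raise RuntimeError("upstream unavailable")


class SmugglingAdapter:
    source_type = "acme_reports"
    scope = "per_scan"
    artifact_class = dict  # declares dict but returns a list below

    def load(self, source, base_dir, manifest):
        return ["not", "a", "dict"]


row = {"name": "acme_tools", "validation_status": "valid", "runtime_errors": []}
flaky_result = run_adapter_safely(FlakyAdapter(), row, {}, ".", {})

smuggler = SmugglingAdapter()
try:
    check_artifact(smuggler, smuggler.load({}, ".", {}))
    smuggling_detected = False
except TypeError:
    smuggling_detected = True
```

Keeping the two helpers separate mirrors the text: runtime capture applies only to third-party adapters, while the artifact-class check applies to every adapter.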

`doctor` (`inspect_sources`) uses the same per-scan registry clone + discovery + dispatcher path as `scan`, so manifests referencing third-party `tool_sources[].type` values are introspectable. The doctor payload surfaces `loaded_adapters[]` alongside the existing `policy_packs` field.

`--strict-plugins` (v0.17+) covers BOTH plugin and adapter failures from v0.20+ — any non-`valid` `loaded_plugins[]` row, any non-empty `loaded_plugins[].runtime_errors`, any non-`valid` `loaded_adapters[]` row, OR any non-empty `loaded_adapters[].runtime_errors` elevates the scan to exit code 4. Default behavior remains lenient — failures are recorded in the respective provenance arrays and the scan proceeds.
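
The strict-mode rule in the paragraph above reduces to a predicate over the two provenance arrays. In this sketch the function names, the sample row values, and the idea that exit code 4 simply replaces an otherwise unchanged base code are assumptions for illustration.

```python
from typing import Any


def has_strict_failure(report: dict[str, Any]) -> bool:
    """True if any plugin or adapter row is non-valid or has runtime errors."""
    rows = [*report.get("loaded_plugins", []), *report.get("loaded_adapters", [])]
    return any(
        row.get("validation_status") != "valid" or bool(row.get("runtime_errors"))
        for row in rows
    )


def exit_code(report: dict[str, Any], strict_plugins: bool, base_code: int = 0) -> int:
    # Assumption: how exit code 4 interacts with other failure codes is not
    # specified in the text; this sketch only models the elevation itself.
    if strict_plugins and has_strict_failure(report):
        return 4
    return base_code


report = {
    "loaded_plugins": [],
    "loaded_adapters": [
        {
            "name": "acme_tools",
            "value": "acme_shipgate.adapter:AcmeToolsAdapter",
            "distribution": "acme-shipgate",
            "version": "1.0.0",
            "source_type": "mcp",
            "validation_status": "source_type_collision",
            "validation_errors": ["source_type 'mcp' shadows a built-in adapter"],
            "runtime_errors": [],
        }
    ],
}

lenient = exit_code(report, strict_plugins=False)
strict = exit_code(report, strict_plugins=True)
```

The sample row carries the eight fields the schema marks as required, so the same structure can double as a fixture when testing report consumers.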

**Manifest `tool_sources[].type`.** The field is `str` (relaxed from a closed `Literal` in v0.20) so manifests can reference third-party per-source adapters by name. Built-in source types are enumerated in `BUILTIN_TOOL_SOURCE_TYPES` for documentation and tooling; per-scan-only built-ins (`n8n`, `openai_api`, `anthropic_api`, `validation`) are still rejected at manifest-load time with a routable error pointing the user to the dedicated top-level manifest section. Unknown source types — both genuine third-party names with no registered adapter and typos of built-in names — fail with `ConfigError` (exit 2) when the dispatcher's `AdapterRegistry.require` cannot resolve them. The exit-2 contract is unchanged from prior releases; the failure layer (manifest-load vs dispatch) may differ.
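
The two failure layers described above (manifest-load versus dispatch) can be illustrated with a small resolver. `ConfigErrorSketch`, `check_manifest_type`, `require`, and `resolve` are hypothetical stand-ins; using the same exception type for the manifest-load rejection is an assumption, since the text only calls it a routable error.

```python
class ConfigErrorSketch(Exception):
    """Stand-in for ConfigError; exit code 2 is taken from the text."""

    exit_code = 2


# Per-scan-only built-ins named in the text.
PER_SCAN_ONLY_BUILTINS = frozenset({"n8n", "openai_api", "anthropic_api", "validation"})


def check_manifest_type(type_name: str) -> None:
    """Manifest-load layer: reject built-ins that have their own manifest section."""
    if type_name in PER_SCAN_ONLY_BUILTINS:
        raise ConfigErrorSketch(
            f"'{type_name}' is configured in its own top-level manifest section, "
            "not in tool_sources[]"
        )


def require(registered: set[str], type_name: str) -> str:
    """Dispatch layer: fail when no registered adapter handles the type."""
    if type_name not in registered:
        raise ConfigErrorSketch(f"no registered adapter handles source type '{type_name}'")
    return type_name


def resolve(registered: set[str], type_name: str) -> str:
    try:
        check_manifest_type(type_name)
        return require(registered, type_name)
    except ConfigErrorSketch as exc:
        return f"exit {exc.exit_code}: {exc}"


registered = {"mcp", "openapi", "acme_tools"}
outcomes = {name: resolve(registered, name) for name in ("acme_tools", "opnapi", "n8n")}
```

A typo such as `opnapi` and an uninstalled third-party name are indistinguishable at the dispatch layer, which matches the text's note that both fail the same way.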

### Manifest Schema

The manifest schema version (`version: "0.1"`) is independent of the CLI
1 change: 1 addition & 0 deletions docs/diagnostics.md
@@ -79,6 +79,7 @@ diagnostics where no command should run.
| `SHIP-DIAG-MISSING-SOURCE-FILE` | block | A required `tool_sources[].path` does not resolve under the manifest directory. (`doctor` no longer raises `InputParseError(3)` for this — see below.) |
| `SHIP-DIAG-CHANGE-ME-PLACEHOLDERS` | warn | Manifest text still contains `CHANGE_ME` markers. |
| `SHIP-DIAG-NO-PRODUCTION-PERMISSIONS` | warn | `environment.target: production` AND no permissions / scopes / policies declared. |
| `SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE` | block | Manifest references a `tool_sources[].type` that no registered adapter handles. Rank-1 action depends on plugin state: enable plugin discovery (`AGENTS_SHIPGATE_ENABLE_PLUGINS=1`) and install the third-party adapter package, or fix a typo. v0.20+. |

## Negative-control precedence

9 changes: 0 additions & 9 deletions docs/manifest-v0.1.json
@@ -1502,15 +1502,6 @@
"title": "Trust"
},
"type": {
"enum": [
"mcp",
"openapi",
"openai_agents_sdk",
"google_adk",
"langchain",
"crewai",
"codex_plugin"
],
"title": "Type",
"type": "string"
}
19 changes: 19 additions & 0 deletions docs/report-schema.v0.20.json
@@ -4251,6 +4251,24 @@
"title": "Generated Reports",
"type": "object"
},
"loaded_adapters": {
"items": {
"additionalProperties": true,
"required": [
"distribution",
"name",
"runtime_errors",
"source_type",
"validation_errors",
"validation_status",
"value",
"version"
],
"type": "object"
},
"title": "Loaded Adapters",
"type": "array"
},
"loaded_plugins": {
"items": {
"additionalProperties": true,
@@ -4386,6 +4404,7 @@
"findings",
"frameworks",
"generated_reports",
"loaded_adapters",
"loaded_plugins",
"loaded_policy_packs",
"misalignments",
1 change: 1 addition & 0 deletions docs/report-sensitive-fields.json
@@ -40,6 +40,7 @@
{"surface": "report", "path": "generated_reports", "classification": "path_metadata"},
{"surface": "report", "path": "loaded_policy_packs", "classification": "path_metadata"},
{"surface": "report", "path": "loaded_plugins", "classification": "path_metadata"},
{"surface": "report", "path": "loaded_adapters", "classification": "path_metadata"},
{"surface": "report", "path": "tool_inventory", "classification": "credential_metadata"},
{"surface": "report", "path": "source_warnings", "classification": "free_text"},
{"surface": "report", "path": "agent_summary", "classification": "free_text"},
30 changes: 30 additions & 0 deletions scripts/generate_schemas.py
@@ -221,6 +221,13 @@ def build_report_schema() -> tuple[Path, str]:
"generated_reports",
"loaded_policy_packs",
"loaded_plugins",
# v0.20: third-party adapter provenance (parallels
# loaded_plugins[]). Optional in Python via
# ``Field(default_factory=list)`` for test-helper minimal
# reports; emitted scans always populate it (empty list
# when --no-plugins is set or no third-party adapters are
# installed). Required + non-nullable on the wire.
"loaded_adapters",
"tool_inventory",
"source_warnings",
# v0.12: agent_summary is the deterministic top-level
@@ -817,6 +824,29 @@ def build_report_schema() -> tuple[Path, str]:
]
),
}
# v0.20: adapter validation provenance — parallel shape to
# loaded_plugins[] but the ID key is ``source_type`` (the dispatcher
# key) rather than ``check_id``. ``validation_status`` is one of
# ``valid | load_failed | bad_protocol | bad_scope |
# source_type_collision``; the two error lists are always present
# (empty for clean adapters).
if "loaded_adapters" in properties and properties["loaded_adapters"].get("type") == "array":
properties["loaded_adapters"]["items"] = {
"type": "object",
"additionalProperties": True,
"required": sorted(
[
"name",
"value",
"distribution",
"version",
"source_type",
"validation_status",
"validation_errors",
"runtime_errors",
]
),
}

# frameworks.{google_adk,langchain,crewai} surface counts. These are
# also list[dict[str, Any]]-shaped at the model level; v0.5 enumerated