Skip to content

Add third-party adapter entry-point discovery (v0.20 E4)#111

Merged
pengfei-threemoonslab merged 6 commits into
mainfrom
feat/v020-adapter-entry-points
May 22, 2026
Merged

Add third-party adapter entry-point discovery (v0.20 E4)#111
pengfei-threemoonslab merged 6 commits into
mainfrom
feat/v020-adapter-entry-points

Conversation

@pengfei-threemoonslab
Copy link
Copy Markdown
Contributor

Summary

Closes E4 from the round-3 architecture review (Nov/Dec 2025). Opens the same extension surface for adapters (input loaders) that M5 already opened for plugin checks. Adapters now register through the agents_shipgate.adapters entry-point group, gated by the existing AGENTS_SHIPGATE_ENABLE_PLUGINS=1 env var and --no-plugins CLI flag.

Why

Third parties that wanted to ship a new input adapter (AutoGen, LlamaIndex, Semantic Kernel, additional ADK languages) had no path: the dispatcher's REGISTRY was builtin-only. The round-3 review noted M5 plugin validation made adapter discovery safe to land — this PR is that follow-up.

How

Four load-time gates in a new inputs/adapter_validation.py (parallel to checks/plugin_validation.py):

  1. load_failedentry_point.load() raised.
  2. bad_protocol — missing ClassVars, empty source_type, or wrong load() arity.
  3. bad_scopescope not in {per_source, per_scan}.
  4. source_type_collision — load-bearing trust rule. Adapter's source_type cannot shadow a built-in (mcp, openapi, langchain, …) or another third-party adapter registered earlier in the same scan. Without this rule, a malicious plugin could displace mcp and intercept every scan that loads an MCP source.

discover_third_party_adapters(registry, *, plugins_enabled, loaded_adapters) in inputs/protocol.py walks entry_points("agents_shipgate.adapters"), validates each, registers the valid ones, and appends per-entry info dicts to loaded_adapters[]. Discovery runs before _load_sources so the dispatcher resolves third-party source types alongside built-ins.

Report surface

New ReadinessReport.loaded_adapters: list[dict[str, Any]] field — parallel shape to loaded_plugins[], additive within the existing v0.20 schema. Required + present on every emitted scan; empty list when --no-plugins or no third-party adapters installed. Schema generator pins per-item required fields (name, value, distribution, version, source_type, validation_status, validation_errors, runtime_errors).

--strict-plugins (extended)

--strict-plugins now elevates exit code 4 on EITHER plugin failures (M5) OR adapter failures (this PR). The human-readable lines tag each failure as plugin or adapter. --no-plugins help text updated to mention adapter discovery is also disabled.

Runtime safety net

run_validated_adapter in inputs/adapter_validation.py wraps adapter.load() and captures (a) exceptions, (b) wrong return type, (c) artifact-class smuggling — adapter declares artifact_class=X but returns an artifact of another type. Mirrors the Finding.check_id smuggling rule from plugin_validation. The dispatcher's existing _absorb artifact-class check fires TypeError for built-ins; this wrapper is opt-in for future adapter-execution paths.

Test plan

  • 21 new tests in tests/test_adapter_entry_point_discovery.py:
    • valid third-party adapter end-to-end (class form, instance form)
    • each of the four gates exercised in isolation
    • source_type_collision parametrized across 4 built-in source types
    • source_type_collision between two third parties
    • gating by AGENTS_SHIPGATE_ENABLE_PLUGINS env (default off)
    • gating by --no-plugins overriding env (forces off)
    • --strict-plugins exits 4 end-to-end via CliRunner
    • run_validated_adapter catches exceptions, rejects wrong return type, rejects smuggled artifact
  • Full suite 1257 passing; ruff clean; coverage 88.00%
  • scripts/generate_schemas.py --check exits 0
  • STABILITY.md "Third-party adapter discovery (v0.20+)" subsection added under Trust-model invariants
  • tests/test_schema_boundaries.py wire-field-order pin updated for the new field
  • docs/report-sensitive-fields.json inventory extended for the privacy redaction layer

Out of scope (intentional)

  • Runtime wrapping in the dispatcher. The current _load_sources invokes adapter.load() directly. Wrapping every adapter with run_validated_adapter would change the contract for built-ins too; the wrapper is shipped as opt-in infrastructure for future adapter-execution work. The four load-time gates + the existing _absorb artifact-class check already cover the trust-bypass cases.
  • Sample-fixture regeneration. The 4 samples/*/expected/report.json files don't yet carry loaded_adapters: []. The sample-report tests only pin report_schema_version + release_decision.decision, so they pass; the next fixture regeneration will pick up the new field naturally.

🤖 Generated with Claude Code

pengfei-threemoonslab and others added 2 commits May 22, 2026 13:11
Closes E4 from the round-3 architecture review: opens the same
extension surface for adapters (input loaders) that M5 already
opened for check plugins. Discovery is gated by the existing
AGENTS_SHIPGATE_ENABLE_PLUGINS=1 env var and --no-plugins CLI flag,
so a single --no-plugins blocks both classes of third-party
extensions.

Entry-point group: ``agents_shipgate.adapters``. A third-party
package declares an adapter class (or instance) under
``[project.entry-points."agents_shipgate.adapters"]``; the class
must satisfy the ToolSourceAdapter Protocol — ``source_type``,
``scope`` (per_source or per_scan), ``artifact_class`` ClassVars
and a ``load(source, base_dir, manifest)`` method returning
LoadedAdapterResult.

Four load-time gates in ``inputs/adapter_validation.py``:

1. ``load_failed`` — entry_point.load() raised.
2. ``bad_protocol`` — loaded value not instantiable, missing
   required ClassVars, source_type not a non-empty string, or load
   method has wrong arity.
3. ``bad_scope`` — scope not in {per_source, per_scan}.
4. ``source_type_collision`` — source_type shadows a built-in
   (mcp, openapi, langchain, etc.) or another already-registered
   third-party adapter. This is the load-bearing trust rule —
   without it a malicious plugin could displace ``mcp`` and
   intercept every scan that loads an MCP source.

New top-level discovery API in inputs/protocol.py:
``discover_third_party_adapters(registry, *, plugins_enabled,
loaded_adapters)`` walks entry_points("agents_shipgate.adapters"),
validates each entry, registers the valid ones onto the supplied
registry, and appends per-entry info dicts to loaded_adapters[].
Invalid entries are NOT registered but still appear in
loaded_adapters[] so reviewers see what was skipped.

Report wiring (v0.20 additive within the existing v0.20 schema):

- New ``ReadinessReport.loaded_adapters: list[dict[str, Any]]``
  field parallel to loaded_plugins[]. Required + present on every
  emitted scan (empty list when --no-plugins or no third-party
  adapters installed). Item required fields pinned at the schema
  level: name, value, distribution, version, source_type,
  validation_status, validation_errors, runtime_errors.
- Threaded through ``_LoadedInputs``, ``_SanitizedSurfaces``, and
  ``build_report``. Sanitization runs through redact_data with
  path="loaded_adapters[]" (entry-point ``value`` strings are
  first-party and don't carry secrets, but the audit envelope
  flows through redaction for forward-compat with future adapter-
  emitted fields).
- ``cli/scan.py:_load_inputs`` now accepts plugins_enabled and
  invokes discovery BEFORE _load_sources so the dispatcher
  resolves third-party source_types alongside built-ins.

--strict-plugins extended (v0.20):

The existing --strict-plugins flag now also exits 4 on any
non-``valid`` loaded_adapters[] row or non-empty runtime_errors.
``_apply_strict_plugins`` in cli/_helpers.py concatenates plugin
and adapter failure messages and tags them ``plugin``/``adapter``
respectively in the human-readable lines. --no-plugins help text
updated to mention adapter discovery is also disabled.

Runtime safety net:

``run_validated_adapter`` in inputs/adapter_validation.py wraps
adapter.load() and captures: exceptions, wrong return type, and
artifact-class smuggling (declared artifact_class but returned
artifact of another type). Mirrors the Finding.check_id smuggling
rule from plugin_validation. The existing dispatcher _absorb
artifact-class check fires TypeError for built-ins; this wrapper
is opt-in for future adapter-execution paths.

STABILITY.md gains "Third-party adapter discovery (v0.20+)"
subsection documenting the four gates + the load-bearing
source_type_collision rule.

Tests (21 new in tests/test_adapter_entry_point_discovery.py):

- valid third-party adapter end-to-end (class form, instance form)
- each of the four gates exercised in isolation
- source_type_collision parametrized across 4 built-in source
  types (mcp, openapi, langchain, validation)
- source_type_collision between two third parties
- gating by AGENTS_SHIPGATE_ENABLE_PLUGINS env (default off)
- gating by --no-plugins overriding env (forces off)
- strict-plugins exits 4 end-to-end via CliRunner with
  the broken-adapter shape
- run_validated_adapter catches exceptions, rejects wrong
  return type, rejects smuggled artifact

Verification: 1257 tests pass; ruff clean; coverage 88.00%; schema
roundtrip --check exits 0; sample-report parity preserved (the
new loaded_adapters: [] field shows up in fresh scans but the
test_sample_expected_report_json_is_current pins only
report_schema_version + release_decision.decision, so existing
goldens stay valid).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… signature gate

Four findings from PR #111 review — three P1s and one P2. All four
addressed; 7 regression tests added (now 28 total in
test_adapter_entry_point_discovery.py).

P1 #1 — --no-plugins did not disable adapters after a prior enabled scan.
Pre-fix, ``discover_third_party_adapters`` mutated the module-global
``REGISTRY``. A later in-process scan with ``plugins_enabled=False``
skipped discovery but ``_load_sources`` still resolved the
already-registered adapter through that same global. Reproduced:
``report.loaded_adapters == []`` on scan two but the adapter still
ran. Trust boundary violated.

P1 #2 — repeated scans misclassified stable third-party adapters as
``source_type_collision``. Same root cause: collision detection used
``registry._adapters.keys()`` AFTER scan-one's third-party
registration had mutated the global. Scan two thus saw the same
adapter as a "built-in" collision while still executing it.

Fix for both: per-scan registry clone.

- New ``AdapterRegistry.clone()`` returns a shallow-copy registry
  with a fresh ``_adapters`` dict. Subsequent ``register()`` calls
  on the clone do NOT propagate to the global.
- ``_load_inputs`` now builds a per-scan registry via
  ``scan_registry = REGISTRY.clone()``, runs discovery against
  ``scan_registry``, and threads ``scan_registry`` through to
  ``_load_sources`` via a new ``registry=`` kwarg.
- ``_load_sources`` accepts the per-scan registry, falling back to
  the global only for legacy callers that bypass ``_load_inputs``
  (existing ``tests/test_adapter_registry.py`` patterns).
- Global ``REGISTRY`` stays builtin-only across the process. Tests
  that monkeypatch ``REGISTRY._adapters`` keep working because the
  clone captures the global's state at scan-start (after monkeypatch).

Two new regression tests pin the fix:

- ``test_no_plugins_disables_third_party_after_prior_enabled_scan``:
  scan-one runs with plugins enabled (adapter runs once); scan-two
  in the same process with ``plugins_enabled=False`` must see
  ``loaded_adapters == []`` AND the adapter must NOT run a second
  time. Pre-fix the invocation counter incremented; post-fix it
  stays at 1.
- ``test_second_scan_does_not_misclassify_third_party_as_collision``:
  two consecutive scans with plugins enabled must both report
  ``validation_status == "valid"`` on the same adapter. Pre-fix
  scan two reported ``source_type_collision``.

The autouse ``_reset_registry_after_test`` fixture is no longer
needed — discovery doesn't mutate the global — and is replaced with
a minimal ``_populate_registry`` fixture that just triggers lazy
population.

P1 #3 — per_source third-party adapters could not be referenced in
shipgate.yaml. ``ToolSourceConfig.type`` was a closed ``Literal`` of
built-in source types, so a manifest declaring
``type: my_third_party_source`` failed Pydantic validation BEFORE
discovery had a chance to register the loader. The v0.20 third-party
adapter surface was unusable for its main advertised use case.

Fix:

- ``ToolSourceConfig.type`` relaxed from ``Literal[...]`` to ``str``.
- New module-level constant ``BUILTIN_TOOL_SOURCE_TYPES`` enumerates
  the seven built-in source types for documentation and for tests
  that need to iterate the curated set.
- New constant ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` ({n8n,
  openai_api, anthropic_api, validation}) preserves the existing
  rejection: those four built-ins are per_scan-only and ingest config
  from a dedicated top-level manifest section, so they MUST NOT
  appear in ``tool_sources``. A new ``reject_per_scan_only_builtins``
  model validator on ``ToolSourceConfig`` raises ``ValueError`` at
  manifest-load with a routable error pointing the user to the
  top-level section.
- Unknown source types (typos like ``n8N``, genuinely-third-party
  types without a registered adapter) still fail at scan time with
  ``ConfigError`` from ``AdapterRegistry.require``. Exit-2 contract
  unchanged.
- ``tests/test_n8n.py::test_n8n_top_level_config_is_accepted_and_tool_source_type_is_rejected``
  updated: the case-variation (``type: n8N``) assertion now uses
  ``run_scan`` (dispatcher-time error) instead of ``load_manifest``
  (parse-time error). Same exit code, different layer.
- ``tests/test_adapter_registry.py`` + ``tests/harness/test_overlay_renderer.py``
  updated to iterate ``BUILTIN_TOOL_SOURCE_TYPES`` instead of
  ``get_args(ToolSourceConfig.type)``.
- New regression test
  ``test_per_source_third_party_adapter_referenced_from_manifest``:
  builds a manifest with ``type: demo_source``, registers a matching
  third-party per_source adapter via entry-points, asserts ``run_scan``
  succeeds and the adapter actually runs.

The JSON manifest schema (``docs/manifest-v0.1.json``) loses the
``enum`` constraint on ``tool_sources[].type`` — schema is now
``{"type": "string"}``. IDE autocomplete for built-in names is a
follow-up; the trust gate is the dispatcher, not the schema.

P2 #4 — signature validator accepted methods that crash the dispatcher.
``_protocol_error`` rejected only >3 required positional and zero
positional. Two failure modes slipped through:

- ``def load(self, source)`` — only 1 positional slot; dispatcher's
  ``adapter.load(source, base_dir, manifest)`` call fails with
  ``TypeError: too many positional arguments``.
- ``def load(self, source, base_dir, manifest, *, must_set)`` —
  required keyword-only param; dispatcher passes no kwargs.

Both now caught at the ``bad_protocol`` gate before registration.
``_protocol_error`` now:

- Computes ``accepting_slots`` = count of (required + optional)
  positional, and ``has_var_positional`` = whether ``*args`` is
  present.
- Rejects when ``accepting_slots < 3 AND not has_var_positional``.
- Rejects any required keyword-only parameter (mirrors plugin
  validation).

Four new regression tests:

- ``test_gate_bad_protocol_load_too_few_positional``: ``load(self,
  source)`` lands in ``bad_protocol`` with "at least 3 positional"
  in the error.
- ``test_gate_bad_protocol_load_required_keyword_only``: required
  kw-only lands in ``bad_protocol`` with "keyword-only" + the param
  name in the error.
- ``test_gate_bad_protocol_accepts_var_positional``: ``load(self,
  *args)`` is valid (can bind 3 args).
- ``test_gate_bad_protocol_accepts_optional_keyword_only``: optional
  kw-only with default is valid.

Verification: 1264 tests pass; ruff clean; schema roundtrip --check
exits 0; coverage 88.02%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Copy link
Copy Markdown
Contributor Author

All four findings addressed in commit 14b943a. Three P1s required real structural changes; the P2 was a tightening. 7 new regression tests pin each fix.

P1 #1 + P1 #2 — Per-scan registry (was: global mutation)

You're right — mutating the module-global REGISTRY broke both the --no-plugins trust boundary AND collision detection across consecutive scans. Reproduced both your test cases.

Fix: new AdapterRegistry.clone() returns a shallow-copy registry with a fresh _adapters dict. _load_inputs now builds scan_registry = REGISTRY.clone() and threads it through _load_sources(..., registry=scan_registry). Discovery registers ONLY onto the clone; the global stays builtin-only across the process. Existing tests that monkeypatch REGISTRY._adapters keep working because clone() captures the global's state at the moment _load_inputs runs (after monkeypatch).

Two regression tests pin the fix:

  • test_no_plugins_disables_third_party_after_prior_enabled_scan — scan 1 enables the adapter (runs once); scan 2 in the same process with plugins_enabled=False must NOT increment the invocation counter. Asserts both report.loaded_adapters == [] AND _RecordingAdapter.invocations == 1.
  • test_second_scan_does_not_misclassify_third_party_as_collision — two consecutive scans of the same adapter both report validation_status == "valid". Pre-fix scan 2 would have reported source_type_collision while still executing the stale registration.

The autouse _reset_registry_after_test fixture I'd added with the original PR is gone — no cleanup needed since discovery doesn't mutate the global.

P1 #3ToolSourceConfig.type opened from Literal to str

You're right — the closed Literal made the v0.20 third-party adapter surface unusable for its main advertised use case. A manifest declaring type: my_third_party_source failed Pydantic validation before discovery could register the loader.

Fix:

  • ToolSourceConfig.type: Literal[...]type: str.
  • New module-level BUILTIN_TOOL_SOURCE_TYPES constant for documentation + tests that iterate the curated set.
  • New BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES = {"n8n", "openai_api", "anthropic_api", "validation"} + a reject_per_scan_only_builtins model validator preserves the safety rejection (your test_n8n_top_level_config_is_accepted_and_tool_source_type_is_rejected still passes for the exact type: n8n case at manifest-load time).
  • Unknown source types (typos, third-party without a registered adapter) still fail with ConfigError (exit 2) — now at dispatcher time via AdapterRegistry.require, not at parse time. The exit-code contract is unchanged. test_n8n*::n8N case updated to expect the failure at run_scan instead of load_manifest (same exit code, different layer).
  • New regression test test_per_source_third_party_adapter_referenced_from_manifest: builds a manifest with type: demo_source, registers a matching third-party per_source adapter via entry-points, asserts run_scan succeeds, the adapter actually executes, and report.loaded_adapters[0].validation_status == "valid".

The JSON manifest schema (docs/manifest-v0.1.json) loses the enum constraint on tool_sources[].type as a side effect. IDE autocomplete for built-in names is a follow-up; the trust gate is the dispatcher, not the schema.

P2 #4 — Tighter signature validation

You're right — _protocol_error only caught >3 required positional and zero positional. Two failure modes slipped through and would crash the dispatcher at runtime instead of failing at the bad_protocol gate:

  • def load(self, source) — only 1 positional slot; dispatcher's adapter.load(source, base_dir, manifest) would fail with TypeError: too many positional arguments.
  • def load(self, source, base_dir, manifest, *, must_set) — required keyword-only; dispatcher passes no kwargs.

Fix: _protocol_error now requires accepting_slots >= 3 OR has_var_positional, AND zero required keyword-only parameters. Four new regression tests:

  • test_gate_bad_protocol_load_too_few_positional — your example, expects bad_protocol with "at least 3 positional" in the error.
  • test_gate_bad_protocol_load_required_keyword_only — required kw-only is rejected; error mentions both "keyword-only" and the offending param name.
  • test_gate_bad_protocol_accepts_var_positionaldef load(self, *args) is valid (can bind 3 args).
  • test_gate_bad_protocol_accepts_optional_keyword_only — optional kw-only with default is valid.

Verification

  • 1264 tests pass (was 1257; +7 regression tests covering each finding).
  • ruff clean.
  • scripts/generate_schemas.py --check exits 0.
  • Coverage 88.02%.
  • The four findings reproduce against the previous commit 5400fb2; each is closed against 14b943a.

🤖 Generated with Claude Code

Three more findings from PR #111 review — two P1s and one P2. The
v0.20 third-party adapter surface now has parity across all three
consumers (scan dispatcher, doctor introspection, markdown report)
and the documented lenient/strict contract holds end-to-end. 4 new
regression tests, 32 total in test_adapter_entry_point_discovery.py.

P1 #1 — adapter runtime failures bypassed loaded_adapters.runtime_errors.
The dispatcher in ``_load_sources`` called ``adapter.load(...)``
directly for ALL adapters. ``run_validated_adapter`` existed in
``inputs/adapter_validation.py`` but was never invoked during a real
scan. A third-party adapter that raised at runtime aborted
``run_scan`` with the raw exception, no report was emitted, and
``--strict-plugins`` never saw the failure. Contradicted the
documented "lenient mode records the failure, continues" contract.

Fix:

- ``_load_inputs`` now builds a ``third_party_records: dict[str,
  LoadedAdapter]`` map from discovery (one entry per valid
  third-party adapter, keyed by source_type) and threads it to
  ``_load_sources(..., third_party_records=...)``.
- ``_load_sources`` (pass 1 / pass 2) checks each adapter's
  source_type against the map. Built-ins keep the direct
  ``adapter.load(...)`` call shape — a built-in raising is a
  scanner bug and must abort loudly. Third-party adapters route
  through ``run_validated_adapter`` instead, which catches every
  exception, captures it into ``LoadedAdapter.runtime_errors`` (and
  mirrors into the matching ``loaded_adapters[].runtime_errors``
  dict), and returns ``None`` so the dispatcher skips ``_absorb``.
- ``_invoke_per_source_adapter`` learned a ``third_party_record``
  kwarg that switches to the wrapper. Built-in optional-source
  ``InputParseError`` handling is preserved on the built-in branch.

Result: lenient-mode third-party crashes are captured, the scan
completes, the report contains the failed adapter row with
``runtime_errors`` populated, and ``--strict-plugins`` exits 4 as
documented.

P1 #2 — doctor couldn't introspect manifests using third-party
source types. ``inspect_sources`` (the doctor command's manifest-
introspection entry point) called ``_load_sources`` against the
global ``REGISTRY`` (which is intentionally builtin-only since the
PR #111 P1 #1+#2 per-scan-clone refactor) and did NOT run adapter
discovery. So a manifest with ``tool_sources[].type: demo_source``
that scanned cleanly now made ``doctor`` crash with
``ConfigError: No adapter registered for source type
'demo_source'``.

Fix:

- ``inspect_sources`` mirrors the ``_load_inputs`` pattern: build
  ``scan_registry = REGISTRY.clone()``, run
  ``discover_third_party_adapters(scan_registry, …)`` (honoring the
  same ``plugins_enabled`` env / override semantics), pass the
  per-scan registry and ``third_party_records`` to
  ``_load_sources``.
- New ``plugins_enabled: bool | None = None`` kwarg defaults to
  ``None`` so the existing env var governs (no doctor CLI flag
  change required).
- ``loaded_adapters`` surfaced in the doctor payload alongside
  ``policy_packs`` so an operator can confirm what was discovered
  without running a full scan.

P2 #3 — Markdown reports hid adapter validation failures.
``report/markdown.py`` rendered ``loaded_policy_packs`` and
``loaded_plugins`` but had no equivalent ``loaded_adapters``
section. A ``load_failed`` adapter appeared in ``report.json`` but
not in ``report.md`` — the default human report was blind to
skipped third-party extensions.

Fix:

- New ``_append_loaded_adapters(lines, report)`` renders a
  ``## Loaded Adapters`` section listing each entry as
  ``- distribution version: source_type — \`validation_status\``
  followed by per-error indented bullets (validation_error and
  runtime_error). Hidden when ``loaded_adapters`` is empty so
  clean repos see no extra noise.
- Wired into the main rendering flow right after
  ``_append_loaded_plugins``.

Regression tests (4 new in
tests/test_adapter_entry_point_discovery.py):

- ``test_runtime_error_in_third_party_adapter_captured_not_propagated``
  — a third-party adapter that raises ``RuntimeError`` at runtime
  doesn't abort ``run_scan``; the failure lands in
  ``loaded_adapters[].runtime_errors``; the scan exits 0.
- ``test_strict_plugins_exits_on_third_party_adapter_runtime_error``
  — end-to-end via CliRunner: ``--strict-plugins`` elevates exit
  code 4 specifically on the new adapter-runtime-error path.
- ``test_doctor_resolves_third_party_source_types`` — a manifest
  with ``type: demo_source`` and a matching third-party adapter is
  introspected successfully by ``inspect_sources``; the adapter
  runs; the doctor payload surfaces ``loaded_adapters[]``.
- ``test_markdown_report_renders_loaded_adapters_section`` — a
  ``load_failed`` adapter shows up in ``report.md`` under the
  ``## Loaded Adapters`` heading with its status and validation
  error.

Verification: 1268 tests pass; ruff clean; schema roundtrip --check
exits 0; coverage 88.06%.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Copy link
Copy Markdown
Contributor Author

Round-3 review findings addressed in commit 943afda. Two P1s and one P2; 4 new regression tests pin each.

P1 #1 — Runtime failures bypassed the provenance contract

You're right — run_validated_adapter existed in inputs/adapter_validation.py but was never invoked. The dispatcher in _load_sources called adapter.load(...) directly for everything, so a third-party adapter that raised at runtime aborted run_scan with the raw exception, no report was emitted, and --strict-plugins had no chance to see loaded_adapters[].runtime_errors.

Fix:

  • _load_inputs now builds a third_party_records: dict[str, LoadedAdapter] map (source_type → record) from discovery and threads it to _load_sources.
  • _load_sources branches per adapter: built-ins keep the direct call (their failures should abort loudly — a built-in raising is a scanner bug). Third-party adapters route through run_validated_adapter, which captures every exception into LoadedAdapter.runtime_errors (mirrored into the matching loaded_adapters[] dict) and returns None. The dispatcher skips _absorb on None and the scan continues.
  • _invoke_per_source_adapter learned a third_party_record kwarg that switches to the wrapper.

Two regression tests pin the fix:

  • test_runtime_error_in_third_party_adapter_captured_not_propagated — a RuntimeError raised in load() no longer crashes run_scan; lands in runtime_errors; exit 0 in lenient mode.
  • test_strict_plugins_exits_on_third_party_adapter_runtime_error — end-to-end via CliRunner, --strict-plugins exits 4 specifically on the runtime-error path.

P1 #2doctor couldn't introspect third-party source types

You're right — inspect_sources called _load_sources against the global REGISTRY (now builtin-only after the per-scan-clone refactor) with no discovery. A manifest with tool_sources[].type: demo_source scanned fine but crashed doctor with ConfigError: No adapter registered.

Fix:

  • inspect_sources now mirrors _load_inputs: REGISTRY.clone()discover_third_party_adapters(...) → pass the per-scan registry and third_party_records to _load_sources.
  • New plugins_enabled: bool | None = None kwarg defaults to None so the existing AGENTS_SHIPGATE_ENABLE_PLUGINS env var governs (no doctor CLI flag change needed yet).
  • loaded_adapters surfaced in the doctor payload alongside policy_packs so operators can confirm discovery results without running a full scan.

Regression test test_doctor_resolves_third_party_source_types builds the exact manifest you described, asserts inspect_sources returns a clean payload, the adapter actually ran, and loaded_adapters[] is surfaced.

P2 #3 — Markdown reports hid adapter validation failures

You're right — report/markdown.py rendered loaded_plugins but had no loaded_adapters section. load_failed adapters appeared only in report.json.

Fix: new _append_loaded_adapters(lines, report) renders a ## Loaded Adapters section. Each entry:

- <distribution> <version>: <source_type> — `<validation_status>`
  - validation_error: <message>
  - runtime_error: <message>

Hidden when loaded_adapters is empty so clean repos see no extra noise. Wired right after _append_loaded_plugins.

Regression test test_markdown_report_renders_loaded_adapters_section reproduces a load_failed adapter and asserts the section appears in report.md with both the status and the validation_error visible (accepting markdown's underscore-escape form so load_failed / load\_failed both satisfy the assertion).

Verification

  • 1268 tests pass (was 1264; +4 regression tests covering the three new findings).
  • ruff clean.
  • scripts/generate_schemas.py --check exits 0.
  • Coverage 88.06%.
  • All three findings reproduce against commit 14b943a; each is closed against 943afda.

🤖 Generated with Claude Code

…aged runtime wrapper (PR #111 review)

Two prose claims in the v0.20 "Third-party adapter discovery"
subsection were stale after the round-2 and round-3 review fixes:

1. "register on the live REGISTRY" — wrong since commit 14b943a.
   Adapters now register on a per-scan ``AdapterRegistry.clone()``
   built at the start of each ``run_scan`` / ``inspect_sources``.
   The global ``REGISTRY`` stays builtin-only across the process.

2. "captures runtime exceptions … for callers that wrap the
   invocation themselves" — wrong since commit 943afda. The
   dispatcher in ``_load_sources`` now routes EVERY third-party
   adapter ``load()`` call through ``run_validated_adapter`` (both
   pass 1 per-source and pass 2 per-scan loops). Callers do not
   need to opt in.

Rewrote the section to:

- Document the per-scan registry contract explicitly and the two
  invariants it guarantees (``--no-plugins`` honest across scans;
  collision detection honest across scans).
- Document the dispatcher's two trust mechanisms separately:
  artifact-class smuggling prevention (already covered for
  built-ins via ``_absorb`` ``TypeError``) and runtime-error
  capture (new third-party-only path through
  ``run_validated_adapter``).
- Note that ``doctor`` (``inspect_sources``) uses the same
  per-scan registry + discovery + dispatcher path as ``scan`` and
  surfaces ``loaded_adapters[]`` in its payload.
- Tighten the bad_protocol gate description to match the v0.20
  PR #111 P2 #4 fix: "at least 3 positional slots (or *args)" and
  "no required keyword-only parameters" are now both gated, in
  addition to the "no more than 3 required positional" rule.
- Add a new "Manifest ``tool_sources[].type``" paragraph
  documenting the Literal→str relaxation, the
  ``BUILTIN_TOOL_SOURCE_TYPES`` constant for documentation, the
  ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` rejection at
  manifest-load, and the ConfigError(2) dispatch-time fallback
  for unknown types.

Pure docs change. ruff clean; public-surface contract tests pass
(the existing drift guards already covered cross-doc consistency
for the wedge positioning, gating signal, and schema version —
none of which this edit touches).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Copy link
Copy Markdown
Contributor Author

Doc fix landed in commit bdbcae1 — STABILITY.md "Third-party adapter discovery (v0.20+)" subsection rewritten to match the code as merged on this branch.

You're right — two prose claims were stale relative to the review fixes:

  1. "register on the live REGISTRY" was wrong since commit 14b943a (the per-scan-clone fix for P1 Bump actions/checkout from 4.3.1 to 6.0.2 #1 + P1 Bump actions/github-script from 7.1.0 to 9.0.0 #2). Adapters now register on a per-scan AdapterRegistry.clone() built at the start of each run_scan / inspect_sources. The global REGISTRY stays builtin-only across the process lifetime.
  2. "captures runtime exceptions … for callers that wrap the invocation themselves" was wrong since commit 943afda (the dispatcher-managed wrapper for P1 Bump actions/checkout from 4.3.1 to 6.0.2 #1 review-round-3). The dispatcher in _load_sources now routes EVERY third-party adapter load() call through run_validated_adapter (both pass-1 per-source and pass-2 per-scan loops). Callers do not opt in.

Rewritten section now documents:

  • Per-scan registry contract with the two invariants it guarantees (--no-plugins honest across scans; collision detection honest across scans).
  • Dispatcher trust mechanisms, separated: artifact-class smuggling prevention (existing _absorb TypeError, applies to all adapters) and runtime-error capture (new third-party-only path through run_validated_adapter). Built-ins keep the direct call shape — a built-in raising means the scanner is broken and must abort loudly; this is now stated explicitly.
  • doctor parity: inspect_sources uses the same per-scan registry + discovery + dispatcher path and surfaces loaded_adapters[] in its payload.
  • bad_protocol gate description tightened to match the P2 Bump actions/setup-python from 5.6.0 to 6.2.0 #4 fix: "at least 3 positional slots (or *args)" and "no required keyword-only parameters" are now both gated, in addition to the "no more than 3 required positional" rule.
  • New "Manifest tool_sources[].type" paragraph: documents the Literal → str relaxation, the BUILTIN_TOOL_SOURCE_TYPES constant for docs/tools, the BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES rejection at manifest-load (preserving the n8n safety), and the ConfigError (exit 2) dispatch-time fallback for unknown types.

Pure docs change. Ruff clean; public-surface contract tests pass (the existing drift guards already covered cross-doc consistency for the wedge positioning, gating signal, and schema version — none of which this edit touches).

🤖 Generated with Claude Code

The error from `AdapterRegistry.require()` used to say "Add the adapter to
`_register_builtin_adapters()`" — contributor-facing and obsolete now that
`tool_sources[].type` is open `str` and third-party adapters register via
entry points. Agent mode also routed to "edit shipgate.yaml" via
`SHIP-DIAG-INVALID-MANIFEST`, but the manifest itself is valid; the user
needs to install/enable the matching adapter or fix a typo.

Changes:
- `inputs/protocol.py`: rewrite `require()` to branch on
  `_adapter_plugins_enabled(None)` and emit one of two remediations
  (install the adapter package, or enable discovery + install).
- `schemas/diagnostics.py`: add `SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE`.
- `cli/diagnostics.py`: add `diagnose_unknown_adapter_source_type()` that
  produces 2 or 3 ranked next_actions depending on discovery state.
- `cli/_helpers.py`: pattern-match the `require()` error prefix in
  `_diagnose_config_error` and route to the new diagnostic instead of
  `SHIP-DIAG-INVALID-MANIFEST`.
- Regression tests: `test_unknown_adapter_source_type_routes_to_install_enable_diagnostic`
  asserts agent-mode JSON `next_actions[0].kind == "command"` (not "edit")
  with `AGENTS_SHIPGATE_ENABLE_PLUGINS=1`; `test_require_error_message_is_user_facing`
  pins the error string away from the contributor-facing wording.
- `docs/diagnostics.md`: add catalog entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pengfei-threemoonslab
Copy link
Copy Markdown
Contributor Author

Round-4 review fix: user-facing unknown-adapter remediation

[P2] The unknown adapter error now gives the wrong recovery path for the new third-party adapter flow. Since tool_sources[].type is now open str for custom adapter names at [manifest.py:101], a valid third-party manifest can reach AdapterRegistry.require() when the adapter package is missing or plugins are not enabled. The error at [protocol.py:197] still says to add the adapter to _register_builtin_adapters(), and agent mode routes to "edit shipgate.yaml." That is wrong for the new public extension path; it should mention installing/enabling the third-party adapter, AGENTS_SHIPGATE_ENABLE_PLUGINS=1, or fixing a typo.

Fixed in 92c4488. The fix is two-layer:

1. AdapterRegistry.require() now emits a user-facing message. It branches on _adapter_plugins_enabled(None):

  • plugins enabled → "install the third-party adapter package" / "fix a typo"
  • plugins disabled → "set AGENTS_SHIPGATE_ENABLE_PLUGINS=1" / "install the package" / "fix a typo"

No more "Add the adapter to _register_builtin_adapters()".

2. Agent mode no longer routes to "edit shipgate.yaml". _diagnose_config_error in cli/_helpers.py now pattern-matches the require() error prefix and routes to a new diagnostic, SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE, instead of falling through to SHIP-DIAG-INVALID-MANIFEST. The new diagnostic emits 2 (plugins enabled) or 3 (plugins disabled) ranked next_actions, with the rank-1 being:

  • plugins enabled: kind=command, command="pip install <third-party-adapter-package>"
  • plugins disabled: kind=command, command="AGENTS_SHIPGATE_ENABLE_PLUGINS=1 agents-shipgate scan -c <path>"

The manifest is valid; kind=edit is rank-N (only for the typo case), never rank-1.

Regression tests in tests/test_adapter_entry_point_discovery.py:

  • test_unknown_adapter_source_type_routes_to_install_enable_diagnostic — drives a scan with type: not_a_real_adapter, captures the agent-mode error JSON on stderr, asserts next_actions[0].kind == "command", the command contains AGENTS_SHIPGATE_ENABLE_PLUGINS=1, and the diagnostic id is SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE (not SHIP-DIAG-INVALID-MANIFEST).
  • test_require_error_message_is_user_facing — pins str(exc) to not contain _register_builtin_adapters and to contain AGENTS_SHIPGATE_ENABLE_PLUGINS and "typo".

Docs: docs/diagnostics.md gets a catalog row for the new ID, marked v0.20+.

Verification: 1772 passed, 4 skipped; ruff clean.

@pengfei-threemoonslab pengfei-threemoonslab merged commit ef1b59c into main May 22, 2026
1 check passed
@pengfei-threemoonslab pengfei-threemoonslab deleted the feat/v020-adapter-entry-points branch May 22, 2026 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant