Add third-party adapter entry-point discovery (v0.20 E4) by pengfei-threemoonslab · Pull Request #111 · ThreeMoonsLab/agents-shipgate

pengfei-threemoonslab · 2026-05-22T20:11:43Z

Summary

Closes E4 from the round-3 architecture review (Nov/Dec 2025). Opens the same extension surface for adapters (input loaders) that M5 already opened for plugin checks. Adapters now register through the agents_shipgate.adapters entry-point group, gated by the existing AGENTS_SHIPGATE_ENABLE_PLUGINS=1 env var and --no-plugins CLI flag.

Why

Third parties that wanted to ship a new input adapter (AutoGen, LlamaIndex, Semantic Kernel, additional ADK languages) had no path: the dispatcher's REGISTRY was builtin-only. The round-3 review noted M5 plugin validation made adapter discovery safe to land — this PR is that follow-up.

How

Four load-time gates in a new inputs/adapter_validation.py (parallel to checks/plugin_validation.py):

load_failed — entry_point.load() raised.
bad_protocol — missing ClassVars, empty source_type, or wrong load() arity.
bad_scope — scope not in {per_source, per_scan}.
source_type_collision — load-bearing trust rule. Adapter's source_type cannot shadow a built-in (mcp, openapi, langchain, …) or another third-party adapter registered earlier in the same scan. Without this rule, a malicious plugin could displace mcp and intercept every scan that loads an MCP source.

discover_third_party_adapters(registry, *, plugins_enabled, loaded_adapters) in inputs/protocol.py walks entry_points("agents_shipgate.adapters"), validates each, registers the valid ones, and appends per-entry info dicts to loaded_adapters[]. Discovery runs before _load_sources so the dispatcher resolves third-party source types alongside built-ins.

Report surface

New ReadinessReport.loaded_adapters: list[dict[str, Any]] field — parallel shape to loaded_plugins[], additive within the existing v0.20 schema. Required + present on every emitted scan; empty list when --no-plugins or no third-party adapters installed. Schema generator pins per-item required fields (name, value, distribution, version, source_type, validation_status, validation_errors, runtime_errors).

`--strict-plugins` (extended)

--strict-plugins now elevates exit code 4 on EITHER plugin failures (M5) OR adapter failures (this PR). The human-readable lines tag each failure as plugin or adapter. --no-plugins help text updated to mention adapter discovery is also disabled.

Runtime safety net

run_validated_adapter in inputs/adapter_validation.py wraps adapter.load() and captures (a) exceptions, (b) wrong return type, (c) artifact-class smuggling — adapter declares artifact_class=X but returns an artifact of another type. Mirrors the Finding.check_id smuggling rule from plugin_validation. The dispatcher's existing _absorb artifact-class check fires TypeError for built-ins; this wrapper is opt-in for future adapter-execution paths.

Test plan

21 new tests in tests/test_adapter_entry_point_discovery.py:
- valid third-party adapter end-to-end (class form, instance form)
- each of the four gates exercised in isolation
- source_type_collision parametrized across 4 built-in source types
- source_type_collision between two third parties
- gating by AGENTS_SHIPGATE_ENABLE_PLUGINS env (default off)
- gating by --no-plugins overriding env (forces off)
- --strict-plugins exits 4 end-to-end via CliRunner
- run_validated_adapter catches exceptions, rejects wrong return type, rejects smuggled artifact
Full suite 1257 passing; ruff clean; coverage 88.00%
scripts/generate_schemas.py --check exits 0
STABILITY.md "Third-party adapter discovery (v0.20+)" subsection added under Trust-model invariants
tests/test_schema_boundaries.py wire-field-order pin updated for the new field
docs/report-sensitive-fields.json inventory extended for the privacy redaction layer

Out of scope (intentional)

Runtime wrapping in the dispatcher. The current _load_sources invokes adapter.load() directly. Wrapping every adapter with run_validated_adapter would change the contract for built-ins too; the wrapper is shipped as opt-in infrastructure for future adapter-execution work. The four load-time gates + the existing _absorb artifact-class check already cover the trust-bypass cases.
Sample-fixture regeneration. The 4 samples/*/expected/report.json files don't yet carry loaded_adapters: []. The sample-report tests only pin report_schema_version + release_decision.decision, so they pass; the next fixture regeneration will pick up the new field naturally.

🤖 Generated with Claude Code

Closes E4 from the round-3 architecture review: opens the same extension surface for adapters (input loaders) that M5 already opened for check plugins. Discovery is gated by the existing AGENTS_SHIPGATE_ENABLE_PLUGINS=1 env var and --no-plugins CLI flag, so a single --no-plugins blocks both classes of third-party extensions. Entry-point group: ``agents_shipgate.adapters``. A third-party package declares an adapter class (or instance) under ``[project.entry-points."agents_shipgate.adapters"]``; the class must satisfy the ToolSourceAdapter Protocol — ``source_type``, ``scope`` (per_source or per_scan), ``artifact_class`` ClassVars and a ``load(source, base_dir, manifest)`` method returning LoadedAdapterResult. Four load-time gates in ``inputs/adapter_validation.py``: 1. ``load_failed`` — entry_point.load() raised. 2. ``bad_protocol`` — loaded value not instantiable, missing required ClassVars, source_type not a non-empty string, or load method has wrong arity. 3. ``bad_scope`` — scope not in {per_source, per_scan}. 4. ``source_type_collision`` — source_type shadows a built-in (mcp, openapi, langchain, etc.) or another already-registered third-party adapter. This is the load-bearing trust rule — without it a malicious plugin could displace ``mcp`` and intercept every scan that loads an MCP source. New top-level discovery API in inputs/protocol.py: ``discover_third_party_adapters(registry, *, plugins_enabled, loaded_adapters)`` walks entry_points("agents_shipgate.adapters"), validates each entry, registers the valid ones onto the supplied registry, and appends per-entry info dicts to loaded_adapters[]. Invalid entries are NOT registered but still appear in loaded_adapters[] so reviewers see what was skipped. Report wiring (v0.20 additive within the existing v0.20 schema): - New ``ReadinessReport.loaded_adapters: list[dict[str, Any]]`` field parallel to loaded_plugins[]. Required + present on every emitted scan (empty list when --no-plugins or no third-party adapters installed). Item required fields pinned at the schema level: name, value, distribution, version, source_type, validation_status, validation_errors, runtime_errors. - Threaded through ``_LoadedInputs``, ``_SanitizedSurfaces``, and ``build_report``. Sanitization runs through redact_data with path="loaded_adapters[]" (entry-point ``value`` strings are first-party and don't carry secrets, but the audit envelope flows through redaction for forward-compat with future adapter- emitted fields). - ``cli/scan.py:_load_inputs`` now accepts plugins_enabled and invokes discovery BEFORE _load_sources so the dispatcher resolves third-party source_types alongside built-ins. --strict-plugins extended (v0.20): The existing --strict-plugins flag now also exits 4 on any non-``valid`` loaded_adapters[] row or non-empty runtime_errors. ``_apply_strict_plugins`` in cli/_helpers.py concatenates plugin and adapter failure messages and tags them ``plugin``/``adapter`` respectively in the human-readable lines. --no-plugins help text updated to mention adapter discovery is also disabled. Runtime safety net: ``run_validated_adapter`` in inputs/adapter_validation.py wraps adapter.load() and captures: exceptions, wrong return type, and artifact-class smuggling (declared artifact_class but returned artifact of another type). Mirrors the Finding.check_id smuggling rule from plugin_validation. The existing dispatcher _absorb artifact-class check fires TypeError for built-ins; this wrapper is opt-in for future adapter-execution paths. STABILITY.md gains "Third-party adapter discovery (v0.20+)" subsection documenting the four gates + the load-bearing source_type_collision rule. Tests (21 new in tests/test_adapter_entry_point_discovery.py): - valid third-party adapter end-to-end (class form, instance form) - each of the four gates exercised in isolation - source_type_collision parametrized across 4 built-in source types (mcp, openapi, langchain, validation) - source_type_collision between two third parties - gating by AGENTS_SHIPGATE_ENABLE_PLUGINS env (default off) - gating by --no-plugins overriding env (forces off) - strict-plugins exits 4 end-to-end via CliRunner with the broken-adapter shape - run_validated_adapter catches exceptions, rejects wrong return type, rejects smuggled artifact Verification: 1257 tests pass; ruff clean; coverage 88.00%; schema roundtrip --check exits 0; sample-report parity preserved (the new loaded_adapters: [] field shows up in fresh scans but the test_sample_expected_report_json_is_current pins only report_schema_version + release_decision.decision, so existing goldens stay valid). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… signature gate Four findings from PR #111 review — three P1s and one P2. All four addressed; 7 regression tests added (now 28 total in test_adapter_entry_point_discovery.py). P1 #1 — --no-plugins did not disable adapters after a prior enabled scan. Pre-fix, ``discover_third_party_adapters`` mutated the module-global ``REGISTRY``. A later in-process scan with ``plugins_enabled=False`` skipped discovery but ``_load_sources`` still resolved the already-registered adapter through that same global. Reproduced: ``report.loaded_adapters == []`` on scan two but the adapter still ran. Trust boundary violated. P1 #2 — repeated scans misclassified stable third-party adapters as ``source_type_collision``. Same root cause: collision detection used ``registry._adapters.keys()`` AFTER scan-one's third-party registration had mutated the global. Scan two thus saw the same adapter as a "built-in" collision while still executing it. Fix for both: per-scan registry clone. - New ``AdapterRegistry.clone()`` returns a shallow-copy registry with a fresh ``_adapters`` dict. Subsequent ``register()`` calls on the clone do NOT propagate to the global. - ``_load_inputs`` now builds a per-scan registry via ``scan_registry = REGISTRY.clone()``, runs discovery against ``scan_registry``, and threads ``scan_registry`` through to ``_load_sources`` via a new ``registry=`` kwarg. - ``_load_sources`` accepts the per-scan registry, falling back to the global only for legacy callers that bypass ``_load_inputs`` (existing ``tests/test_adapter_registry.py`` patterns). - Global ``REGISTRY`` stays builtin-only across the process. Tests that monkeypatch ``REGISTRY._adapters`` keep working because the clone captures the global's state at scan-start (after monkeypatch). Two new regression tests pin the fix: - ``test_no_plugins_disables_third_party_after_prior_enabled_scan``: scan-one runs with plugins enabled (adapter runs once); scan-two in the same process with ``plugins_enabled=False`` must see ``loaded_adapters == []`` AND the adapter must NOT run a second time. Pre-fix the invocation counter incremented; post-fix it stays at 1. - ``test_second_scan_does_not_misclassify_third_party_as_collision``: two consecutive scans with plugins enabled must both report ``validation_status == "valid"`` on the same adapter. Pre-fix scan two reported ``source_type_collision``. The autouse ``_reset_registry_after_test`` fixture is no longer needed — discovery doesn't mutate the global — and is replaced with a minimal ``_populate_registry`` fixture that just triggers lazy population. P1 #3 — per_source third-party adapters could not be referenced in shipgate.yaml. ``ToolSourceConfig.type`` was a closed ``Literal`` of built-in source types, so a manifest declaring ``type: my_third_party_source`` failed Pydantic validation BEFORE discovery had a chance to register the loader. The v0.20 third-party adapter surface was unusable for its main advertised use case. Fix: - ``ToolSourceConfig.type`` relaxed from ``Literal[...]`` to ``str``. - New module-level constant ``BUILTIN_TOOL_SOURCE_TYPES`` enumerates the seven built-in source types for documentation and for tests that need to iterate the curated set. - New constant ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` ({n8n, openai_api, anthropic_api, validation}) preserves the existing rejection: those four built-ins are per_scan-only and ingest config from a dedicated top-level manifest section, so they MUST NOT appear in ``tool_sources``. A new ``reject_per_scan_only_builtins`` model validator on ``ToolSourceConfig`` raises ``ValueError`` at manifest-load with a routable error pointing the user to the top-level section. - Unknown source types (typos like ``n8N``, genuinely-third-party types without a registered adapter) still fail at scan time with ``ConfigError`` from ``AdapterRegistry.require``. Exit-2 contract unchanged. - ``tests/test_n8n.py::test_n8n_top_level_config_is_accepted_and_tool_source_type_is_rejected`` updated: the case-variation (``type: n8N``) assertion now uses ``run_scan`` (dispatcher-time error) instead of ``load_manifest`` (parse-time error). Same exit code, different layer. - ``tests/test_adapter_registry.py`` + ``tests/harness/test_overlay_renderer.py`` updated to iterate ``BUILTIN_TOOL_SOURCE_TYPES`` instead of ``get_args(ToolSourceConfig.type)``. - New regression test ``test_per_source_third_party_adapter_referenced_from_manifest``: builds a manifest with ``type: demo_source``, registers a matching third-party per_source adapter via entry-points, asserts ``run_scan`` succeeds and the adapter actually runs. The JSON manifest schema (``docs/manifest-v0.1.json``) loses the ``enum`` constraint on ``tool_sources[].type`` — schema is now ``{"type": "string"}``. IDE autocomplete for built-in names is a follow-up; the trust gate is the dispatcher, not the schema. P2 #4 — signature validator accepted methods that crash the dispatcher. ``_protocol_error`` rejected only >3 required positional and zero positional. Two failure modes slipped through: - ``def load(self, source)`` — only 1 positional slot; dispatcher's ``adapter.load(source, base_dir, manifest)`` call fails with ``TypeError: too many positional arguments``. - ``def load(self, source, base_dir, manifest, *, must_set)`` — required keyword-only param; dispatcher passes no kwargs. Both now caught at the ``bad_protocol`` gate before registration. ``_protocol_error`` now: - Computes ``accepting_slots`` = count of (required + optional) positional, and ``has_var_positional`` = whether ``*args`` is present. - Rejects when ``accepting_slots < 3 AND not has_var_positional``. - Rejects any required keyword-only parameter (mirrors plugin validation). Four new regression tests: - ``test_gate_bad_protocol_load_too_few_positional``: ``load(self, source)`` lands in ``bad_protocol`` with "at least 3 positional" in the error. - ``test_gate_bad_protocol_load_required_keyword_only``: required kw-only lands in ``bad_protocol`` with "keyword-only" + the param name in the error. - ``test_gate_bad_protocol_accepts_var_positional``: ``load(self, *args)`` is valid (can bind 3 args). - ``test_gate_bad_protocol_accepts_optional_keyword_only``: optional kw-only with default is valid. Verification: 1264 tests pass; ruff clean; schema roundtrip --check exits 0; coverage 88.02%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-22T20:42:32Z

All four findings addressed in commit 14b943a. Three P1s required real structural changes; the P2 was a tightening. 7 new regression tests pin each fix.

P1 #1 + P1 #2 — Per-scan registry (was: global mutation)

You're right — mutating the module-global REGISTRY broke both the --no-plugins trust boundary AND collision detection across consecutive scans. Reproduced both your test cases.

Fix: new AdapterRegistry.clone() returns a shallow-copy registry with a fresh _adapters dict. _load_inputs now builds scan_registry = REGISTRY.clone() and threads it through _load_sources(..., registry=scan_registry). Discovery registers ONLY onto the clone; the global stays builtin-only across the process. Existing tests that monkeypatch REGISTRY._adapters keep working because clone() captures the global's state at the moment _load_inputs runs (after monkeypatch).

Two regression tests pin the fix:

test_no_plugins_disables_third_party_after_prior_enabled_scan — scan 1 enables the adapter (runs once); scan 2 in the same process with plugins_enabled=False must NOT increment the invocation counter. Asserts both report.loaded_adapters == [] AND _RecordingAdapter.invocations == 1.
test_second_scan_does_not_misclassify_third_party_as_collision — two consecutive scans of the same adapter both report validation_status == "valid". Pre-fix scan 2 would have reported source_type_collision while still executing the stale registration.

The autouse _reset_registry_after_test fixture I'd added with the original PR is gone — no cleanup needed since discovery doesn't mutate the global.

P1 #3 — `ToolSourceConfig.type` opened from `Literal` to `str`

You're right — the closed Literal made the v0.20 third-party adapter surface unusable for its main advertised use case. A manifest declaring type: my_third_party_source failed Pydantic validation before discovery could register the loader.

Fix:

ToolSourceConfig.type: Literal[...] → type: str.
New module-level BUILTIN_TOOL_SOURCE_TYPES constant for documentation + tests that iterate the curated set.
New BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES = {"n8n", "openai_api", "anthropic_api", "validation"} + a reject_per_scan_only_builtins model validator preserves the safety rejection (your test_n8n_top_level_config_is_accepted_and_tool_source_type_is_rejected still passes for the exact type: n8n case at manifest-load time).
Unknown source types (typos, third-party without a registered adapter) still fail with ConfigError (exit 2) — now at dispatcher time via AdapterRegistry.require, not at parse time. The exit-code contract is unchanged. test_n8n*::n8N case updated to expect the failure at run_scan instead of load_manifest (same exit code, different layer).
New regression test test_per_source_third_party_adapter_referenced_from_manifest: builds a manifest with type: demo_source, registers a matching third-party per_source adapter via entry-points, asserts run_scan succeeds, the adapter actually executes, and report.loaded_adapters[0].validation_status == "valid".

The JSON manifest schema (docs/manifest-v0.1.json) loses the enum constraint on tool_sources[].type as a side effect. IDE autocomplete for built-in names is a follow-up; the trust gate is the dispatcher, not the schema.

P2 #4 — Tighter signature validation

You're right — _protocol_error only caught >3 required positional and zero positional. Two failure modes slipped through and would crash the dispatcher at runtime instead of failing at the bad_protocol gate:

def load(self, source) — only 1 positional slot; dispatcher's adapter.load(source, base_dir, manifest) would fail with TypeError: too many positional arguments.
def load(self, source, base_dir, manifest, *, must_set) — required keyword-only; dispatcher passes no kwargs.

Fix: _protocol_error now requires accepting_slots >= 3 OR has_var_positional, AND zero required keyword-only parameters. Four new regression tests:

test_gate_bad_protocol_load_too_few_positional — your example, expects bad_protocol with "at least 3 positional" in the error.
test_gate_bad_protocol_load_required_keyword_only — required kw-only is rejected; error mentions both "keyword-only" and the offending param name.
test_gate_bad_protocol_accepts_var_positional — def load(self, *args) is valid (can bind 3 args).
test_gate_bad_protocol_accepts_optional_keyword_only — optional kw-only with default is valid.

Verification

1264 tests pass (was 1257; +7 regression tests covering each finding).
ruff clean.
scripts/generate_schemas.py --check exits 0.
Coverage 88.02%.
The four findings reproduce against the previous commit 5400fb2; each is closed against 14b943a.

🤖 Generated with Claude Code

Three more findings from PR #111 review — two P1s and one P2. The v0.20 third-party adapter surface now has parity across all three consumers (scan dispatcher, doctor introspection, markdown report) and the documented lenient/strict contract holds end-to-end. 4 new regression tests, 32 total in test_adapter_entry_point_discovery.py. P1 #1 — adapter runtime failures bypassed loaded_adapters.runtime_errors. The dispatcher in ``_load_sources`` called ``adapter.load(...)`` directly for ALL adapters. ``run_validated_adapter`` existed in ``inputs/adapter_validation.py`` but was never invoked during a real scan. A third-party adapter that raised at runtime aborted ``run_scan`` with the raw exception, no report was emitted, and ``--strict-plugins`` never saw the failure. Contradicted the documented "lenient mode records the failure, continues" contract. Fix: - ``_load_inputs`` now builds a ``third_party_records: dict[str, LoadedAdapter]`` map from discovery (one entry per valid third-party adapter, keyed by source_type) and threads it to ``_load_sources(..., third_party_records=...)``. - ``_load_sources`` (pass 1 / pass 2) checks each adapter's source_type against the map. Built-ins keep the direct ``adapter.load(...)`` call shape — a built-in raising is a scanner bug and must abort loudly. Third-party adapters route through ``run_validated_adapter`` instead, which catches every exception, captures it into ``LoadedAdapter.runtime_errors`` (and mirrors into the matching ``loaded_adapters[].runtime_errors`` dict), and returns ``None`` so the dispatcher skips ``_absorb``. - ``_invoke_per_source_adapter`` learned a ``third_party_record`` kwarg that switches to the wrapper. Built-in optional-source ``InputParseError`` handling is preserved on the built-in branch. Result: lenient-mode third-party crashes are captured, the scan completes, the report contains the failed adapter row with ``runtime_errors`` populated, and ``--strict-plugins`` exits 4 as documented. P1 #2 — doctor couldn't introspect manifests using third-party source types. ``inspect_sources`` (the doctor command's manifest- introspection entry point) called ``_load_sources`` against the global ``REGISTRY`` (which is intentionally builtin-only since the PR #111 P1 #1+#2 per-scan-clone refactor) and did NOT run adapter discovery. So a manifest with ``tool_sources[].type: demo_source`` that scanned cleanly now made ``doctor`` crash with ``ConfigError: No adapter registered for source type 'demo_source'``. Fix: - ``inspect_sources`` mirrors the ``_load_inputs`` pattern: build ``scan_registry = REGISTRY.clone()``, run ``discover_third_party_adapters(scan_registry, …)`` (honoring the same ``plugins_enabled`` env / override semantics), pass the per-scan registry and ``third_party_records`` to ``_load_sources``. - New ``plugins_enabled: bool | None = None`` kwarg defaults to ``None`` so the existing env var governs (no doctor CLI flag change required). - ``loaded_adapters`` surfaced in the doctor payload alongside ``policy_packs`` so an operator can confirm what was discovered without running a full scan. P2 #3 — Markdown reports hid adapter validation failures. ``report/markdown.py`` rendered ``loaded_policy_packs`` and ``loaded_plugins`` but had no equivalent ``loaded_adapters`` section. A ``load_failed`` adapter appeared in ``report.json`` but not in ``report.md`` — the default human report was blind to skipped third-party extensions. Fix: - New ``_append_loaded_adapters(lines, report)`` renders a ``## Loaded Adapters`` section listing each entry as ``- distribution version: source_type — \`validation_status\`` followed by per-error indented bullets (validation_error and runtime_error). Hidden when ``loaded_adapters`` is empty so clean repos see no extra noise. - Wired into the main rendering flow right after ``_append_loaded_plugins``. Regression tests (4 new in tests/test_adapter_entry_point_discovery.py): - ``test_runtime_error_in_third_party_adapter_captured_not_propagated`` — a third-party adapter that raises ``RuntimeError`` at runtime doesn't abort ``run_scan``; the failure lands in ``loaded_adapters[].runtime_errors``; the scan exits 0. - ``test_strict_plugins_exits_on_third_party_adapter_runtime_error`` — end-to-end via CliRunner: ``--strict-plugins`` elevates exit code 4 specifically on the new adapter-runtime-error path. - ``test_doctor_resolves_third_party_source_types`` — a manifest with ``type: demo_source`` and a matching third-party adapter is introspected successfully by ``inspect_sources``; the adapter runs; the doctor payload surfaces ``loaded_adapters[]``. - ``test_markdown_report_renders_loaded_adapters_section`` — a ``load_failed`` adapter shows up in ``report.md`` under the ``## Loaded Adapters`` heading with its status and validation error. Verification: 1268 tests pass; ruff clean; schema roundtrip --check exits 0; coverage 88.06%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-22T21:24:06Z

Round-3 review findings addressed in commit 943afda. Two P1s and one P2; 4 new regression tests pin each.

P1 #1 — Runtime failures bypassed the provenance contract

You're right — run_validated_adapter existed in inputs/adapter_validation.py but was never invoked. The dispatcher in _load_sources called adapter.load(...) directly for everything, so a third-party adapter that raised at runtime aborted run_scan with the raw exception, no report was emitted, and --strict-plugins had no chance to see loaded_adapters[].runtime_errors.

Fix:

_load_inputs now builds a third_party_records: dict[str, LoadedAdapter] map (source_type → record) from discovery and threads it to _load_sources.
_load_sources branches per adapter: built-ins keep the direct call (their failures should abort loudly — a built-in raising is a scanner bug). Third-party adapters route through run_validated_adapter, which captures every exception into LoadedAdapter.runtime_errors (mirrored into the matching loaded_adapters[] dict) and returns None. The dispatcher skips _absorb on None and the scan continues.
_invoke_per_source_adapter learned a third_party_record kwarg that switches to the wrapper.

Two regression tests pin the fix:

test_runtime_error_in_third_party_adapter_captured_not_propagated — a RuntimeError raised in load() no longer crashes run_scan; lands in runtime_errors; exit 0 in lenient mode.
test_strict_plugins_exits_on_third_party_adapter_runtime_error — end-to-end via CliRunner, --strict-plugins exits 4 specifically on the runtime-error path.

P1 #2 — `doctor` couldn't introspect third-party source types

You're right — inspect_sources called _load_sources against the global REGISTRY (now builtin-only after the per-scan-clone refactor) with no discovery. A manifest with tool_sources[].type: demo_source scanned fine but crashed doctor with ConfigError: No adapter registered.

Fix:

inspect_sources now mirrors _load_inputs: REGISTRY.clone() → discover_third_party_adapters(...) → pass the per-scan registry and third_party_records to _load_sources.
New plugins_enabled: bool | None = None kwarg defaults to None so the existing AGENTS_SHIPGATE_ENABLE_PLUGINS env var governs (no doctor CLI flag change needed yet).
loaded_adapters surfaced in the doctor payload alongside policy_packs so operators can confirm discovery results without running a full scan.

Regression test test_doctor_resolves_third_party_source_types builds the exact manifest you described, asserts inspect_sources returns a clean payload, the adapter actually ran, and loaded_adapters[] is surfaced.

P2 #3 — Markdown reports hid adapter validation failures

You're right — report/markdown.py rendered loaded_plugins but had no loaded_adapters section. load_failed adapters appeared only in report.json.

Fix: new _append_loaded_adapters(lines, report) renders a ## Loaded Adapters section. Each entry:

- <distribution> <version>: <source_type> — `<validation_status>`
  - validation_error: <message>
  - runtime_error: <message>

Hidden when loaded_adapters is empty so clean repos see no extra noise. Wired right after _append_loaded_plugins.

Regression test test_markdown_report_renders_loaded_adapters_section reproduces a load_failed adapter and asserts the section appears in report.md with both the status and the validation_error visible (accepting markdown's underscore-escape form so load_failed / load\_failed both satisfy the assertion).

Verification

1268 tests pass (was 1264; +4 regression tests covering the three new findings).
ruff clean.
scripts/generate_schemas.py --check exits 0.
Coverage 88.06%.
All three findings reproduce against commit 14b943a; each is closed against 943afda.

🤖 Generated with Claude Code

…aged runtime wrapper (PR #111 review) Two prose claims in the v0.20 "Third-party adapter discovery" subsection were stale after the round-2 and round-3 review fixes: 1. "register on the live REGISTRY" — wrong since commit 14b943a. Adapters now register on a per-scan ``AdapterRegistry.clone()`` built at the start of each ``run_scan`` / ``inspect_sources``. The global ``REGISTRY`` stays builtin-only across the process. 2. "captures runtime exceptions … for callers that wrap the invocation themselves" — wrong since commit 943afda. The dispatcher in ``_load_sources`` now routes EVERY third-party adapter ``load()`` call through ``run_validated_adapter`` (both pass 1 per-source and pass 2 per-scan loops). Callers do not need to opt in. Rewrote the section to: - Document the per-scan registry contract explicitly and the two invariants it guarantees (``--no-plugins`` honest across scans; collision detection honest across scans). - Document the dispatcher's two trust mechanisms separately: artifact-class smuggling prevention (already covered for built-ins via ``_absorb`` ``TypeError``) and runtime-error capture (new third-party-only path through ``run_validated_adapter``). - Note that ``doctor`` (``inspect_sources``) uses the same per-scan registry + discovery + dispatcher path as ``scan`` and surfaces ``loaded_adapters[]`` in its payload. - Tighten the bad_protocol gate description to match the v0.20 PR #111 P2 #4 fix: "at least 3 positional slots (or *args)" and "no required keyword-only parameters" are now both gated, in addition to the "no more than 3 required positional" rule. - Add a new "Manifest ``tool_sources[].type``" paragraph documenting the Literal→str relaxation, the ``BUILTIN_TOOL_SOURCE_TYPES`` constant for documentation, the ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` rejection at manifest-load, and the ConfigError(2) dispatch-time fallback for unknown types. Pure docs change. ruff clean; public-surface contract tests pass (the existing drift guards already covered cross-doc consistency for the wedge positioning, gating signal, and schema version — none of which this edit touches). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-22T22:37:42Z

Doc fix landed in commit bdbcae1 — STABILITY.md "Third-party adapter discovery (v0.20+)" subsection rewritten to match the code as merged on this branch.

You're right — two prose claims were stale relative to the review fixes:

"register on the live REGISTRY" was wrong since commit 14b943a (the per-scan-clone fix for P1 Bump actions/checkout from 4.3.1 to 6.0.2 #1 + P1 Bump actions/github-script from 7.1.0 to 9.0.0 #2). Adapters now register on a per-scan AdapterRegistry.clone() built at the start of each run_scan / inspect_sources. The global REGISTRY stays builtin-only across the process lifetime.
"captures runtime exceptions … for callers that wrap the invocation themselves" was wrong since commit 943afda (the dispatcher-managed wrapper for P1 Bump actions/checkout from 4.3.1 to 6.0.2 #1 review-round-3). The dispatcher in _load_sources now routes EVERY third-party adapter load() call through run_validated_adapter (both pass-1 per-source and pass-2 per-scan loops). Callers do not opt in.

Rewritten section now documents:

Per-scan registry contract with the two invariants it guarantees (--no-plugins honest across scans; collision detection honest across scans).
Dispatcher trust mechanisms, separated: artifact-class smuggling prevention (existing _absorb TypeError, applies to all adapters) and runtime-error capture (new third-party-only path through run_validated_adapter). Built-ins keep the direct call shape — a built-in raising means the scanner is broken and must abort loudly; this is now stated explicitly.
doctor parity: inspect_sources uses the same per-scan registry + discovery + dispatcher path and surfaces loaded_adapters[] in its payload.
bad_protocol gate description tightened to match the P2 Bump actions/setup-python from 5.6.0 to 6.2.0 #4 fix: "at least 3 positional slots (or *args)" and "no required keyword-only parameters" are now both gated, in addition to the "no more than 3 required positional" rule.
New "Manifest tool_sources[].type" paragraph: documents the Literal → str relaxation, the BUILTIN_TOOL_SOURCE_TYPES constant for docs/tools, the BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES rejection at manifest-load (preserving the n8n safety), and the ConfigError (exit 2) dispatch-time fallback for unknown types.

Pure docs change. Ruff clean; public-surface contract tests pass (the existing drift guards already covered cross-doc consistency for the wedge positioning, gating signal, and schema version — none of which this edit touches).

🤖 Generated with Claude Code

The error from `AdapterRegistry.require()` used to say "Add the adapter to `_register_builtin_adapters()`" — contributor-facing and obsolete now that `tool_sources[].type` is open `str` and third-party adapters register via entry points. Agent mode also routed to "edit shipgate.yaml" via `SHIP-DIAG-INVALID-MANIFEST`, but the manifest itself is valid; the user needs to install/enable the matching adapter or fix a typo. Changes: - `inputs/protocol.py`: rewrite `require()` to branch on `_adapter_plugins_enabled(None)` and emit one of two remediations (install the adapter package, or enable discovery + install). - `schemas/diagnostics.py`: add `SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE`. - `cli/diagnostics.py`: add `diagnose_unknown_adapter_source_type()` that produces 2 or 3 ranked next_actions depending on discovery state. - `cli/_helpers.py`: pattern-match the `require()` error prefix in `_diagnose_config_error` and route to the new diagnostic instead of `SHIP-DIAG-INVALID-MANIFEST`. - Regression tests: `test_unknown_adapter_source_type_routes_to_install_enable_diagnostic` asserts agent-mode JSON `next_actions[0].kind == "command"` (not "edit") with `AGENTS_SHIPGATE_ENABLE_PLUGINS=1`; `test_require_error_message_is_user_facing` pins the error string away from the contributor-facing wording. - `docs/diagnostics.md`: add catalog entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

pengfei-threemoonslab · 2026-05-22T23:05:40Z

Round-4 review fix: user-facing unknown-adapter remediation

[P2] The unknown adapter error now gives the wrong recovery path for the new third-party adapter flow. Since tool_sources[].type is now open str for custom adapter names at [manifest.py:101], a valid third-party manifest can reach AdapterRegistry.require() when the adapter package is missing or plugins are not enabled. The error at [protocol.py:197] still says to add the adapter to _register_builtin_adapters(), and agent mode routes to "edit shipgate.yaml." That is wrong for the new public extension path; it should mention installing/enabling the third-party adapter, AGENTS_SHIPGATE_ENABLE_PLUGINS=1, or fixing a typo.

Fixed in 92c4488. The fix is two-layer:

1. AdapterRegistry.require() now emits a user-facing message. It branches on _adapter_plugins_enabled(None):

plugins enabled → "install the third-party adapter package" / "fix a typo"
plugins disabled → "set AGENTS_SHIPGATE_ENABLE_PLUGINS=1" / "install the package" / "fix a typo"

No more "Add the adapter to _register_builtin_adapters()".

2. Agent mode no longer routes to "edit shipgate.yaml". _diagnose_config_error in cli/_helpers.py now pattern-matches the require() error prefix and routes to a new diagnostic, SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE, instead of falling through to SHIP-DIAG-INVALID-MANIFEST. The new diagnostic emits 2 (plugins enabled) or 3 (plugins disabled) ranked next_actions, with the rank-1 being:

plugins enabled: kind=command, command="pip install <third-party-adapter-package>"
plugins disabled: kind=command, command="AGENTS_SHIPGATE_ENABLE_PLUGINS=1 agents-shipgate scan -c <path>"

The manifest is valid; kind=edit is rank-N (only for the typo case), never rank-1.

Regression tests in tests/test_adapter_entry_point_discovery.py:

test_unknown_adapter_source_type_routes_to_install_enable_diagnostic — drives a scan with type: not_a_real_adapter, captures the agent-mode error JSON on stderr, asserts next_actions[0].kind == "command", the command contains AGENTS_SHIPGATE_ENABLE_PLUGINS=1, and the diagnostic id is SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE (not SHIP-DIAG-INVALID-MANIFEST).
test_require_error_message_is_user_facing — pins str(exc) to not contain _register_builtin_adapters and to contain AGENTS_SHIPGATE_ENABLE_PLUGINS and "typo".

Docs: docs/diagnostics.md gets a catalog row for the new ID, marked v0.20+.

Verification: 1772 passed, 4 skipped; ruff clean.

pengfei-threemoonslab and others added 2 commits May 22, 2026 13:11

Address PR #111 review round 5 diagnostics

90c9d86

pengfei-threemoonslab merged commit ef1b59c into main May 22, 2026
1 check passed

pengfei-threemoonslab deleted the feat/v020-adapter-entry-points branch May 22, 2026 23:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add third-party adapter entry-point discovery (v0.20 E4)#111

Add third-party adapter entry-point discovery (v0.20 E4)#111
pengfei-threemoonslab merged 6 commits into
mainfrom
feat/v020-adapter-entry-points

pengfei-threemoonslab commented May 22, 2026

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pengfei-threemoonslab commented May 22, 2026

Summary

Why

How

Report surface

--strict-plugins (extended)

Runtime safety net

Test plan

Out of scope (intentional)

Uh oh!

pengfei-threemoonslab commented May 22, 2026

P1 #1 + P1 #2 — Per-scan registry (was: global mutation)

P1 #3 — ToolSourceConfig.type opened from Literal to str

P2 #4 — Tighter signature validation

Verification

Uh oh!

pengfei-threemoonslab commented May 22, 2026

P1 #1 — Runtime failures bypassed the provenance contract

P1 #2 — doctor couldn't introspect third-party source types

P2 #3 — Markdown reports hid adapter validation failures

Verification

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Uh oh!

pengfei-threemoonslab commented May 22, 2026

Round-4 review fix: user-facing unknown-adapter remediation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`--strict-plugins` (extended)

P1 #3 — `ToolSourceConfig.type` opened from `Literal` to `str`

P1 #2 — `doctor` couldn't introspect third-party source types