Add third-party adapter entry-point discovery (v0.20 E4)#111
Conversation
Closes E4 from the round-3 architecture review: opens the same
extension surface for adapters (input loaders) that M5 already
opened for check plugins. Discovery is gated by the existing
AGENTS_SHIPGATE_ENABLE_PLUGINS=1 env var and --no-plugins CLI flag,
so a single --no-plugins blocks both classes of third-party
extensions.
Entry-point group: ``agents_shipgate.adapters``. A third-party
package declares an adapter class (or instance) under
``[project.entry-points."agents_shipgate.adapters"]``; the class
must satisfy the ToolSourceAdapter Protocol — ``source_type``,
``scope`` (per_source or per_scan), ``artifact_class`` ClassVars
and a ``load(source, base_dir, manifest)`` method returning
LoadedAdapterResult.
Four load-time gates in ``inputs/adapter_validation.py``:
1. ``load_failed`` — entry_point.load() raised.
2. ``bad_protocol`` — loaded value not instantiable, missing
required ClassVars, source_type not a non-empty string, or load
method has wrong arity.
3. ``bad_scope`` — scope not in {per_source, per_scan}.
4. ``source_type_collision`` — source_type shadows a built-in
(mcp, openapi, langchain, etc.) or another already-registered
third-party adapter. This is the load-bearing trust rule —
without it a malicious plugin could displace ``mcp`` and
intercept every scan that loads an MCP source.
New top-level discovery API in inputs/protocol.py:
``discover_third_party_adapters(registry, *, plugins_enabled,
loaded_adapters)`` walks entry_points("agents_shipgate.adapters"),
validates each entry, registers the valid ones onto the supplied
registry, and appends per-entry info dicts to loaded_adapters[].
Invalid entries are NOT registered but still appear in
loaded_adapters[] so reviewers see what was skipped.
Report wiring (v0.20 additive within the existing v0.20 schema):
- New ``ReadinessReport.loaded_adapters: list[dict[str, Any]]``
field parallel to loaded_plugins[]. Required + present on every
emitted scan (empty list when --no-plugins or no third-party
adapters installed). Item required fields pinned at the schema
level: name, value, distribution, version, source_type,
validation_status, validation_errors, runtime_errors.
- Threaded through ``_LoadedInputs``, ``_SanitizedSurfaces``, and
``build_report``. Sanitization runs through redact_data with
path="loaded_adapters[]" (entry-point ``value`` strings are
first-party and don't carry secrets, but the audit envelope
flows through redaction for forward-compat with future adapter-
emitted fields).
- ``cli/scan.py:_load_inputs`` now accepts plugins_enabled and
invokes discovery BEFORE _load_sources so the dispatcher
resolves third-party source_types alongside built-ins.
--strict-plugins extended (v0.20):
The existing --strict-plugins flag now also exits 4 on any
non-``valid`` loaded_adapters[] row or non-empty runtime_errors.
``_apply_strict_plugins`` in cli/_helpers.py concatenates plugin
and adapter failure messages and tags them ``plugin``/``adapter``
respectively in the human-readable lines. --no-plugins help text
updated to mention adapter discovery is also disabled.
Runtime safety net:
``run_validated_adapter`` in inputs/adapter_validation.py wraps
adapter.load() and captures: exceptions, wrong return type, and
artifact-class smuggling (declared artifact_class but returned
artifact of another type). Mirrors the Finding.check_id smuggling
rule from plugin_validation. The existing dispatcher _absorb
artifact-class check fires TypeError for built-ins; this wrapper
is opt-in for future adapter-execution paths.
STABILITY.md gains "Third-party adapter discovery (v0.20+)"
subsection documenting the four gates + the load-bearing
source_type_collision rule.
Tests (21 new in tests/test_adapter_entry_point_discovery.py):
- valid third-party adapter end-to-end (class form, instance form)
- each of the four gates exercised in isolation
- source_type_collision parametrized across 4 built-in source
types (mcp, openapi, langchain, validation)
- source_type_collision between two third parties
- gating by AGENTS_SHIPGATE_ENABLE_PLUGINS env (default off)
- gating by --no-plugins overriding env (forces off)
- strict-plugins exits 4 end-to-end via CliRunner with
the broken-adapter shape
- run_validated_adapter catches exceptions, rejects wrong
return type, rejects smuggled artifact
Verification: 1257 tests pass; ruff clean; coverage 88.00%; schema
roundtrip --check exits 0; sample-report parity preserved (the
new loaded_adapters: [] field shows up in fresh scans but the
test_sample_expected_report_json_is_current pins only
report_schema_version + release_decision.decision, so existing
goldens stay valid).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… signature gate Four findings from PR #111 review — three P1s and one P2. All four addressed; 7 regression tests added (now 28 total in test_adapter_entry_point_discovery.py). P1 #1 — --no-plugins did not disable adapters after a prior enabled scan. Pre-fix, ``discover_third_party_adapters`` mutated the module-global ``REGISTRY``. A later in-process scan with ``plugins_enabled=False`` skipped discovery but ``_load_sources`` still resolved the already-registered adapter through that same global. Reproduced: ``report.loaded_adapters == []`` on scan two but the adapter still ran. Trust boundary violated. P1 #2 — repeated scans misclassified stable third-party adapters as ``source_type_collision``. Same root cause: collision detection used ``registry._adapters.keys()`` AFTER scan-one's third-party registration had mutated the global. Scan two thus saw the same adapter as a "built-in" collision while still executing it. Fix for both: per-scan registry clone. - New ``AdapterRegistry.clone()`` returns a shallow-copy registry with a fresh ``_adapters`` dict. Subsequent ``register()`` calls on the clone do NOT propagate to the global. - ``_load_inputs`` now builds a per-scan registry via ``scan_registry = REGISTRY.clone()``, runs discovery against ``scan_registry``, and threads ``scan_registry`` through to ``_load_sources`` via a new ``registry=`` kwarg. - ``_load_sources`` accepts the per-scan registry, falling back to the global only for legacy callers that bypass ``_load_inputs`` (existing ``tests/test_adapter_registry.py`` patterns). - Global ``REGISTRY`` stays builtin-only across the process. Tests that monkeypatch ``REGISTRY._adapters`` keep working because the clone captures the global's state at scan-start (after monkeypatch). Two new regression tests pin the fix: - ``test_no_plugins_disables_third_party_after_prior_enabled_scan``: scan-one runs with plugins enabled (adapter runs once); scan-two in the same process with ``plugins_enabled=False`` must see ``loaded_adapters == []`` AND the adapter must NOT run a second time. Pre-fix the invocation counter incremented; post-fix it stays at 1. - ``test_second_scan_does_not_misclassify_third_party_as_collision``: two consecutive scans with plugins enabled must both report ``validation_status == "valid"`` on the same adapter. Pre-fix scan two reported ``source_type_collision``. The autouse ``_reset_registry_after_test`` fixture is no longer needed — discovery doesn't mutate the global — and is replaced with a minimal ``_populate_registry`` fixture that just triggers lazy population. P1 #3 — per_source third-party adapters could not be referenced in shipgate.yaml. ``ToolSourceConfig.type`` was a closed ``Literal`` of built-in source types, so a manifest declaring ``type: my_third_party_source`` failed Pydantic validation BEFORE discovery had a chance to register the loader. The v0.20 third-party adapter surface was unusable for its main advertised use case. Fix: - ``ToolSourceConfig.type`` relaxed from ``Literal[...]`` to ``str``. - New module-level constant ``BUILTIN_TOOL_SOURCE_TYPES`` enumerates the seven built-in source types for documentation and for tests that need to iterate the curated set. - New constant ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` ({n8n, openai_api, anthropic_api, validation}) preserves the existing rejection: those four built-ins are per_scan-only and ingest config from a dedicated top-level manifest section, so they MUST NOT appear in ``tool_sources``. A new ``reject_per_scan_only_builtins`` model validator on ``ToolSourceConfig`` raises ``ValueError`` at manifest-load with a routable error pointing the user to the top-level section. - Unknown source types (typos like ``n8N``, genuinely-third-party types without a registered adapter) still fail at scan time with ``ConfigError`` from ``AdapterRegistry.require``. Exit-2 contract unchanged. - ``tests/test_n8n.py::test_n8n_top_level_config_is_accepted_and_tool_source_type_is_rejected`` updated: the case-variation (``type: n8N``) assertion now uses ``run_scan`` (dispatcher-time error) instead of ``load_manifest`` (parse-time error). Same exit code, different layer. - ``tests/test_adapter_registry.py`` + ``tests/harness/test_overlay_renderer.py`` updated to iterate ``BUILTIN_TOOL_SOURCE_TYPES`` instead of ``get_args(ToolSourceConfig.type)``. - New regression test ``test_per_source_third_party_adapter_referenced_from_manifest``: builds a manifest with ``type: demo_source``, registers a matching third-party per_source adapter via entry-points, asserts ``run_scan`` succeeds and the adapter actually runs. The JSON manifest schema (``docs/manifest-v0.1.json``) loses the ``enum`` constraint on ``tool_sources[].type`` — schema is now ``{"type": "string"}``. IDE autocomplete for built-in names is a follow-up; the trust gate is the dispatcher, not the schema. P2 #4 — signature validator accepted methods that crash the dispatcher. ``_protocol_error`` rejected only >3 required positional and zero positional. Two failure modes slipped through: - ``def load(self, source)`` — only 1 positional slot; dispatcher's ``adapter.load(source, base_dir, manifest)`` call fails with ``TypeError: too many positional arguments``. - ``def load(self, source, base_dir, manifest, *, must_set)`` — required keyword-only param; dispatcher passes no kwargs. Both now caught at the ``bad_protocol`` gate before registration. ``_protocol_error`` now: - Computes ``accepting_slots`` = count of (required + optional) positional, and ``has_var_positional`` = whether ``*args`` is present. - Rejects when ``accepting_slots < 3 AND not has_var_positional``. - Rejects any required keyword-only parameter (mirrors plugin validation). Four new regression tests: - ``test_gate_bad_protocol_load_too_few_positional``: ``load(self, source)`` lands in ``bad_protocol`` with "at least 3 positional" in the error. - ``test_gate_bad_protocol_load_required_keyword_only``: required kw-only lands in ``bad_protocol`` with "keyword-only" + the param name in the error. - ``test_gate_bad_protocol_accepts_var_positional``: ``load(self, *args)`` is valid (can bind 3 args). - ``test_gate_bad_protocol_accepts_optional_keyword_only``: optional kw-only with default is valid. Verification: 1264 tests pass; ruff clean; schema roundtrip --check exits 0; coverage 88.02%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
All four findings addressed in commit P1 #1 + P1 #2 — Per-scan registry (was: global mutation)You're right — mutating the module-global Fix: new Two regression tests pin the fix:
The autouse P1 #3 —
|
Three more findings from PR #111 review — two P1s and one P2. The v0.20 third-party adapter surface now has parity across all three consumers (scan dispatcher, doctor introspection, markdown report) and the documented lenient/strict contract holds end-to-end. 4 new regression tests, 32 total in test_adapter_entry_point_discovery.py. P1 #1 — adapter runtime failures bypassed loaded_adapters.runtime_errors. The dispatcher in ``_load_sources`` called ``adapter.load(...)`` directly for ALL adapters. ``run_validated_adapter`` existed in ``inputs/adapter_validation.py`` but was never invoked during a real scan. A third-party adapter that raised at runtime aborted ``run_scan`` with the raw exception, no report was emitted, and ``--strict-plugins`` never saw the failure. Contradicted the documented "lenient mode records the failure, continues" contract. Fix: - ``_load_inputs`` now builds a ``third_party_records: dict[str, LoadedAdapter]`` map from discovery (one entry per valid third-party adapter, keyed by source_type) and threads it to ``_load_sources(..., third_party_records=...)``. - ``_load_sources`` (pass 1 / pass 2) checks each adapter's source_type against the map. Built-ins keep the direct ``adapter.load(...)`` call shape — a built-in raising is a scanner bug and must abort loudly. Third-party adapters route through ``run_validated_adapter`` instead, which catches every exception, captures it into ``LoadedAdapter.runtime_errors`` (and mirrors into the matching ``loaded_adapters[].runtime_errors`` dict), and returns ``None`` so the dispatcher skips ``_absorb``. - ``_invoke_per_source_adapter`` learned a ``third_party_record`` kwarg that switches to the wrapper. Built-in optional-source ``InputParseError`` handling is preserved on the built-in branch. Result: lenient-mode third-party crashes are captured, the scan completes, the report contains the failed adapter row with ``runtime_errors`` populated, and ``--strict-plugins`` exits 4 as documented. P1 #2 — doctor couldn't introspect manifests using third-party source types. ``inspect_sources`` (the doctor command's manifest- introspection entry point) called ``_load_sources`` against the global ``REGISTRY`` (which is intentionally builtin-only since the PR #111 P1 #1+#2 per-scan-clone refactor) and did NOT run adapter discovery. So a manifest with ``tool_sources[].type: demo_source`` that scanned cleanly now made ``doctor`` crash with ``ConfigError: No adapter registered for source type 'demo_source'``. Fix: - ``inspect_sources`` mirrors the ``_load_inputs`` pattern: build ``scan_registry = REGISTRY.clone()``, run ``discover_third_party_adapters(scan_registry, …)`` (honoring the same ``plugins_enabled`` env / override semantics), pass the per-scan registry and ``third_party_records`` to ``_load_sources``. - New ``plugins_enabled: bool | None = None`` kwarg defaults to ``None`` so the existing env var governs (no doctor CLI flag change required). - ``loaded_adapters`` surfaced in the doctor payload alongside ``policy_packs`` so an operator can confirm what was discovered without running a full scan. P2 #3 — Markdown reports hid adapter validation failures. ``report/markdown.py`` rendered ``loaded_policy_packs`` and ``loaded_plugins`` but had no equivalent ``loaded_adapters`` section. A ``load_failed`` adapter appeared in ``report.json`` but not in ``report.md`` — the default human report was blind to skipped third-party extensions. Fix: - New ``_append_loaded_adapters(lines, report)`` renders a ``## Loaded Adapters`` section listing each entry as ``- distribution version: source_type — \`validation_status\`` followed by per-error indented bullets (validation_error and runtime_error). Hidden when ``loaded_adapters`` is empty so clean repos see no extra noise. - Wired into the main rendering flow right after ``_append_loaded_plugins``. Regression tests (4 new in tests/test_adapter_entry_point_discovery.py): - ``test_runtime_error_in_third_party_adapter_captured_not_propagated`` — a third-party adapter that raises ``RuntimeError`` at runtime doesn't abort ``run_scan``; the failure lands in ``loaded_adapters[].runtime_errors``; the scan exits 0. - ``test_strict_plugins_exits_on_third_party_adapter_runtime_error`` — end-to-end via CliRunner: ``--strict-plugins`` elevates exit code 4 specifically on the new adapter-runtime-error path. - ``test_doctor_resolves_third_party_source_types`` — a manifest with ``type: demo_source`` and a matching third-party adapter is introspected successfully by ``inspect_sources``; the adapter runs; the doctor payload surfaces ``loaded_adapters[]``. - ``test_markdown_report_renders_loaded_adapters_section`` — a ``load_failed`` adapter shows up in ``report.md`` under the ``## Loaded Adapters`` heading with its status and validation error. Verification: 1268 tests pass; ruff clean; schema roundtrip --check exits 0; coverage 88.06%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Round-3 review findings addressed in commit P1 #1 — Runtime failures bypassed the provenance contractYou're right — Fix:
Two regression tests pin the fix:
P1 #2 —
|
…aged runtime wrapper (PR #111 review) Two prose claims in the v0.20 "Third-party adapter discovery" subsection were stale after the round-2 and round-3 review fixes: 1. "register on the live REGISTRY" — wrong since commit 14b943a. Adapters now register on a per-scan ``AdapterRegistry.clone()`` built at the start of each ``run_scan`` / ``inspect_sources``. The global ``REGISTRY`` stays builtin-only across the process. 2. "captures runtime exceptions … for callers that wrap the invocation themselves" — wrong since commit 943afda. The dispatcher in ``_load_sources`` now routes EVERY third-party adapter ``load()`` call through ``run_validated_adapter`` (both pass 1 per-source and pass 2 per-scan loops). Callers do not need to opt in. Rewrote the section to: - Document the per-scan registry contract explicitly and the two invariants it guarantees (``--no-plugins`` honest across scans; collision detection honest across scans). - Document the dispatcher's two trust mechanisms separately: artifact-class smuggling prevention (already covered for built-ins via ``_absorb`` ``TypeError``) and runtime-error capture (new third-party-only path through ``run_validated_adapter``). - Note that ``doctor`` (``inspect_sources``) uses the same per-scan registry + discovery + dispatcher path as ``scan`` and surfaces ``loaded_adapters[]`` in its payload. - Tighten the bad_protocol gate description to match the v0.20 PR #111 P2 #4 fix: "at least 3 positional slots (or *args)" and "no required keyword-only parameters" are now both gated, in addition to the "no more than 3 required positional" rule. - Add a new "Manifest ``tool_sources[].type``" paragraph documenting the Literal→str relaxation, the ``BUILTIN_TOOL_SOURCE_TYPES`` constant for documentation, the ``BUILTIN_PER_SCAN_ONLY_TOOL_SOURCE_TYPES`` rejection at manifest-load, and the ConfigError(2) dispatch-time fallback for unknown types. Pure docs change. ruff clean; public-surface contract tests pass (the existing drift guards already covered cross-doc consistency for the wedge positioning, gating signal, and schema version — none of which this edit touches). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Doc fix landed in commit You're right — two prose claims were stale relative to the review fixes:
Rewritten section now documents:
Pure docs change. Ruff clean; public-surface contract tests pass (the existing drift guards already covered cross-doc consistency for the wedge positioning, gating signal, and schema version — none of which this edit touches). 🤖 Generated with Claude Code |
The error from `AdapterRegistry.require()` used to say "Add the adapter to `_register_builtin_adapters()`" — contributor-facing and obsolete now that `tool_sources[].type` is open `str` and third-party adapters register via entry points. Agent mode also routed to "edit shipgate.yaml" via `SHIP-DIAG-INVALID-MANIFEST`, but the manifest itself is valid; the user needs to install/enable the matching adapter or fix a typo. Changes: - `inputs/protocol.py`: rewrite `require()` to branch on `_adapter_plugins_enabled(None)` and emit one of two remediations (install the adapter package, or enable discovery + install). - `schemas/diagnostics.py`: add `SHIP-DIAG-UNKNOWN-ADAPTER-SOURCE-TYPE`. - `cli/diagnostics.py`: add `diagnose_unknown_adapter_source_type()` that produces 2 or 3 ranked next_actions depending on discovery state. - `cli/_helpers.py`: pattern-match the `require()` error prefix in `_diagnose_config_error` and route to the new diagnostic instead of `SHIP-DIAG-INVALID-MANIFEST`. - Regression tests: `test_unknown_adapter_source_type_routes_to_install_enable_diagnostic` asserts agent-mode JSON `next_actions[0].kind == "command"` (not "edit") with `AGENTS_SHIPGATE_ENABLE_PLUGINS=1`; `test_require_error_message_is_user_facing` pins the error string away from the contributor-facing wording. - `docs/diagnostics.md`: add catalog entry. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Round-4 review fix: user-facing unknown-adapter remediation
Fixed in 92c4488. The fix is two-layer: 1.
No more "Add the adapter to 2. Agent mode no longer routes to "edit shipgate.yaml".
The manifest is valid; Regression tests in
Docs: Verification: 1772 passed, 4 skipped; ruff clean. |
Summary
Closes E4 from the round-3 architecture review (Nov/Dec 2025). Opens the same extension surface for adapters (input loaders) that M5 already opened for plugin checks. Adapters now register through the
agents_shipgate.adaptersentry-point group, gated by the existingAGENTS_SHIPGATE_ENABLE_PLUGINS=1env var and--no-pluginsCLI flag.Why
Third parties that wanted to ship a new input adapter (AutoGen, LlamaIndex, Semantic Kernel, additional ADK languages) had no path: the dispatcher's
REGISTRYwas builtin-only. The round-3 review noted M5 plugin validation made adapter discovery safe to land — this PR is that follow-up.How
Four load-time gates in a new
inputs/adapter_validation.py(parallel tochecks/plugin_validation.py):load_failed—entry_point.load()raised.bad_protocol— missing ClassVars, emptysource_type, or wrongload()arity.bad_scope—scopenot in{per_source, per_scan}.source_type_collision— load-bearing trust rule. Adapter'ssource_typecannot shadow a built-in (mcp,openapi,langchain, …) or another third-party adapter registered earlier in the same scan. Without this rule, a malicious plugin could displacemcpand intercept every scan that loads an MCP source.discover_third_party_adapters(registry, *, plugins_enabled, loaded_adapters)ininputs/protocol.pywalksentry_points("agents_shipgate.adapters"), validates each, registers the valid ones, and appends per-entry info dicts toloaded_adapters[]. Discovery runs before_load_sourcesso the dispatcher resolves third-party source types alongside built-ins.Report surface
New
ReadinessReport.loaded_adapters: list[dict[str, Any]]field — parallel shape toloaded_plugins[], additive within the existing v0.20 schema. Required + present on every emitted scan; empty list when--no-pluginsor no third-party adapters installed. Schema generator pins per-item required fields (name,value,distribution,version,source_type,validation_status,validation_errors,runtime_errors).--strict-plugins(extended)--strict-pluginsnow elevates exit code 4 on EITHER plugin failures (M5) OR adapter failures (this PR). The human-readable lines tag each failure aspluginoradapter.--no-pluginshelp text updated to mention adapter discovery is also disabled.Runtime safety net
run_validated_adapterininputs/adapter_validation.pywrapsadapter.load()and captures (a) exceptions, (b) wrong return type, (c) artifact-class smuggling — adapter declaresartifact_class=Xbut returns an artifact of another type. Mirrors theFinding.check_idsmuggling rule fromplugin_validation. The dispatcher's existing_absorbartifact-class check firesTypeErrorfor built-ins; this wrapper is opt-in for future adapter-execution paths.Test plan
tests/test_adapter_entry_point_discovery.py:source_type_collisionparametrized across 4 built-in source typessource_type_collisionbetween two third partiesAGENTS_SHIPGATE_ENABLE_PLUGINSenv (default off)--no-pluginsoverriding env (forces off)--strict-pluginsexits 4 end-to-end via CliRunnerrun_validated_adaptercatches exceptions, rejects wrong return type, rejects smuggled artifactscripts/generate_schemas.py --checkexits 0tests/test_schema_boundaries.pywire-field-order pin updated for the new fielddocs/report-sensitive-fields.jsoninventory extended for the privacy redaction layerOut of scope (intentional)
_load_sourcesinvokesadapter.load()directly. Wrapping every adapter withrun_validated_adapterwould change the contract for built-ins too; the wrapper is shipped as opt-in infrastructure for future adapter-execution work. The four load-time gates + the existing_absorbartifact-class check already cover the trust-bypass cases.samples/*/expected/report.jsonfiles don't yet carryloaded_adapters: []. The sample-report tests only pinreport_schema_version+release_decision.decision, so they pass; the next fixture regeneration will pick up the new field naturally.🤖 Generated with Claude Code