Merged
68 changes: 68 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,74 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Added

- **BigQuery-table bundle mirror** in
  `bigquery_agent_analytics.extractor_compilation.bq_bundle_mirror` and
  [`docs/extractor_compilation_bq_bundle_mirror.md`](docs/extractor_compilation_bq_bundle_mirror.md).
  Issue [#75](https://github.com/GoogleCloudPlatform/BigQuery-Agent-Analytics-SDK/issues/75),
  PR C2.c.3: publishes compiled bundles to a BigQuery table and syncs
  them back into a local directory for C2.a's existing loader. The
  runtime path stays `sync_bundles_from_bq → discover_bundles →
  from_bundles_root`; the mirror is a utility, not a runtime loader.
  Public surface: `publish_bundles_to_bq(bundle_root, store,
  bundle_fingerprint_allowlist=None)` and `sync_bundles_from_bq(store,
  dest_dir, bundle_fingerprint_allowlist=None)`. Both call
  `load_bundle` as a gate: publish refuses bundles that wouldn't load
  at runtime; sync refuses bundles whose reconstruction the loader
  rejects, scrubbing any partial directory it wrote. Sync writes each
  fingerprint to a side-by-side **staging directory** and runs
  `load_bundle` on the staged copy before performing a **staged
  replace** of the target. The rmtree+move pair is not strictly
  atomic (a crash between the two leaves the bundle absent on disk,
  recoverable by re-sync), but the load-bundle-failure direction *is*
  atomic, so a bad mirror row never destroys a previously-good local
  bundle.
  Strict bundle-shape check: the table stores exactly two rows per
  fingerprint (`manifest.json` plus the manifest's `module_filename`).
  Failure codes are stable strings; per-bundle problems land in
  `failures` instead of raising, while store exceptions (BQ-side:
  network, auth, missing table) propagate.
  - `unexpected_file` rejects any row beyond the two expected per
    fingerprint.
  - `manifest_row_unreadable` surfaces when the manifest's own
    `module_filename` fails the sync-time shape check (bare filename
    only: no separators, no `..`, no NUL), instead of raising
    `FileNotFoundError` at the write step.
  - `invalid_bundle_path` rejects traversal / absolute / backslash /
    NUL paths before writing to disk.
  - `duplicate_row` rejects two rows sharing the same `(fingerprint,
    bundle_path)` (BigQuery has no unique constraint; the mirror
    enforces uniqueness at sync).
  - `duplicate_fingerprint` rejects publish-side cases where two
    subdirs of `bundle_root` claim the same manifest fingerprint;
    neither is published, so the table can't end up with logical
    duplicates.
  - `malformed_row` rejects rows with wrong field types.
  Idempotent republish via DELETE+INSERT in
  `BigQueryBundleStore.publish_rows`: re-publishing the same
  fingerprint replaces the prior rows rather than accumulating
  duplicates. The DELETE + `insert_rows_json` are NOT a single atomic
  transaction; a transient INSERT failure leaves rows missing until
  the caller re-runs publish (recoverable; documented in the class
  docstring). `publish_rows` also raises `ValueError` on duplicate
  `(fingerprint, bundle_path)` input pairs as defense in depth.
  `BundleStore` is a Protocol so tests can pass in-memory fakes;
  `BigQueryBundleStore` is the concrete implementation wrapping
  `google.cloud.bigquery`. `BUNDLE_MIRROR_TABLE_SCHEMA` is exported
  for callers who need to create the table themselves (or
  `BigQueryBundleStore.ensure_table()` can create it idempotently).
  Out of scope: GCS-backed signed-URL fetch, caching / TTL, garbage
  collection, multi-region replication.
- **Revalidation harness for compiled structured extractors**
in
`bigquery_agent_analytics.extractor_compilation.revalidation`
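The staged-replace behaviour the bundle-mirror entry describes can be sketched in isolation. This is a minimal sketch, not the SDK's implementation: `staged_replace`, its `files` mapping, and the `validate` callable are hypothetical stand-ins (the real sync runs `load_bundle` at the validation point).

```python
import os
import shutil
import tempfile


def staged_replace(dest_dir: str, fingerprint: str,
                   files: dict[str, bytes], validate) -> bool:
    """Write a synced bundle to a side-by-side staging directory,
    validate the staged copy, then replace the target.

    A validation failure never touches a previously-good local bundle;
    the rmtree+move pair itself is not atomic (a crash between the two
    leaves the bundle absent on disk, recoverable by re-sync).
    """
    target = os.path.join(dest_dir, fingerprint)
    staging = tempfile.mkdtemp(prefix=f".{fingerprint}.staging-", dir=dest_dir)
    try:
        for rel_path, payload in files.items():
            with open(os.path.join(staging, rel_path), "wb") as fh:
                fh.write(payload)
        if not validate(staging):      # the load_bundle gate on the staged copy
            return False               # target untouched: this direction is atomic
        if os.path.isdir(target):
            shutil.rmtree(target)      # crash window starts here...
        os.replace(staging, target)    # ...and ends here
        return True
    finally:
        if os.path.isdir(staging):
            shutil.rmtree(staging)     # scrub any partial staging directory
```

A failed `validate` leaves the existing bundle on disk untouched, which is the property the entry calls the "load-bundle-failure direction".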
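The DELETE+INSERT idempotency and the `BundleStore` Protocol lend themselves to the kind of in-memory fake the entry says tests can pass in. The Protocol shape here is an assumption: `publish_rows` is named in the changelog, but `fetch_rows` and the row dictionaries are hypothetical.

```python
from typing import Protocol


class BundleStore(Protocol):
    """Hypothetical minimal shape of the BundleStore Protocol."""
    def publish_rows(self, fingerprint: str, rows: list[dict]) -> None: ...
    def fetch_rows(self, fingerprint: str) -> list[dict]: ...


class InMemoryBundleStore:
    """In-memory fake mirroring publish_rows' DELETE+INSERT semantics:
    re-publishing a fingerprint replaces its prior rows instead of
    accumulating duplicates."""

    def __init__(self) -> None:
        self._rows: dict[str, list[dict]] = {}

    def publish_rows(self, fingerprint: str, rows: list[dict]) -> None:
        # Defense in depth: reject duplicate (fingerprint, bundle_path)
        # input pairs before touching the table.
        paths = [row["bundle_path"] for row in rows]
        if len(paths) != len(set(paths)):
            raise ValueError(f"duplicate bundle_path in rows for {fingerprint}")
        self._rows[fingerprint] = list(rows)  # DELETE prior rows + INSERT new

    def fetch_rows(self, fingerprint: str) -> list[dict]:
        return list(self._rows.get(fingerprint, []))
```

Unlike the real `BigQueryBundleStore`, the fake's replace is atomic; the changelog notes the BQ-side DELETE + `insert_rows_json` pair is not.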
1 change: 1 addition & 0 deletions docs/README.md
@@ -51,6 +51,7 @@ architecture, rationale, and implementation plans behind key SDK features.
| [extractor_compilation_runtime_fallback.md](extractor_compilation_runtime_fallback.md) | Runtime fallback wiring for compiled structured extractors (issue #75 PR C2.b): `run_with_fallback(...)` returning `FallbackOutcome` (`decision` is one of `compiled_unchanged` / `compiled_filtered` / `fallback_for_event`). Validates compiled output via #76; on per-element failures drops just the offending nodes / edges (with orphan cleanup) AND downgrades the event's span from `fully_handled` to `partially_handled` so the AI transcript still sees the source span. EVENT-scope, exception, wrong-type, and unpinpointable failures all trigger whole-event fallback. Does not validate fallback output; fallback exceptions propagate. Orchestrator call-site swap is C2.c. |
| [extractor_compilation_runtime_registry.md](extractor_compilation_runtime_registry.md) | Runtime extractor-registry adapter (issue #75 PR C2.c.1): `build_runtime_extractor_registry(...)` glues C2.a's `discover_bundles` + C2.b's `run_with_fallback` into one call, returning a `WrappedRegistry` with an `extractors` dict ready for `run_structured_extractors` plus `bundles_without_fallback` (compiled-only, skipped) and `fallbacks_without_bundle` (no usable compiled registry entry — "never built" *and* "rejected by discovery"; cross-reference `discovery.failures` for the reason). Compiled-only event_types are skipped and recorded (fail-closed); fallback-only event_types pass through unchanged. Non-callable fallbacks are rejected at build time with `TypeError` naming the event_type. The `on_outcome(event_type, outcome)` callback fires on every wrapped invocation (denominator metric); callback exceptions propagate. Out of scope: actual orchestrator call-site swap (C2.c.2), BQ mirror (C2.c.3), revalidation (C2.d). |
| [extractor_compilation_orchestrator_swap.md](extractor_compilation_orchestrator_swap.md) | Orchestrator call-site swap (issue #75 PR C2.c.2): `OntologyGraphManager.from_bundles_root(...)` classmethod that builds the runtime registry internally and constructs a manager whose `extractors` dict is the wrapped registry, so existing `run_structured_extractors` calls inside `extract_graph` pick up compiled-with-fallback behavior with no other code changes. Adds `manager.runtime_registry: WrappedRegistry | None` audit handle (non-None when bundle-wired). Mirrors `from_ontology_binding` arg shape; existing `__init__` and `from_ontology_binding` paths are unchanged. Compiled-only event_types without a matching fallback are NOT registered (fail-closed). Out of scope: BQ mirror (C2.c.3), revalidation (C2.d). |
| [extractor_compilation_bq_bundle_mirror.md](extractor_compilation_bq_bundle_mirror.md) | BigQuery-table bundle mirror (issue #75 PR C2.c.3): `publish_bundles_to_bq(bundle_root, store, ...)` + `sync_bundles_from_bq(store, dest_dir, ...)`. Mirror is a publish/sync utility, NOT a runtime loader; the runtime path stays `sync_bundles_from_bq → discover_bundles → from_bundles_root`. Both functions call `load_bundle` as a gate: publish refuses bundles that wouldn't load at runtime; sync writes to a side-by-side **staging directory** and `load_bundle`-validates the staged copy before performing a **staged replace** of the target (the rmtree+move pair is not strictly atomic; a crash between the two leaves the bundle absent on disk and is recoverable by re-sync, but the load-bundle-failure direction *is* atomic, so a bad mirror row never destroys a previously-good local bundle). Strict bundle-shape check (exactly `manifest.json` + the manifest's `module_filename`) plus a shape check on the manifest's `module_filename` (bare filename only: no separators, no `..`, no NUL; otherwise `manifest_row_unreadable`). Path-safety rejects traversal / absolute / backslash / NUL. `duplicate_fingerprint` rejects publish-side cases where two subdirs claim the same fingerprint (neither published). `duplicate_row` rejects two rows sharing the same `(fingerprint, bundle_path)` at sync. `malformed_row` shape check. Idempotent republish via DELETE+INSERT in `BigQueryBundleStore.publish_rows` (NOT a single atomic transaction; a transient INSERT failure is recoverable by re-running publish). `publish_rows` raises `ValueError` on duplicate input pairs as defense in depth. `BundleStore` Protocol for testability; `BigQueryBundleStore` is the concrete impl. Stable `MirrorFailure` codes; per-bundle problems accumulate, store exceptions propagate. Out of scope: GCS signed URLs, caching, garbage collection, multi-region. |
| [extractor_compilation_revalidation.md](extractor_compilation_revalidation.md) | Revalidation harness (issue #75 PR C2.d): `revalidate_compiled_extractors(events, compiled_extractors, reference_extractors, resolved_graph, ...)` drives `run_with_fallback` (with a no-op fallback) over a batch of events AND calls the reference extractor directly, aggregating outcomes into a `RevalidationReport` with **two orthogonal dimensions**: runtime decision (`compiled_unchanged` / `compiled_filtered` / `fallback_for_event`, plus `compiled_path_faults` split out so bundle bugs are distinguishable from ontology drift) and agreement against reference (`parity_match` / `parity_divergence` / `parity_not_checked`). Parity uses three comparators: `_compare_nodes` and `_compare_span_handling` from `measurement.py` plus `_compare_edges` in `revalidation.py` (same edge_id set with matching relationship_name / endpoints / property-set per shared edge; duplicate edge_ids on either side reported as a divergence rather than silently collapsed by dict keying). The parity dimension catches **schema-valid but semantically wrong** outputs the schema-only check would miss. **Every failure mode on the reference side becomes a parity divergence, never a batch abort**: exceptions, non-`StructuredExtractionResult` returns (including `None`), and comparator crashes all funnel into the divergence channel with a descriptive string. `check_thresholds(report, RevalidationThresholds(...))` evaluates policy gates; threshold rates are validated to `[0, 1]` at construction so a typo like `=5` (intended as 5%) fails loud. JSON-serializable for persistence; deterministic. Out of scope: scheduled orchestration, BQ persistence, CLI, sampling strategy. |
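The path-safety and `module_filename` shape checks the bundle-mirror row describes reduce to a few string predicates. A minimal sketch; `is_safe_bundle_path` and `is_bare_module_filename` are hypothetical helper names, not the SDK's API.

```python
def is_safe_bundle_path(bundle_path: str) -> bool:
    # invalid_bundle_path: reject traversal / absolute / backslash / NUL
    # paths before anything is written to disk.
    if not bundle_path or "\x00" in bundle_path or "\\" in bundle_path:
        return False
    if bundle_path.startswith("/"):
        return False
    return all(part not in ("", ".", "..") for part in bundle_path.split("/"))


def is_bare_module_filename(name: str) -> bool:
    # Shape check on the manifest's module_filename: bare filename only,
    # with no separators, no "..", and no NUL.
    return (bool(name) and "\x00" not in name and "/" not in name
            and "\\" not in name and name not in (".", ".."))
```

Running the filename check at sync time is what lets a bad `module_filename` surface as `manifest_row_unreadable` rather than a `FileNotFoundError` at the write step.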
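The revalidation row's note that threshold rates are validated to `[0, 1]` at construction can be illustrated with a frozen dataclass. `Thresholds` and its field names are hypothetical stand-ins for `RevalidationThresholds`, whose real fields the table does not enumerate.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Thresholds:
    """Sketch of construction-time validation: every rate must lie in
    [0, 1], so a typo like 5 (meant as 5%) fails loud."""
    max_fallback_rate: float = 0.0
    max_parity_divergence_rate: float = 0.0

    def __post_init__(self) -> None:
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name}={value!r} must be within [0, 1]")
```

Failing in `__post_init__` means the policy gate can never be evaluated against a nonsensical rate.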

## Deployment Surfaces