fix(query): cross-ledger GRAPH queries over indexed data (#1405 bugs 2+3) by Jackamus29 · Pull Request #1425 · fluree/db

Jackamus29 · 2026-07-03T01:05:16Z

Partially addresses #1405 — makes the documented GRAPH-scoped workaround actually work over indexed, multi-ledger, divergent-namespace data. Fixes failures 2 and 3 from the issue. Failure 1 (lifting the union-path guard via cross-snapshot BFS) is a committed follow-up.

Failures fixed

Bug 2 — indexed GRAPH path → internal error. A GRAPH-scoped path over an indexed multi-ledger dataset hit EncodedSid/EncodedPid reached stamp_provenance. DatasetOperator computed multi_ledger from the active graphs only, so a single default graph (alongside named graphs from other ledgers) was treated as single-ledger — the binary store stayed on, its scans emitted late EncodedSid, and those seeded a GRAPH block / crossed a boundary into provenance stamping, which can't decode them. Fix: treat the scan as multi-ledger when the whole dataset spans ledgers, forcing full Binding::Sid materialization (which stamps to IriMatch).

Bug 3 — divergent-namespace cross-graph join/path → silent []. The materialization fix resolves the join/seed cases (bound keys now cross as IriMatch and re-encode). Property paths additionally matched pattern constants (endpoints) and predicates against the primary snapshot's codes rather than the per-GRAPH graph's. Fix: re-encode path predicate + constant-endpoint SIDs into the active graph's dict (reencode_pred + the Ref::Sid arm of resolve_sid), reusing binary_scan's decode-primary/encode-target idiom.

Scope / safety

Single-ledger and single-graph queries unchanged (re-encode round-trips to the same SID; materialization only engages for multi-ledger datasets).
The union-path guard (failure 1) stays in place — its q2 characterization test asserts it still fires.
Tradeoff: multi-ledger datasets now eagerly materialize even a single default-graph scan (a scoped cost), which is the price of correct cross-boundary provenance — consistent with the existing multi-ledger disable.

Tests — `tests/it_multi_graph_property_path.rs`

The q1–q3d repro (adopted from the issue) plus a usage-pattern matrix, all indexed with divergent namespace codes and exact-value assertions:

test	pins
q3a / q3c	bug 2 (indexed GRAPH path) / bug 3 (divergent join)
A1	object-position cross-graph join
A2 / A3	multi-value completeness / precision (include match, exclude non-match)
A4	`p+` strict path + join (bug 2 + 3 combined)
A5	three-ledger chained join (re-encode composes)
A6	single-ledger GRAPH path unchanged (regression)
A7	`FILTER EXISTS` across a GRAPH boundary (covered by the root-cause fix)
A8	unbounded closure with a divergent-code predicate

Verification: 14/14 in the new suite; fluree-db-query 1170+; grp_query 299, grp_query_sparql 258, grp_misc 237; fmt + clippy clean.

Follow-up (separate PR)

Failure 1 — lift the guard with a cross-snapshot BFS so a property path runs directly over a multi-graph union (p+ across ledgers), honoring the "query your datasets as if they were one materialized dataset" model.

🤖 Generated with Claude Code

…1405) Two composed failures prevented property paths + cross-graph joins from working over an indexed, multi-ledger dataset: Bug 2 — a GRAPH-scoped query over an INDEXED multi-ledger dataset hit an internal invariant ("EncodedSid/EncodedPid reached stamp_provenance"). Root cause: `DatasetOperator` computed `multi_ledger` from the *active* graphs only, so a single default graph (alongside named graphs from other ledgers) was treated as single-ledger — the binary store stayed enabled and its scans emitted late `Binding::EncodedSid`, which then seeded a GRAPH block / crossed a boundary and reached provenance stamping, which cannot decode them. Fix: also treat the scan as multi-ledger when the whole dataset spans ledgers, forcing full `Binding::Sid` materialization (which stamps to `IriMatch`). Bug 3 — a cross-`GRAPH` join or path over DIVERGENT namespace codes silently returned []. The materialization fix above resolves the join/seed cases (bound keys now cross as `IriMatch` and re-encode). For property paths specifically, the operator also matched pattern *constants* (endpoints) and *predicates* against the primary/lowering snapshot's codes rather than the per-GRAPH graph's, so a divergent-namespace endpoint/predicate found nothing. Fix: re-encode path predicate and constant-endpoint SIDs into the active graph's dict (`reencode_pred` + the `Ref::Sid` arm of `resolve_sid`), reusing the same decode-primary/encode-target idiom as `binary_scan`'s `reencode_sid`. Single-ledger and single-graph queries are unchanged (re-encode round-trips to the same SID; materialization only kicks in for multi-ledger datasets). The union-path guard (failure 1 in #1405) is intentionally left in place. Adds tests/it_multi_graph_property_path.rs: the q1-q3d repro plus a usage matrix (A1-A8) covering object-position joins, multi-value completeness, precision, strict paths, three-ledger chains, single-ledger regression, FILTER EXISTS, and unbounded closures — all over indexed, divergent-namespace ledgers. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Review feedback: - Trim the engine-side comments to the load-bearing invariant and drop #1405 references from implementation code (db engineers know the mechanics; issue refs belong in the regression tests, which keep them). - Rewrite the test data from generic `ex:thing`/`ex:category`/`narrow`/`mid`/ `top` into a concrete library/subjects domain with distinct per-ledger prefixes, matching the house style of the sibling cross-ledger tests: `lib:book1` (library.example) references a `subj:` subject taxonomy (jazz ⊂ music ⊂ arts via `subj:broader`, subject.example). The own-prefix-first / shared-via-ref seeding that produces namespace-code divergence is preserved, so the aligned (q3d warm-up) vs divergent contrast still holds. No behavior change; 14/14 in the suite, fmt + clippy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Jackamus29 marked this pull request as draft July 3, 2026 01:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(query): cross-ledger GRAPH queries over indexed data (#1405 bugs 2+3)#1425

fix(query): cross-ledger GRAPH queries over indexed data (#1405 bugs 2+3)#1425
Jackamus29 wants to merge 2 commits into
mainfrom
fix/1405-cross-ledger-graph

Jackamus29 commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Jackamus29 commented Jul 3, 2026

Failures fixed

Scope / safety

Tests — tests/it_multi_graph_property_path.rs

Follow-up (separate PR)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Tests — `tests/it_multi_graph_property_path.rs`