feat: sh:class value-sets from the shapes graph, same-ledger and cross-ledger #1420
sh:class value membership was looked up only in the focus node's own data
graph, so an enumerated value-set (e.g. a controlled list of US states)
stored in a separate shapes/vocabulary graph produced false violations.
Membership now resolves against the union of the focus node's data graph
and the f:shapesSource vocabulary graph(s): rdf:type is read across
{data graph} + {membership graphs} and the subClassOf walk across
{g_id=0} + {membership graphs} via a generalized rescope_to_graph. A
per-transaction memo on ShaclEngine (keyed by value, class, focus graph)
collapses repeated checks within one transaction; cache hits skip the
range scan and its fuel charge.
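A minimal sketch of the union lookup described above, under stated assumptions: `Store`, `conforms`, and the string-based terms are hypothetical stand-ins, not the engine's types, and `rescope_to_graph` is not modeled. It only shows rdf:type being read across {data graph} + {membership graphs} and the subClassOf walk running across {graph 0} + {membership graphs}.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins: graph ids are integers, terms are plain strings.
type GraphId = u32;
type Term = &'static str;

/// Toy store holding per-graph rdf:type and rdfs:subClassOf edges.
#[derive(Default)]
struct Store {
    types: HashMap<GraphId, Vec<(Term, Term)>>,        // (instance, class)
    sub_class_of: HashMap<GraphId, Vec<(Term, Term)>>, // (subclass, superclass)
}

impl Store {
    /// True if `value` is an instance of `class` under the union semantics.
    fn conforms(&self, value: Term, class: Term, data_g: GraphId, membership: &[GraphId]) -> bool {
        // rdf:type is read from the data graph plus the membership graphs.
        let type_graphs: Vec<GraphId> =
            std::iter::once(data_g).chain(membership.iter().copied()).collect();
        // subClassOf is walked across graph 0 plus the membership graphs.
        let schema_graphs: Vec<GraphId> =
            std::iter::once(0).chain(membership.iter().copied()).collect();

        // Start from the value's direct types.
        let mut frontier: Vec<Term> = type_graphs
            .iter()
            .flat_map(|g| self.types.get(g).into_iter().flatten())
            .filter(|(inst, _)| *inst == value)
            .map(|(_, c)| *c)
            .collect();

        // Walk upward through superclasses; `seen` guards against cycles.
        let mut seen: HashSet<Term> = HashSet::new();
        while let Some(c) = frontier.pop() {
            if c == class {
                return true;
            }
            if !seen.insert(c) {
                continue;
            }
            for g in &schema_graphs {
                for (sub, sup) in self.sub_class_of.get(g).into_iter().flatten() {
                    if *sub == c {
                        frontier.push(*sup);
                    }
                }
            }
        }
        false
    }
}
```

With ex:illinois typed only in graph 2, `conforms("ex:illinois", "ex:USState", 1, &[])` is false (the old false violation), while passing `&[2]` as the membership graphs makes it true.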
The membership context is threaded as an Option through
validate_shape -> validate_property_shape -> validate_class_constraint;
referenced/nested-shape paths pass None and keep the prior behavior.
validate_view_with_shacl gains a membership_g_ids parameter fed from the
resolved f:shapesSource graph ids. Same-ledger only; cross-ledger
value-sets remain deferred.
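The threading can be pictured roughly as below. This is a sketch only: the function names mirror the call chain named above, but the signatures, the `Membership` struct, and the returned graph list are hypothetical simplifications chosen to show how `None` preserves the prior data-graph-only behavior.

```rust
/// Hypothetical membership context: extra graph ids consulted for sh:class.
struct Membership<'a> {
    g_ids: &'a [u32],
}

/// Returns the graphs that would be consulted for rdf:type (illustrative only).
fn validate_class_constraint(ctx: Option<&Membership<'_>>, data_g: u32) -> Vec<u32> {
    // The data graph is always consulted; membership graphs only when present.
    let mut graphs = vec![data_g];
    if let Some(m) = ctx {
        graphs.extend_from_slice(m.g_ids);
    }
    graphs
}

fn validate_property_shape(ctx: Option<&Membership<'_>>, data_g: u32) -> Vec<u32> {
    validate_class_constraint(ctx, data_g)
}

fn validate_shape(ctx: Option<&Membership<'_>>, data_g: u32) -> Vec<u32> {
    validate_property_shape(ctx, data_g)
}

/// Referenced/nested shapes pass None, so they see only the data graph.
fn validate_nested_shape(data_g: u32) -> Vec<u32> {
    validate_shape(None, data_g)
}
```

Using `Option` rather than an empty slice keeps "no membership context" distinct from "membership context with zero graphs", which is one plausible reason for the design; the source does not state the rationale.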
Adds shacl_class_value_set_in_shapes_graph (cross-graph positive, per-txn
cache batch, negative) and updates the SHACL implementation guide and
cookbook.
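For reviewers who want the memo's effect in miniature, here is a hedged sketch. `Engine`, its fields, and the one-unit fuel charge are assumptions for illustration; the real ShaclEngine and fuel accounting are not shown in this PR text.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for encoded ids.
type Sid = u64;
type GraphId = u32;

/// Toy engine with a per-transaction memo keyed by (value, class, focus graph).
struct Engine {
    memo: HashMap<(Sid, Sid, GraphId), bool>,
    scans: usize, // number of uncached membership scans performed
    fuel: u64,    // remaining fuel; one unit per scan in this sketch
}

impl Engine {
    fn new(fuel: u64) -> Self {
        Engine { memo: HashMap::new(), scans: 0, fuel }
    }

    /// Returns the cached verdict if present; otherwise runs `scan`,
    /// charges fuel, and records the verdict.
    fn conforms(
        &mut self,
        value: Sid,
        class: Sid,
        focus_g: GraphId,
        scan: impl Fn(Sid, Sid) -> bool,
    ) -> bool {
        let key = (value, class, focus_g);
        if let Some(&hit) = self.memo.get(&key) {
            // Cache hit: no range scan and no fuel charge.
            return hit;
        }
        self.scans += 1;
        self.fuel = self.fuel.saturating_sub(1);
        let verdict = scan(value, class);
        self.memo.insert(key, verdict);
        verdict
    }
}
```

Calling `conforms` 500 times with the same key on a fresh engine leaves `scans` at 1 and deducts a single unit of fuel. Dropping the engine at the end of the transaction discards the memo, which matches the per-transaction lifetime described above.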
Extends sh:class value-set membership to the cross-ledger case: when
f:shapesSource points at a model ledger M (via f:ledger), the controlled
vocabulary lives in M alongside the shapes. M's instance typings
(ex:illinois a ex:USState) are ABox and are not carried in the shapes
wire, so membership is resolved by querying M live.
stage_with_config_shacl opens M at the resolved t (latest committed at
transaction time) and threads a CrossLedgerMembership { model_db,
data_ns_map } down to validate_class_constraint. On a local miss,
value_conforms_cross_ledger decodes the value/class Sids to IRIs via D's
staged namespace map — the base snapshot alone can't decode namespaces
staged this transaction — then re-encodes against M (whose split mode may
differ) and does the rdf:type + subClassOf lookup in M's term space.
Well-known predicate codes are global, so only user IRIs are translated.
The per-transaction memo covers cross-ledger verdicts too.
Same-ledger behavior is unchanged; referenced/nested-shape sh:class keeps
the legacy local lookup. Adds an end-to-end two-ledger test and updates
the SHACL implementation guide and cookbook.
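The decode and re-encode step can be sketched as follows. Everything here is a simplified assumption: `Sid` as a (namespace code, local name) pair, `NsMap`, and longest-prefix encoding are illustrative models, not Fluree's actual encoding or split modes.

```rust
use std::collections::HashMap;

/// Hypothetical encoded id: namespace code plus local name.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Sid {
    ns: u16,
    name: String,
}

/// Hypothetical namespace map: code -> IRI prefix.
#[derive(Default)]
struct NsMap {
    by_code: HashMap<u16, String>,
}

impl NsMap {
    /// Decode a Sid to a full IRI, or None if the namespace code is unknown.
    fn decode(&self, sid: &Sid) -> Option<String> {
        self.by_code
            .get(&sid.ns)
            .map(|prefix| format!("{}{}", prefix, sid.name))
    }

    /// Encode an IRI using the longest registered prefix, if any matches.
    fn encode(&self, iri: &str) -> Option<Sid> {
        self.by_code
            .iter()
            .filter(|(_, p)| iri.starts_with(p.as_str()))
            .max_by_key(|(_, p)| p.len())
            .map(|(code, p)| Sid { ns: *code, name: iri[p.len()..].to_string() })
    }
}

/// Translate a Sid from D's term space into M's: decode with D's map,
/// then re-encode against M, which may split the IRI differently.
fn translate(sid: &Sid, d_ns: &NsMap, m_ns: &NsMap) -> Option<Sid> {
    let iri = d_ns.decode(sid)?;
    m_ns.encode(&iri)
}
```

If D's base map lacks a namespace code that the in-flight transaction staged, `translate` with the base map returns `None`, while the staged map decodes it. That mirrors why the staged map, not the base snapshot, is threaded down.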
aaj3f
left a comment
This is a nice addition. It makes sense and opens up capabilities I'm excited about in productizing Fluree AI.
let cross_ledger_membership = match (
    &cross_ledger_model_db,
    &cross_ledger_data_ns_map,
    cross_ledger_shapes.as_deref(),
) {
    (Some(m_db), Some(ns_map), Some(resolved)) => {
        crate::cross_ledger::resolve_selector_g_id(&m_db.snapshot, &resolved.graph_iri)
            .map_err(|e| {
                fluree_db_transact::TransactError::Parse(format!(
                    "cross-ledger value-set graph resolution failed: {e}"
                ))
            })?
            .map(|g_id| fluree_db_shacl::CrossLedgerMembership {
                model_db: fluree_db_core::GraphDbRef::new(
                    &m_db.snapshot,
                    g_id,
                    m_db.overlay.as_ref(),
                    m_db.t,
                ),
                data_ns_map: ns_map,
            })
    }
    _ => None,
};
When resolve_selector_g_id returns Ok(None) (the shapes-source graph IRI is absent from M's registry at the resolved t), .map(..) yields None, silently disabling cross-ledger membership — every cross-ledger value then falls through to "not a member" and the write is rejected with false ShaclViolations. Contrast the Err case one line up, which is propagated with context, and the ReservedGraphSelected error inside the resolver. In practice shape resolution already guarantees the graph exists (shapes were loaded from it), so this branch should be unreachable — which is exactly why a silent None here is a latent trap. Prefer an explicit error (or at least a debug_assert!) so a future divergence between shape-graph and value-set-graph resolution surfaces loudly rather than as spurious violations:
    .map_err(|e| { /* ... */ })?
    .ok_or_else(|| fluree_db_transact::TransactError::Parse(format!(
        "cross-ledger value-set graph {} not present in model ledger {} at t={} \
         (shapes resolved but vocabulary graph missing)",
        resolved.graph_iri, resolved.model_ledger_id, resolved.resolved_t
    )))
    .map(|g_id| /* CrossLedgerMembership { .. } */ )

(Adjust the surrounding match so this arm yields Some(..).) If the silent fallback is intentional, add a one-line comment saying so and why false violations are acceptable there.
…silent drop
When resolving the model ledger's value-set graph for cross-ledger sh:class membership, a `resolve_selector_g_id` result of `Ok(None)` previously mapped to `None` membership, silently. With no membership handle, every M-only value falls through to "not a member" (value_conforms_to_class → Ok(false)) and the write is rejected with spurious sh:class violations. The vocabulary lives in the same M graph the shapes were compiled from, so this miss should be unreachable. Surface it as an explicit error (in all build profiles, unlike a debug_assert) so any future divergence between shape-graph and value-set-graph resolution fails loudly rather than as confusing false rejects. Happy path unchanged.
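The shape of that fix, turning a `Result<Option<T>, E>` miss into a hard error, can be sketched in isolation. `SketchError`, `resolve_g_id`, and the slice-based registry are hypothetical stand-ins; only the `map_err(..)?` followed by `ok_or_else(..)` chain reflects the change described above.

```rust
/// Local stand-in for the transaction error type (hypothetical).
#[derive(Debug, PartialEq)]
enum SketchError {
    Parse(String),
}

/// Hypothetical resolver: Err on a lookup failure, Ok(None) when the graph
/// IRI is simply absent from the registry.
fn resolve_g_id(registry: &[(&str, u32)], iri: &str) -> Result<Option<u32>, String> {
    if iri.is_empty() {
        return Err("empty graph IRI".to_string());
    }
    Ok(registry.iter().find(|(i, _)| *i == iri).map(|(_, g)| *g))
}

/// Both the Err case and the Ok(None) case surface as explicit errors, so a
/// missing vocabulary graph can never silently disable membership.
fn membership_g_id(registry: &[(&str, u32)], iri: &str) -> Result<u32, SketchError> {
    resolve_g_id(registry, iri)
        .map_err(|e| {
            SketchError::Parse(format!(
                "cross-ledger value-set graph resolution failed: {e}"
            ))
        })?
        .ok_or_else(|| {
            SketchError::Parse(format!(
                "cross-ledger value-set graph {iri} not present in model ledger"
            ))
        })
}
```

Unlike a `debug_assert!`, this path is active in release builds as well, which is the property the commit message calls out.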
Motivation
sh:class value membership was looked up only in the focus node's own data graph. A controlled value-set modeled as class membership, e.g. US states as ex:illinois rdf:type ex:USState, stored in a shapes/vocabulary graph (or in a governance "model" ledger) produced false violations, because the value's rdf:type triples were never visible to the constraint check. This blocked the pattern where a shapes graph carries both the SHACL rules and the enumerations they validate against.
What this does
Commit 1: same-ledger (f:shapesSource graph):
- sh:class membership now resolves against the union of the focus node's data graph and the f:shapesSource vocabulary graph(s): rdf:type is read across {data graph} ∪ {membership graphs}, and the rdfs:subClassOf walk spans {g_id=0} ∪ {membership graphs} via a generalized rescope_to_graph.
- A per-transaction memo on ShaclEngine (keyed by (value, class, focus graph)) collapses repeated checks within one transaction: inserting 500 rows that all reference ex:illinois costs one membership scan, not 500; cache hits also skip the fuel charge.
- The membership context is threaded as an Option through validate_shape → validate_property_shape → validate_class_constraint; referenced/nested-shape paths pass None and keep prior behavior.
- validate_view_with_shacl gains a membership_g_ids parameter fed from the resolved f:shapesSource graph ids.
Commit 2: cross-ledger (model ledger):
- When f:shapesSource points at a model ledger M via f:ledger, the controlled vocabulary lives in M alongside the shapes. M's instance typings are ABox and are not carried in the shapes wire, so membership is resolved by querying M live, pinned at the resolved t (latest committed M at transaction time). This is the same t the shape resolution used, so shapes and vocabulary can't disagree about M's version.
- stage_with_config_shacl opens M at that t and threads CrossLedgerMembership { model_db, data_ns_map } down to the constraint check. On a local miss, the value/class Sids are decoded to IRIs via D's staged namespace map (the base snapshot can't decode namespaces introduced by the in-flight transaction), then re-encoded against M, which is robust to differing namespace split modes, and the rdf:type + subClassOf lookup runs in M's term space. Well-known predicate codes are global, so only user IRIs are translated.
Tests
- it_config_graph.rs::shacl_class_value_set_in_shapes_graph (same-ledger): cross-graph positive (value typed only in the shapes graph accepted), batch insert exercising the per-txn cache, and a negative (non-member rejected).
- it_shapes_cross_ledger.rs::cross_ledger_sh_class_value_set_resolved_against_model_ledger (two-ledger end-to-end): ex:illinois (typed only in M) satisfies sh:class ex:USState on a write to D; ex:atlantis is rejected with a ShaclViolation.
The cross-ledger decode path was verified against a deliberate regression: using the base snapshot instead of the staged namespace map reproduces the false violation the staged-map threading exists to prevent.
Docs
- docs/guides/cookbook-shacl.md: new "Shared value-sets with sh:class" section (TriG example, union semantics, per-transaction caching, model-ledger pattern).
- docs/contributing/shacl-implementation.md: compile-graph vs validate-graph distinction, membership union, memo lifetime, and the cross-ledger translation mechanism.
Related
While wiring this, the same config-graph audit surfaced that f:policySource was enforced on reads but ignored on writes. That was filed as #1416 and fixed in #1419, which is stacked on this branch.