feat: stable _:fdb blank-node ids addressable in queries and updates #1432

Merged: bplatz merged 7 commits into main from feature/stable-fdb-blank-nodes on Jul 5, 2026

bplatz commented Jul 4, 2026

What

Labels with the reserved _:fdb- prefix (the skolem ids Fluree mints for every blank node and returns from queries) now denote the existing stored node in queries and transactions, instead of silently minting a fresh node. This makes blank-node-rooted structures — OWL restrictions, address objects, RDF lists — editable in place: query the node's id, then use it as an ordinary @id to insert/delete individual triples, without retracting and re-asserting the whole subtree.

The read side already resolved _:fdb-... to the stored Sid (canonical_split routes any _: string to the blank-node namespace); this closes the write side and SPARQL:

  • JSON-LD transactions: @id: "_:fdb-..." in insert/update templates and VALUES addresses the stored node
  • SPARQL SELECT: _:fdb-... in WHERE patterns lowers to a constant; ordinary labels stay non-distinguished variables per spec
  • SPARQL UPDATE: INSERT/DELETE templates, DELETE DATA/INSERT DATA, and DELETE WHERE all resolve stable ids (a bnode in a DELETE template was previously skolemized fresh, i.e. a silent no-op)
  • Turtle/TriG: FlakeSink::term_blank and the upsert named-graph path pass stable labels through
  • Edge annotations: delete-by-@id accepts stable ids (they are now addressable); client-authored blank labels are still rejected
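The SPARQL SELECT rule in the list above can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `WhereTerm`, `lower_where_blank`, and the `?_bnode_` variable naming are hypothetical names chosen for illustration; only the reserved `fdb-` prefix comes from the description.

```rust
/// Reserved prefix (after the `_:` sigil) for ids minted by the database.
const STABLE_PREFIX: &str = "fdb-";

/// What a blank-node label in a WHERE pattern lowers to (hypothetical type).
#[derive(Debug, PartialEq)]
enum WhereTerm {
    /// A stable id: matches exactly the stored node carrying this label.
    Const(String),
    /// An ordinary label: an existential (non-distinguished) variable.
    Var(String),
}

/// Lower one term at parse time. Returns `None` when the term is not a
/// `_:`-prefixed blank-node label, so other terms never reach the check.
fn lower_where_blank(term: &str) -> Option<WhereTerm> {
    let label = term.strip_prefix("_:")?;
    if label.starts_with(STABLE_PREFIX) {
        Some(WhereTerm::Const(label.to_string()))
    } else {
        // Hypothetical variable naming scheme, for illustration only.
        Some(WhereTerm::Var(format!("?_bnode_{label}")))
    }
}
```

Because the decision is made once per term, a pattern such as `_:fdb-abc ?p ?o` behaves like a lookup on a known subject, while `_:b0 ?p ?o` still matches any subject.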

Ordinary labels (_:b0) keep standard RDF semantics: fresh node per transaction on write, existential variable in SPARQL WHERE. Clients cannot collide with the reserved space because skolemization wraps client labels with a txn id before prefixing fdb-. All checks run at parse/lowering time inside existing _:-prefixed branches — zero cost for transactions not using stable ids.
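On the write side, the no-collision argument can be sketched as below. The function name `resolve_write_blank`, the `BlankTarget` type, and the exact `fdb-{txn_id}-{label}` layout are assumptions made for illustration; the description only states that client labels are wrapped with a txn id before the `fdb-` prefix is applied.

```rust
/// Outcome of resolving a blank-node label in a transaction (hypothetical type).
#[derive(Debug, PartialEq)]
enum BlankTarget {
    /// The label is already in the reserved space: address the stored node.
    Existing(String),
    /// An ordinary client label: mint a node that is fresh for this transaction.
    Fresh(String),
}

/// `label` is the part after `_:`; `txn_id` stands in for whatever
/// per-transaction identifier the real skolemizer uses (assumed here).
fn resolve_write_blank(label: &str, txn_id: &str) -> BlankTarget {
    if label.starts_with("fdb-") {
        BlankTarget::Existing(label.to_string())
    } else {
        // Wrapping with the txn id keeps `b0` from two transactions apart.
        BlankTarget::Fresh(format!("fdb-{txn_id}-{label}"))
    }
}
```

Under these assumptions, `b0` written in two transactions yields two different nodes, while an id returned by a query round-trips unchanged and addresses the node it came from.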

Spec framing: this is RDF 1.1 Concepts §3.5 skolemization kept in blank-node syntax, the same extension that Virtuoso's nodeID:// refs and Jena's <_:label> syntax provide. SPARQL 1.1 Update §3.1.3's ban on blank nodes in DELETE templates exists because such labels could only denote fresh nodes; the reserved prefix removes that premise.

Import fixes (found while auditing blank-node scoping)

  • TriG named-graph blank nodes scoped to their document: named-graph blanks were minted from the bare label (fdb-b0) — colliding globally across every document imported into a ledger while diverging from the same document's default graph (fdb-{ledger}-{t}-b0). They now share the default graph's skolem key, so a label unifies across the default graph and GRAPH blocks of one document and stays distinct across documents.
  • Serial-import namespace codes allocated through the shared allocator: the serial TriG/N-Quads commit path allocated namespace codes only in state.ns_registry, invisible to the spool's mid-parse code→prefix lookups. The silent empty-prefix fallback permanently wrote suffix-only strings (name instead of http://schema.org/name) into the index's predicate dict — bound-predicate patterns matched nothing on freshly imported TriG ledgers. Serial commits now mirror parse_chunk's WorkerCache-over-shared-allocator discipline, and the fallback debug_asserts (warns in release) so any recurrence fails tests loudly instead of corrupting silently.

Testing

  • it_stable_blank_nodes.rs: 7 tests covering in-place OWL-restriction edits via JSON-LD and SPARQL, node identity stable across edits, fresh-mint semantics preserved for ordinary labels, and constant-vs-variable WHERE behavior
  • it_import.rs: import_trig_blank_label_document_scoped (fails pre-fix) and import_trig_bound_predicate_queryable (fails pre-fix); a should_panic unit test pins the spool prefix-miss detection
  • W3C SPARQL eval: 283/327 before and after (zero regression, verified by stashing the changes and re-running)
  • Full runs green: fluree-db-transact lib (277), grp_import (82), grp_transact (168), grp_query_sparql (258), edge annotations (78); clippy all-features clean

Docs

  • New "Editing Blank-Node Structures (Stable _:fdb- Ids)" section in docs/transactions/update-where-delete-insert.md, with pointers from insert.md and concepts/iri-and-context.md

bplatz added 5 commits July 4, 2026 11:56
…d updates

Fluree skolemizes every blank node into the reserved _:fdb-... label
space at insert time, and queries already resolved those labels to the
stored Sid (canonical_split routes any _: string to the blank-node
namespace). Writes, however, re-skolemized them: round-tripping an
_:fdb-... id from a query silently minted a fresh node, and a SPARQL
DELETE naming one retracted nothing. Blank-node-rooted structures (OWL
restrictions, address objects) could only be edited by retracting and
re-asserting the whole subtree.

Labels with the reserved fdb- prefix now denote the existing stored
node everywhere (RDF 1.1 §3.5 skolemization kept in blank-node syntax):

- JSON-LD templates: parse_expanded_id_with_ctx and VALUES @id resolve
  stable ids to constant Sids
- SPARQL SELECT WHERE: stable labels lower to constants; other labels
  stay non-distinguished variables per spec
- SPARQL UPDATE: INSERT/DELETE templates, DELETE DATA/INSERT DATA, and
  DELETE WHERE (both pattern and template sides) resolve stable ids
- Turtle: FlakeSink::term_blank passes stable labels through
- Edge annotations: delete-by-@id now accepts stable ids (they are
  addressable), still rejects client-authored blank labels

Ordinary labels (_:b0) keep standard semantics: fresh node per
transaction on write, existential variable in SPARQL WHERE. Clients
cannot collide with the reserved space because transaction
skolemization wraps client labels with a txn id before prefixing fdb-.

All checks happen at parse/lowering time (once per term, inside the
existing _:-prefixed branches), never in per-solution template
instantiation, so transactions that don't use stable ids pay nothing.

W3C SPARQL eval: 283/327 before and after (no regression).

Named-graph blanks in bulk TriG/N-Quads import were skolemized from the
bare label (fdb-b0) while the same document's default graph used the
ImportSink key (fdb-{ledger}-{t}-{label}). Two consequences, both wrong
under TriG's document-wide label scoping:

- _:b0 in a GRAPH block and _:b0 in the default graph of the SAME
  document became two different nodes
- _:b0 in GRAPH blocks of DIFFERENT documents (files, commits, or later
  imports into the same ledger) collapsed into one globally shared node

expand_term/expand_object now take the commit's txn_id and mint
{txn_id}-{label}, matching ImportSink::skolemize, so a label unifies
across the default graph and GRAPH blocks of one document and stays
distinct across documents.
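The difference between the two keying schemes can be shown with a small sketch. `bare_key` and `scoped_key` are hypothetical helpers, not functions from the codebase; the key shapes follow the `fdb-b0` and `fdb-{ledger}-{t}-{label}` forms quoted above, and `t` is assumed to be an integer here.

```rust
/// Pre-fix named-graph key: derived from the label alone, so every
/// document imported into the ledger shares it.
fn bare_key(label: &str) -> String {
    format!("fdb-{label}")
}

/// Document-scoped key: the same label maps to the same node within one
/// commit and to different nodes across commits.
fn scoped_key(ledger: &str, t: u64, label: &str) -> String {
    format!("fdb-{ledger}-{t}-{label}")
}
```

With the scoped key, `_:b0` in the default graph and in a GRAPH block of the same document agree, and `_:b0` in a later import gets a different node. With the bare key, the named-graph node neither matches its own document's default graph nor differs from other documents' nodes.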

Also extends stable _:fdb- id support to the upsert/insert-turtle
named-graph path (convert_named_graphs_to_templates), which was the one
write path still skolemizing stable ids fresh; ordinary labels keep
fresh-mint semantics there.

The regression test uses variable-predicate scans because bound-
predicate patterns return empty on freshly bulk-imported ledgers — a
pre-existing quirk unrelated to blank nodes, reproduced with this fix
fully reverted.

…d allocator

The serial import commit functions (import_trig_commit, and import_commit
via the no-GRAPH-blocks TriG fallback) allocated namespace codes only in
state.ns_registry. SpoolContext resolves code->prefix through the SHARED
allocator at record-push time (mid-parse, right after an @prefix
directive allocates the code) and silently falls back to an empty
prefix on a miss. The index's predicate dictionary then permanently
stored bare suffixes ("name" instead of "http://schema.org/name"), so
bound-predicate patterns matched nothing on freshly imported ledgers and
results rendered suffix-only predicates. The orchestrator's
sync_from_registry after each commit was too late by construction.

Serial commit fns now mirror parse_chunk's allocator discipline:
ImportSink::new_cached over a WorkerCache backed by the spool's shared
allocator, named-graph expansion helpers take the WorkerCache, and each
commit's new codes merge back into the registry via lookup_codes +
adopt_delta_for_persistence before take_delta() builds the commit's
namespace_delta. Without a spool, a throwaway allocator seeded from the
registry preserves identical code assignment.

Pure-Turtle files were never affected (they go through parse_chunk).

SpoolContext's code->prefix resolution silently fell back to an empty
prefix when the shared allocator didn't know a namespace code — the
failure mode behind the suffix-only predicate dict entries fixed in the
previous commit, which corrupted the index without any signal.

A miss for a real (non-OVERFLOW) code now debug_asserts, so any test
exercising a misallocated path fails immediately, and warns in release
instead of corrupting silently. OVERFLOW Sids keep the empty prefix
(they legitimately carry the full IRI in the name).
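The detection described here might look roughly like the following. This is a sketch under stated assumptions: `prefix_for`, the `HashMap` standing in for the shared allocator, and the `OVERFLOW` sentinel value are all hypothetical, and the real SpoolContext lookup is not shown in this PR description.

```rust
use std::collections::HashMap;

/// Hypothetical sentinel for Sids that carry the full IRI in the name.
const OVERFLOW: u16 = u16::MAX;

/// Resolve a namespace code to its prefix. `codes` stands in for the
/// shared allocator's code->prefix table (assumed shape).
fn prefix_for(codes: &HashMap<u16, String>, code: u16) -> &str {
    match codes.get(&code) {
        Some(prefix) => prefix.as_str(),
        None => {
            if code != OVERFLOW {
                // Debug and test builds: fail immediately on a misallocated code.
                debug_assert!(false, "namespace code {} missing from shared allocator", code);
                // Release builds: surface the miss instead of staying silent.
                eprintln!("warning: namespace code {} missing from shared allocator", code);
            }
            // OVERFLOW legitimately has no prefix; the name holds the full IRI.
            ""
        }
    }
}
```

The design point is that the empty-prefix fallback remains only for the one case where it is correct, and every other miss becomes visible.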

Adds a should_panic unit test pinning the detection, and rewires the
three spool unit tests that hand-built the old registry-only allocation
pattern onto the production WorkerCache-over-shared-allocator wiring.
bplatz requested review from aaj3f and zonotope July 4, 2026 19:05

aaj3f left a comment

Makes sense and looks good! 👍

Base automatically changed from feature/warm-on-write-reindex-cache to main July 5, 2026 15:42
bplatz added 2 commits July 5, 2026 11:43
The trig2 literal has no inner double quotes, so `r#"…"#` tripped
clippy::needless_raw_string_hashes under -D warnings on CI. Use a plain `r"…"`.
trig1 keeps its hashes — it contains a quoted literal.
bplatz merged commit 15e2a4b into main Jul 5, 2026
6 checks passed
bplatz deleted the feature/stable-fdb-blank-nodes branch July 5, 2026 16:08