feat: stable _:fdb blank-node ids addressable in queries and updates #1432
Merged
…d updates

Fluree skolemizes every blank node into the reserved _:fdb-... label space at insert time, and queries already resolved those labels to the stored Sid (canonical_split routes any _: string to the blank-node namespace). Writes, however, re-skolemized them: round-tripping an _:fdb-... id from a query silently minted a fresh node, and a SPARQL DELETE naming one retracted nothing. Blank-node-rooted structures (OWL restrictions, address objects) could only be edited by retracting and re-asserting the whole subtree.

Labels with the reserved fdb- prefix now denote the existing stored node everywhere (RDF 1.1 §3.5 skolemization kept in blank-node syntax):
- JSON-LD templates: parse_expanded_id_with_ctx and VALUES @id resolve stable ids to constant Sids
- SPARQL SELECT WHERE: stable labels lower to constants; other labels stay non-distinguished variables per spec
- SPARQL UPDATE: INSERT/DELETE templates, DELETE DATA/INSERT DATA, and DELETE WHERE (both pattern and template sides) resolve stable ids
- Turtle: FlakeSink::term_blank passes stable labels through
- Edge annotations: delete-by-@id now accepts stable ids (they are addressable), still rejects client-authored blank labels

Ordinary labels (_:b0) keep standard semantics: fresh node per transaction on write, existential variable in SPARQL WHERE. Clients cannot collide with the reserved space because transaction skolemization wraps client labels with a txn id before prefixing fdb-.

All checks happen at parse/lowering time (once per term, inside the existing _:-prefixed branches), never in per-solution template instantiation, so transactions that don't use stable ids pay nothing.

W3C SPARQL eval: 283/327 before and after (no regression).
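The write-side rule described above can be sketched roughly as follows. This is an illustration, not the PR's code: `BlankRef` and `resolve_blank` are hypothetical names, and the `fdb-{txn_id}-{label}` layout for freshly minted labels is an assumption inferred from the commit messages.

```rust
/// Reserved label prefix for server-minted skolem ids.
const STABLE_PREFIX: &str = "fdb-";

/// How a write-side blank-node term is treated (hypothetical type).
#[derive(Debug, PartialEq)]
enum BlankRef {
    /// Names an existing stored node; the label is used unchanged.
    Stable(String),
    /// Client-authored label; a fresh node is minted for this transaction.
    Fresh(String),
}

/// Classify a term once, at parse/lowering time.
/// Returns `None` when the term is not a `_:`-prefixed blank node.
fn resolve_blank(term: &str, txn_id: &str) -> Option<BlankRef> {
    let label = term.strip_prefix("_:")?;
    if label.starts_with(STABLE_PREFIX) {
        Some(BlankRef::Stable(label.to_string()))
    } else {
        // Assumed layout: wrap the client label with the txn id, then prefix `fdb-`.
        Some(BlankRef::Fresh(format!("{}{}-{}", STABLE_PREFIX, txn_id, label)))
    }
}
```

Under these assumptions, `_:fdb-abc` keeps its identity while `_:b0` in transaction `t7` becomes `fdb-t7-b0`. Because the check sits behind the `_:` prefix test, terms that are not blank nodes return early.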
Named-graph blanks in bulk TriG/N-Quads import were skolemized from the
bare label (fdb-b0) while the same document's default graph used the
ImportSink key (fdb-{ledger}-{t}-{label}). Two consequences, both wrong
under TriG's document-wide label scoping:
- _:b0 in a GRAPH block and _:b0 in the default graph of the SAME
document became two different nodes
- _:b0 in GRAPH blocks of DIFFERENT documents (files, commits, or later
imports into the same ledger) collapsed into one globally shared node
expand_term/expand_object now take the commit's txn_id and mint
{txn_id}-{label}, matching ImportSink::skolemize, so a label unifies
across the default graph and GRAPH blocks of one document and stays
distinct across documents.
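A small sketch of the scoping difference, assuming the skolem key has the form fdb-{txn_id}-{label} with a txn_id shaped like {ledger}-{t}. The function names below are hypothetical and do not mirror the real expand_term/expand_object signatures.

```rust
/// Pre-fix behavior for labels inside GRAPH blocks: skolemized from the bare label,
/// so every document's `_:b0` maps to the same node.
fn skolemize_bare(label: &str) -> String {
    format!("fdb-{}", label)
}

/// Post-fix behavior: every label in a document is keyed by that commit's txn id,
/// so it unifies within the document and stays distinct across documents.
fn skolemize_scoped(txn_id: &str, label: &str) -> String {
    format!("fdb-{}-{}", txn_id, label)
}
```

With this keying, the default graph and the GRAPH blocks of one document produce the same id for _:b0 as long as they pass the same txn_id, while a second commit with a different txn_id produces a different one.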
Also extends stable _:fdb- id support to the upsert/insert-turtle
named-graph path (convert_named_graphs_to_templates), which was the one
write path still skolemizing stable ids fresh; ordinary labels keep
fresh-mint semantics there.
The regression test uses variable-predicate scans because bound-
predicate patterns return empty on freshly bulk-imported ledgers — a
pre-existing quirk unrelated to blank nodes, reproduced with this fix
fully reverted.
…d allocator

The serial import commit functions (import_trig_commit, and import_commit via the no-GRAPH-blocks TriG fallback) allocated namespace codes only in state.ns_registry. SpoolContext resolves code->prefix through the SHARED allocator at record-push time (mid-parse, right after an @prefix directive allocates the code) and silently falls back to an empty prefix on a miss. The index's predicate dictionary then permanently stored bare suffixes ("name" instead of "http://schema.org/name"), so bound-predicate patterns matched nothing on freshly imported ledgers and results rendered suffix-only predicates. The orchestrator's sync_from_registry after each commit was too late by construction.

Serial commit fns now mirror parse_chunk's allocator discipline: ImportSink::new_cached over a WorkerCache backed by the spool's shared allocator, named-graph expansion helpers take the WorkerCache, and each commit's new codes merge back into the registry via lookup_codes + adopt_delta_for_persistence before take_delta() builds the commit's namespace_delta. Without a spool, a throwaway allocator seeded from the registry preserves identical code assignment.

Pure-Turtle files were never affected (they go through parse_chunk).
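The failure mode can be reproduced in miniature. The sketch below is a toy model, not the real allocator API: `Allocator`, `allocate`, and `decode_lossy` are hypothetical stand-ins for the registry, the shared allocator, and the spool's old silent fallback.

```rust
use std::collections::HashMap;

/// Toy stand-in for a namespace-code table (code -> IRI prefix).
type Allocator = HashMap<u16, String>;

/// Assign the next free code to `prefix` and return it.
fn allocate(alloc: &mut Allocator, prefix: &str) -> u16 {
    let code = alloc.len() as u16;
    alloc.insert(code, prefix.to_string());
    code
}

/// Rebuild an IRI the way the old fallback did:
/// an unknown code silently yields an empty prefix.
fn decode_lossy(shared: &Allocator, code: u16, name: &str) -> String {
    let prefix = shared.get(&code).map(String::as_str).unwrap_or("");
    format!("{}{}", prefix, name)
}
```

If a code is allocated only in a private registry, decoding it through the shared table yields the bare suffix `name`. Allocating it in the shared table first yields `http://schema.org/name`, which is the ordering the fix enforces.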
SpoolContext's code->prefix resolution silently fell back to an empty prefix when the shared allocator didn't know a namespace code — the failure mode behind the suffix-only predicate dict entries fixed in the previous commit, which corrupted the index without any signal. A miss for a real (non-OVERFLOW) code now debug_asserts, so any test exercising a misallocated path fails immediately, and warns in release instead of corrupting silently. OVERFLOW Sids keep the empty prefix (they legitimately carry the full IRI in the name). Adds a should_panic unit test pinning the detection, and rewires the three spool unit tests that hand-built the old registry-only allocation pattern onto the production WorkerCache-over-shared-allocator wiring.
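A minimal sketch of the detection, under stated assumptions: `OVERFLOW` is a hypothetical sentinel value, `prefix_for` is an invented name, and `eprintln!` stands in for whatever logging the real code uses in release builds.

```rust
use std::collections::HashMap;

/// Hypothetical sentinel for Sids that carry the full IRI in the name.
const OVERFLOW: u16 = u16::MAX;

/// Look up the IRI prefix for a namespace code in the shared allocator.
fn prefix_for(shared: &HashMap<u16, String>, code: u16) -> &str {
    if code == OVERFLOW {
        // Legitimate empty prefix: the name already holds the full IRI.
        return "";
    }
    match shared.get(&code) {
        Some(prefix) => prefix.as_str(),
        None => {
            // Debug builds (including tests) panic here; release builds fall through.
            debug_assert!(false, "namespace code {} missing from shared allocator", code);
            eprintln!("warning: namespace code {} missing from shared allocator", code);
            ""
        }
    }
}
```

The design choice is that a miss is never silent: tests fail at the point of misallocation, and release builds at least leave a trace before writing a suffix-only entry.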
aaj3f (Contributor) approved these changes on Jul 5, 2026 and left a comment:
Makes sense and looks good! 👍
The trig2 literal has no inner double quotes, so `r#"…"#` tripped clippy::needless_raw_string_hashes under -D warnings on CI. Use a plain `r"…"`. trig1 keeps its hashes — it contains a quoted literal.
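For illustration, the distinction the lint draws looks like this. The two constants are made-up TriG fragments, not the actual trig1/trig2 test fixtures.

```rust
/// No inner double quotes: a plain raw string is enough, and adding
/// hashes here is what clippy::needless_raw_string_hashes flags.
const TRIG_PLAIN: &str = r"_:b0 <http://example.org/p> <http://example.org/o> .";

/// Contains a quoted literal, so the `#` delimiters are required
/// to keep the inner `"` from ending the string.
const TRIG_QUOTED: &str = r#"_:b0 <http://example.org/name> "Alice" ."#;
```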
What

Labels with the reserved `_:fdb-` prefix (the skolem ids Fluree mints for every blank node and returns from queries) now denote the existing stored node in queries and transactions, instead of silently minting a fresh node. This makes blank-node-rooted structures (OWL restrictions, address objects, RDF lists) editable in place: query the node's id, then use it as an ordinary `@id` to insert/delete individual triples, without retracting and re-asserting the whole subtree.

The read side already resolved `_:fdb-...` to the stored Sid (`canonical_split` routes any `_:` string to the blank-node namespace); this closes the write side and SPARQL:
- JSON-LD: `@id: "_:fdb-..."` in insert/update templates and VALUES addresses the stored node
- SPARQL WHERE: `_:fdb-...` in WHERE patterns lowers to a constant; ordinary labels stay non-distinguished variables per spec
- SPARQL UPDATE: INSERT/DELETE templates, `DELETE DATA`/`INSERT DATA`, and `DELETE WHERE` all resolve stable ids (a bnode in a DELETE template was previously skolemized fresh, i.e. a silent no-op)
- Turtle: `FlakeSink::term_blank` and the upsert named-graph path pass stable labels through

Ordinary labels (`_:b0`) keep standard RDF semantics: fresh node per transaction on write, existential variable in SPARQL WHERE. Clients cannot collide with the reserved space because skolemization wraps client labels with a txn id before prefixing `fdb-`. All checks run at parse/lowering time inside existing `_:`-prefixed branches, so transactions not using stable ids pay nothing.

Spec framing: this is RDF 1.1 §3.5 skolemization kept in blank-node syntax, the same extension Virtuoso's `nodeID://` refs and Jena's `<_:label>` syntax provide. SPARQL §3.1.3's ban on blanks in DELETE templates exists because such labels could only denote fresh nodes; the reserved prefix removes that premise.

Import fixes (found while auditing blank-node scoping)
- Named-graph blanks in bulk TriG/N-Quads import were skolemized from the bare label (`fdb-b0`), colliding globally across every document imported into a ledger while diverging from the same document's default graph (`fdb-{ledger}-{t}-b0`). They now share the default graph's skolem key, so a label unifies across the default graph and GRAPH blocks of one document and stays distinct across documents.
- The serial import commit functions allocated namespace codes only in `state.ns_registry`, invisible to the spool's mid-parse code→prefix lookups. The silent empty-prefix fallback permanently wrote suffix-only strings (`name` instead of `http://schema.org/name`) into the index's predicate dict, so bound-predicate patterns matched nothing on freshly imported TriG ledgers. Serial commits now mirror `parse_chunk`'s WorkerCache-over-shared-allocator discipline, and the fallback `debug_assert`s (warns in release) so any recurrence fails tests loudly instead of corrupting silently.

Testing
- `it_stable_blank_nodes.rs`: 7 tests covering in-place OWL-restriction edits via JSON-LD and SPARQL, node identity stable across edits, fresh-mint semantics preserved for ordinary labels, and constant-vs-variable WHERE behavior
- `it_import.rs`: `import_trig_blank_label_document_scoped` (fails pre-fix) and `import_trig_bound_predicate_queryable` (fails pre-fix); a `should_panic` unit test pins the spool prefix-miss detection

Docs
- … `_:fdb-` Ids)" section in `docs/transactions/update-where-delete-insert.md`, with pointers from `insert.md` and `concepts/iri-and-context.md`