v0.4.0 #38
benoitcayladbx
OntoBricks — Release Notes V0.4.0
Release window: May, 2026
Test status: all changes shipped with the suite green (≥ 2003 passing, 80 skipped).
Highlights
app_managed (direct streaming into Postgres)
managed_synced (Lakeflow-managed Unity Catalog synced-table pipeline)
Both modes share the same 3-object Postgres layout (*_sync + *__app companion + union view). SyncedTableManager handles UC synced-table registration, Lakeflow pipeline polling, ghost control-plane state recovery, union-view creation, and all downstream Digital Twin build steps, with live progress visible in the app log and Build page.
.obx (JSON) file with per-domain version-mode selection; import with per-domain Skip / Overwrite / Rename conflict resolution. Format-version gating ensures future backward compatibility.
[pitfalls] extra.
DatabricksAuth on startup #28) by bumping to 2.7.0; GitPython bumped to 3.1.50 for GHSA-x2qx (CVE-2026-42215 follow-on); Mako ≥ 1.3.12, python-multipart ≥ 0.0.27 retained from v0.3.x.
scripts/build_docs.sh retained for local on-demand builds.

Lakebase GraphDB Engine
graph_engine = "lakebase") alongside LadybugDB. Selected in Settings → Graph DB.
lakebase/pool.py).
LakebaseFlatStore implements DDL (triple table + datatype/lang RDF columns), CRUD, VACUUM ANALYZE optimize, bounded-memory bulk_insert_iter, keyed-pagination iter_triples, _sql_relation override for physical vs logical table name handling.
GraphDBFactory._create_lakebase): Ladybug and Lakebase are mutually exclusive; engine config validated on save.
LAKEBASE_AVAILABLE capability flag; TripleStoreFactory includes Lakebase in availability detection.
src/back/core/graphdb/lakebase/schema.sql; autodoc page: docs/sphinx/api/app.core.graphdb.lakebase.rst.

App-managed companion layout
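As a rough illustration of the 3-object layout described in this section, the sketch below derives the three object names from a logical table name and builds a union-view statement. The helper names, the choice to give the view the logical table name, and the exact SQL (a `UNION ALL` over both tables) are assumptions made for illustration, not the actual OntoBricks DDL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphObjects:
    """Names of the three Postgres objects behind one logical graph table."""
    sync_table: str   # bulk warehouse data
    app_table: str    # companion for reasoning / materialise writes
    union_view: str   # read path over both (assumed to carry the logical name)


def graph_objects(logical_name: str) -> GraphObjects:
    # Suffixes follow the *_sync / *__app naming used in these notes.
    return GraphObjects(
        sync_table=f"{logical_name}_sync",
        app_table=f"{logical_name}__app",
        union_view=logical_name,
    )


def union_view_sql(schema: str, objs: GraphObjects) -> str:
    # Hypothetical statement; the real view definition may differ.
    return (
        f'CREATE OR REPLACE VIEW "{schema}"."{objs.union_view}" AS '
        f'SELECT * FROM "{schema}"."{objs.sync_table}" '
        f'UNION ALL SELECT * FROM "{schema}"."{objs.app_table}"'
    )
```

Under these assumptions, writes would target `app_table` only, while readers query `union_view`.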
app_managed builds now use the same 3-object Postgres layout as managed_synced:
*_sync: bulk warehouse data (streamed by the build pipeline)
*__app: companion (reasoning / materialise writes)
LakebaseFlatStore.bulk_load_into_sync() for the build pipeline; _writable_table_id() always returns companion (*__app).
drop_table() cleans up all 3 objects; optimize_table() vacuums both *_sync and *__app.

Settings - Graph DB tab
GET /settings/graph-engine/lakebase-health): uses pg_catalog queries (privilege-independent) and the same resolve_lakebase_graph_schema logic as the build pipeline.
GET /settings/graph-engine and GET /settings/graph-engine-config now allowed for all app users (POST remains admin-only).
global_config.config under graph_engine / graph_engine_config and mirrored into the domain registry entry.

Schema resolution
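The fallback order covered in this section (explicit setting, then registry value, then a default) can be sketched as a small first-non-empty chain. The function names and signatures below are hypothetical, and treating blank strings as "unset" is an assumption; only the precedence order and the `ontobricks_graph` default come from these notes.

```python
from typing import Optional

DEFAULT_GRAPH_SCHEMA = "ontobricks_graph"  # default named in these notes


def first_non_empty(*candidates: Optional[str]) -> Optional[str]:
    """Return the first candidate that is a non-blank string, else None."""
    for value in candidates:
        if value and value.strip():
            return value.strip()
    return None


def resolve_graph_schema(explicit: Optional[str], registry_schema: Optional[str]) -> str:
    # Order: explicit graph_engine_config.schema, Registry Volume schema, default.
    return first_non_empty(explicit, registry_schema) or DEFAULT_GRAPH_SCHEMA


def resolve_graph_database(
    explicit: Optional[str],
    registry_database: Optional[str],
    auth_default: Optional[str],
) -> Optional[str]:
    # Order: explicit graph_engine_config.database, RegistryCfg.lakebase_database, auth default.
    return first_non_empty(explicit, registry_database, auth_default)
```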
resolve_lakebase_graph_schema: explicit graph_engine_config.schema wins; falls back to Registry Volume schema; then DEFAULT_GRAPH_SCHEMA (ontobricks_graph).
resolve_lakebase_graph_database: explicit graph_engine_config.database wins; falls back to RegistryCfg.lakebase_database; then auth default.
synced_uc_name) always uses the registry UC schema (RegistryCfg.schema) so the Lakeflow synced object lands in the same Unity Catalog namespace as all other registry artefacts.

Managed Sync / Lakeflow Pipeline
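Two of the mechanisms in this section, prefix stripping on state names and the fallback slot names, are easy to picture in code. The sketch below is illustrative only: the contents of the terminal-state set are invented, and reading "_b/c/d" as the suffixes `_b`, `_c`, `_d` appended to the primary name is an assumption.

```python
SYNCED_TABLE_PREFIX = "SYNCED_TABLE_"

# Illustrative state set; the real SDK enum members may differ.
TERMINAL_STATES = {"ONLINE", "FAILED", "OFFLINE"}


def normalize_state(raw: str) -> str:
    """Strip the SYNCED_TABLE_ prefix so names compare against bare state sets."""
    name = raw.strip().upper()
    if name.startswith(SYNCED_TABLE_PREFIX):
        name = name[len(SYNCED_TABLE_PREFIX):]
    return name


def is_terminal(raw: str) -> bool:
    return normalize_state(raw) in TERMINAL_STATES


def candidate_names(primary: str) -> list:
    """Primary slot first, then the assumed _b / _c / _d fallbacks."""
    return [primary] + [f"{primary}_{suffix}" for suffix in "bcd"]
```

Without the normalization step, a prefixed name such as `SYNCED_TABLE_ONLINE` would never match a set of bare names, which is the mismatch the fix in this release addresses.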
SyncedTableManager: handles UC synced-table registration, trigger (full refresh), Lakeflow pipeline polling (wait_for_completion via get_update(update_id) + idle-wait fallback), on_state_change callback for live task context updates.
_normalize_state strips SYNCED_TABLE_ prefix from SDK enum names so terminal/in-progress sets match correctly.
_is_ghost_control_plane_state detects the conflict; ensure() tries DELETE then re-CREATE, with _b/c/d fallback names if the primary slot is permanently reserved.
ensure_synced_union_view() runs after Lakeflow materializes *_sync; schema-qualifies the _sync reference when Postgres and UC schemas differ; drops existing table with same name before creating view.
auto-repair: if the union view is absent (crashed previous build), repair_synced_view_if_possible recreates it from the existing *_sync / *__app objects.
database_instance_name (project) + database_branch + logical_database_name (required by the Lakebase Synced Tables API for Autoscaling projects). LakebaseAuth.branch_name property parses the branch segment from the PGHOST endpoint resource path.
lakebase_synced_uc / lakebase_pipeline_id in task context; frontend polls /dtwin/sync/pipeline-status every 6 s; 30-second terminal-OK grace window before build is declared complete.
SELECT DISTINCT fix in R2RML-to-Spark SQL templates to prevent duplicate triples causing Lakeflow PK violations.
CREATE SCHEMA IF NOT EXISTS) before synced-table registration, with conn.commit() after DDL.

Digital Twin Build - UI & UX
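The cache changes in this section (a timestamp per section rather than one shared clock, plus a schema version that invalidates old entries) might look roughly like the following. The entry layout and function names are hypothetical; only the `_ts` key and the version value 2 are taken from these notes.

```python
import time
from typing import Any, Dict, Optional

CACHE_SCHEMA_VERSION = 2  # mirrors _TS_STATS_CACHE_SCHEMA_VERSION = 2


def put_section(cache: Dict[str, Any], section: str, value: Any,
                now: Optional[float] = None) -> None:
    # Each section records its own write time instead of sharing one clock.
    cache[section] = {
        "value": value,
        "_ts": time.time() if now is None else now,
        "schema_version": CACHE_SCHEMA_VERSION,
    }


def get_section(cache: Dict[str, Any], section: str, max_age_s: float,
                now: Optional[float] = None) -> Optional[Any]:
    entry = cache.get(section)
    if not entry or entry.get("schema_version") != CACHE_SCHEMA_VERSION:
        return None  # missing, or written by an older format
    current = time.time() if now is None else now
    if current - entry["_ts"] > max_age_s:
        return None  # only this section is stale
    return entry["value"]
```

With per-section timestamps, refreshing one section cannot make another section look fresh, which is the kind of contradiction described for the "Loaded" badge.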
database.schema.table; existence badges for dtLakebaseTableExists and dtLakebaseSyncedUcExists; Lakeflow line shows catalog.schema.<physical_table>_sync.
archive step hidden for Lakebase builds.
_populate_session_cache): Lakebase path sets graph_has_data = final_count > 0, graph_engine, registry_archive_applicable = False; LadybugDB path unchanged.
_ts cache timestamp prevents cross-section staleness (previously a shared clock caused "Loaded" badge + "not built" text contradiction).
_TS_STATS_CACHE_SCHEMA_VERSION = 2) invalidates old formatted strings on upgrade.
local_lbug_exists / local_lbug_path field names retired; renamed to graph_has_data / graph_display throughout backend, build pipeline, and frontend.

Cockpit (Domain Validation)
psDtLakebaseTableExists and psDtLakebaseSyncedUcExists existence badges.
HomeService.dtwin_detail enriched with all lakebase fields (lakebase_table_exists, lakebase_database, lakebase_schema, lakebase_table, lakebase_synced_uc, lakebase_sync_mode).
triple_count prefers dt_existence over ts_status for accuracy.

Registry OBX Export / Import
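The upgrader-chain pattern and format-version validation named in this section can be sketched as follows. This is not the code in `obx_format.py`: the `format_version` field name, the `load_envelope` function, and the registry dict are assumptions, and the `min_ontobricks_version` gate is left out.

```python
from typing import Any, Callable, Dict, Optional

CURRENT_OBX_FORMAT_VERSION = 1  # value stated in these notes

Upgrader = Callable[[Dict[str, Any]], Dict[str, Any]]

# Maps a format version to a function lifting a payload to version + 1.
# Empty while only version 1 exists.
UPGRADERS: Dict[int, Upgrader] = {}


def load_envelope(
    envelope: Dict[str, Any],
    upgraders: Optional[Dict[int, Upgrader]] = None,
    current: int = CURRENT_OBX_FORMAT_VERSION,
) -> Dict[str, Any]:
    """Validate the format version, then upgrade step by step to `current`."""
    upgraders = UPGRADERS if upgraders is None else upgraders
    version = envelope.get("format_version")  # hypothetical field name
    if not isinstance(version, int) or version < 1:
        raise ValueError("missing or invalid format_version")
    if version > current:
        raise ValueError(f"file format {version} is newer than supported {current}")
    while version < current:
        envelope = upgraders[version](envelope)
        version += 1
        envelope["format_version"] = version
    return envelope
```

Files written by a newer release are rejected rather than guessed at, while older files pass through one upgrader per version step.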
src/back/objects/registry/obx_format.py: CURRENT_OBX_FORMAT_VERSION = 1, upgrader-chain pattern, build_envelope(), load() with format-version validation and min_ontobricks_version gate.
all, active, latest, selected (per-version checkboxes).
skip / overwrite / rename. 50 MB upload cap.

Ontology Pitfalls Detector
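The optional-dependency approach mentioned in this section (ML imports guarded by `try/except` so the taxonomy stays importable) follows a common Python pattern, sketched below. The constant and function names are hypothetical, and `sklearn` stands in for whichever ML packages the real module imports.

```python
# Constants are defined before any optional import, so they are always available.
PITFALL_CATEGORIES = ("P1", "P2", "P3", "P4")

try:
    # Heavy optional dependency; present only when the ML extra is installed.
    import sklearn  # noqa: F401
    ML_AVAILABLE = True
except ImportError:
    ML_AVAILABLE = False


def available_checks() -> dict:
    """Report which kinds of checks can run in this environment."""
    return {"graph_only": True, "ml_based": ML_AVAILABLE}
```

A caller can use the flag to decide whether to show a warning banner and restrict the run to graph-only checks.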
src/back/core/external/pitfalls/ subpackage (vendored D2KLab OPD, Apache-2.0): OntologyPatternToolkit with 19 run_p* methods across P1–P4 categories; PitfallsService entry point serializes the rdflib Graph to temp TTL and returns grouped results.
[pitfalls] extra in pyproject.toml: sentence-transformers, scikit-learn, NLTK, SciPy. ML imports inside try/except so taxonomy constants remain accessible without ML deps.
GET /ontology/pitfalls/taxonomy, POST /ontology/pitfalls/analyze (async via TaskManager), GET /ontology/pitfalls/results/{task_id}.
scripts/start.sh and requirements.txt updated to include --extra pitfalls alongside --extra lakebase.

HL7 FHIR Industry Ontology
src/back/core/industry/fhir/FhirImportService.py: FHIR_DOMAINS catalog (6 domain groups), _fetch_fhir_ttl(version), _build_allowed_resources() (always includes _FHIR_COMPLEX_TYPES), OWL restriction-based property extraction via _extract_properties_from_restrictions.
GET /ontology/fhir-versions; version threaded through fetch, transform, and import.
Base → Resource → DomainResource, Base → Element → DataType / BackboneElement / BackboneType always injected to avoid dangling parent references.
fhir:Resource bug fixed (parent initialised to "", not "Resource"); non-FHIR-namespace rdfs:subClassOf URIs (w5:, rim:, dc:) skipped during parent resolution.
showConfirmDialog (not native confirm()).
showConfirmDialog for Databricks Apps compatibility.

Cohort Discovery
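The two-step predicate resolution covered in this section (ontology `#` form to data `/` form, then an alias lookup by local name) can be approximated with the sketch below. It is a simplification under stated assumptions: the function names are hypothetical, local names are matched case-insensitively, and the BFS aspect of the real alias map is not modelled.

```python
from typing import Dict, Iterable


def local_name(uri: str) -> str:
    """Part of a URI after the last '#' or '/'."""
    return uri.rstrip("#/").replace("#", "/").rsplit("/", 1)[-1]


def build_alias_map(data_predicates: Iterable[str]) -> Dict[str, str]:
    # Index known data-namespace predicates by lower-cased local name.
    return {local_name(p).lower(): p for p in data_predicates}


def resolve_predicate(uri: str, data_predicates: Iterable[str]) -> str:
    known = set(data_predicates)
    if uri in known:
        return uri
    # Step 1: ontology '#' form -> data '/' form under the same base.
    slash_form = uri.replace("#", "/", 1)
    if slash_form in known:
        return slash_form
    # Step 2: cross-namespace fallback by local name; unknown URIs pass through.
    return build_alias_map(known).get(local_name(uri).lower(), uri)
```

Normalising every triple predicate through such a function before building an edge index keeps ontology-namespace and data-namespace spellings of the same predicate in one bucket.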
CohortBuilder._resolve_predicate: BFS predicate alias map: resolves predicates in ontology namespace (# form) to data namespace (/ form), then falls back to alias map by local name for cross-namespace predicates (e.g. ontobricks.com/ontology#hasclaim in a domain with a different base URI). Fixes silent zero-neighbour results for direct inserts and W3C OWL round-trips.
_outgoing_edge_index normalises every triple predicate with _resolve_predicate before indexing.
_dataPropsForClass(classUri) filters attribute dropdown to the hop-target entity's own data properties (was showing all properties).
in_frontier === 0 guard prevents wrong "no neighbours" blame when the frontier is empty.
data/customer/generate_data.py): shared electricity contract pool (40 slots, 35% pool share rate) ensures multi-customer contract nodes exist for cohort edge formation; %s → ? parameter placeholder fix for Databricks SQL connector.

Knowledge Graph & Inference
build_inference_sql now uses BFS traversal (find_connected_vars + order_connected_props) for chained property atoms, matching build_violation_sql. Fixes '' AS object in multi-hop rules → 0 triples reaching Lakebase.
materialize_graph success (count > 0), SigmaGraph.refreshCurrentExpansion() auto-reloads the visible graph; if no filter active, inline hint points to the KG tab. Zero-triples result badge changed from green OK to yellow Warning for non-HTTP(S) URI schemes.
data-sg-change="populateFilterEntityTypes" that re-triggered the fetch on selection.
TripleStoreBackend.find_seed_subjects now treats field="any" as union of label-match and URI-match, matching the LadybugDB Cypher implementation.
findOntologyProperty (label → name → raw fallback).
.sidebar-content:has(#sigmagraph-section.active) strips container padding; section uses height: 100%.

Ontology Designer UX
saveSharedEntity / saveSharedRelationship copy existing.uri into the updated object before overwriting, preventing prune_mappings_to_ontology_uris from treating the mapping as orphaned.
mapping-import.js top-level getElementById(...).addEventListener(...) calls changed to optional chaining (?.) to prevent null-dereference crashes in Databricks Apps.
resolveNodeId() helper in mapping-design.js tries exact match, case-insensitive match, and URI local-part extraction, fixing missing links after OWL imports with unnormalized parent/domain/range values.
_entityPanelActiveTab / _relPanelActiveTab remember the last active tab (Details / Attributes / Actions / Constraints / etc.) across entity/relationship selections in the designer right pane.
label: prop.label || prop.name; link text uses the label.
findOntologyProperty no longer requires label (label-only guard removed); callers decide the fallback.

Settings & Deployment
scripts/deploy.config.sh adds LAKEBASE_GRAPH_PROJECT/BRANCH/DATABASE (separate graph instance) and LAKEBASE_SYNC_SCHEMA (managed_synced schema).
scripts/deploy.sh calls bootstrap-lakebase-perms.sh for each of the three schemas (registry, graph, sync).
scripts/setup-lakebase.sh (new): creates a Lakebase project via POST /api/2.0/database/instances (Synced Tables-compatible API), waits for AVAILABLE, creates the Postgres database, and prints the db-… segment for deploy.config.sh. Mandatory for new projects: the UI "New project" button uses a different API incompatible with Synced Tables.
docs/deployment.md: added Step 0 (new-workspace setup), Step 5.1b, setup-lakebase.sh reference in checklists.
docs/lakebase-graphdb.md (new): architecture overview, prerequisites, provisioning guide, all config keys, write modes comparison, Postgres schema layout, scripts reference, permissions bootstrap order, Digital Twin build steps, troubleshooting section.
settings.sql_warehouse_id (env-injected from sql-warehouse binding) when global config has not been saved yet.
scripts/setup.sh and scripts/start.sh migrated from uv pip install -e ".[lakebase]" to uv sync --extra lakebase,pitfalls.

CI & Developer Experience
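The test migration from string-based `patch(...)` to `patch.object` noted in this section can be pictured with a minimal example. The `Pipeline` class is a hypothetical stand-in; `unittest.mock.patch.object` is the standard-library call.

```python
from unittest.mock import patch


class Pipeline:
    """Stand-in for a class under test (hypothetical)."""

    def load(self) -> str:
        return "real"

    def run(self) -> str:
        return self.load().upper()


def run_with_stub() -> str:
    # patch.object targets the attribute on the object itself, so no dotted
    # import path has to be resolved from a string.
    with patch.object(Pipeline, "load", return_value="stub"):
        return Pipeline().run()
```

Because the target is passed as an object, a module and a class that share a name cannot be confused during lookup, and the original attribute is restored when the `with` block exits.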
docs: job removed from .github/workflows/ci.yml; docs/sphinx/_build/ gitignored and 283 stale artifacts removed from the index. Sources and scripts/build_docs.sh retained for local on-demand builds.
tests/test_settings_lakebase_tier.py deleted (private methods removed in earlier refactor).
tests/test_build_pipeline_streaming.py: migrated patch(string) class/module path collisions to patch.object for Python 3.9 compatibility.

Security
UV_FIND_LINKS until proxy indexes it; lock entries reference canonical URLs.
config_writer() section param bypasses 3.1.49 patch).
scripts/dl-vuln-fix-wheels.sh (temporary proxy workaround) removed once proxy indexes both packages.

Code Review Fixes
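The `Optional[str]` change listed in this section matters because PEP 604 union syntax is only available at runtime from Python 3.10. A minimal sketch, with a hypothetical function name:

```python
from typing import Optional


def describe_graph(display: Optional[str] = None) -> str:
    # Optional[str] evaluates fine at definition time on Python 3.9.
    # A bare `str | None` annotation raises TypeError on 3.9 unless
    # `from __future__ import annotations` defers its evaluation.
    return display if display is not None else "not built"
```

Frameworks that inspect annotations at runtime evaluate them regardless, so the explicit `Optional[...]` spelling is the safer choice while 3.9 is supported.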
tests/test_build_pipeline_streaming.py: deleted dead TestStartBackgroundArchive class (called removed method, was causing a hard failure).
src/api/routers/internal/domain.py:90: fixed {"success": False} for the no-Databricks-client case (unconfigured connection is not an error).
src/api/routers/internal/settings.py: import from package (from back.objects.domain import SettingsService) not internal module.
src/api/routers/internal/dtwin.py: HomeService and Domain promoted to top-level imports; str | None union syntax replaced with Optional[str] for Python 3.9 compatibility.

Upgrade Notes
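For integrations affected by the field rename covered in these upgrade notes, a tolerant reader along the following lines may ease the transition. It assumes the `/dtwin/sync/info` response is a flat JSON object carrying these keys at the top level, which these notes do not confirm; the function name is hypothetical.

```python
from typing import Any, Dict


def read_graph_status(info: Dict[str, Any]) -> Dict[str, Any]:
    """Read the renamed fields, tolerating payloads from pre-0.4.0 servers."""
    has_data = info.get("graph_has_data", info.get("local_lbug_exists", False))
    display = info.get("graph_display", info.get("local_lbug_path"))
    return {"has_data": bool(has_data), "display": display}
```

Once every server in use runs v0.4.0 or later, the fallbacks to the retired names can be dropped.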
scripts/setup-lakebase.sh (see docs/lakebase-graphdb.md §Prerequisites).
make bootstrap-lakebase after every deploy. If you use managed_synced mode, set LAKEBASE_SYNC_SCHEMA in deploy.config.sh and the deploy script will grant the third schema automatically.
graph_engine_config.schema set in Settings → Graph DB, it takes precedence over the Registry Volume schema. Verify your Postgres schema matches what is shown in the Settings → Graph DB health card.
local_lbug_exists / local_lbug_path renamed: any custom monitoring or integration consuming the /dtwin/sync/info response must switch to graph_has_data and graph_display.
[pitfalls] optional extra: Pitfalls detection requires uv sync --extra pitfalls (or pip install ".[pitfalls]"). The panel will show a warning banner if ML deps are absent and only graph-only checks will run.

This discussion was created from the release v0.4.0.