`ALTER COLLECTION … ADD COLUMN …` on a `TYPE DOCUMENT STRICT` collection updates the catalog schema (`schema.columns.push(...)`, `schema.version += 1`) but never touches existing row data. Pre-ALTER rows were written with the old column set; post-ALTER reads run against the new schema. The exact result depends on how the reader handles the version drift, but in practice reads of old rows either return "null-everywhere" row objects (columns that are populated in the row become invisible because the schema-vs-data offsets no longer line up) or silently drop the new column from results.

No error is raised. No warning is logged. The only known recovery is to `DROP COLLECTION` and recreate, which loses all data.
## Current code

`nodedb/src/control/server/pgwire/ddl/collection/alter/add_column.rs:15-103`:

```rust
pub async fn alter_table_add_column(
    state: &SharedState,
    identity: &AuthenticatedIdentity,
    parts: &[&str],
    sql: &str,
) -> PgWireResult<Vec<Response>> {
    ...
    // Validate: new column must be nullable or have a default.
    if !column.nullable && column.default.is_none() {
        return Err(sqlstate_error("42601", "… must have a DEFAULT"));
    }
    if let Some(catalog) = state.credentials.catalog() {
        match catalog.get_collection(tenant_id.as_u32(), &table_name) {
            Ok(Some(coll)) if coll.is_active => {
                if coll.collection_type.is_strict()
                    && let Some(config_json) = &coll.timeseries_config
                    && let Ok(mut schema) = sonic_rs::from_str::<
                        nodedb_types::columnar::StrictSchema,
                    >(config_json)
                {
                    if schema.columns.iter().any(|c| c.name == column.name) {
                        return Err(sqlstate_error("42P07", "column already exists"));
                    }
                    // Catalog-only change: the new column is appended and the
                    // version bumped, but existing row data is never rewritten.
                    schema.columns.push(column);
                    schema.version = schema.version.saturating_add(1);
                    let mut updated = coll;
                    updated.collection_type =
                        nodedb_types::CollectionType::strict(schema.clone());
                    updated.timeseries_config = sonic_rs::to_string(&schema).ok();
                    let entry = crate::control::catalog_entry::CatalogEntry::PutCollection(
                        Box::new(updated.clone()),
                    );
                    let log_index =
                        crate::control::metadata_proposer::propose_catalog_entry(state, &entry)
                            .map_err(|e| sqlstate_error("XX000", &e.to_string()))?;
                    if log_index == 0 {
                        catalog
                            .put_collection(&updated)
                            .map_err(|e| sqlstate_error("XX000", &e.to_string()))?;
                    }
                }
            }
            _ => {
                return Err(sqlstate_error("42P01", "collection does not exist"));
            }
        }
    }
    ...
    Ok(vec![Response::Execution(Tag::new("ALTER TABLE"))])
}
```
The function:

1. Validates the new column (nullable, or has a `DEFAULT`).
2. Bumps `schema.columns` + `schema.version` in the catalog.
3. Proposes the new catalog entry via Raft.
4. Returns success.
There is no iteration over existing rows, no backfill of the `DEFAULT`, no rewrite of the on-disk row format, and no compatibility shim in the reader that knows to return the `DEFAULT` for rows serialized under an older schema version. `schema.version` is bumped (line 66) but I could not find any reader path that reads `schema.version` and reconciles it against the version a row was written under; the row format appears to assume the catalog schema always matches the bytes.

Postgres `ALTER TABLE … ADD COLUMN … DEFAULT …` either backfills (rewriting the table) or records the non-null default in `pg_attribute` and evaluates it virtually at read time. NodeDB does neither.
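To make the missing piece concrete, a read-time shim in the spirit of Postgres's virtual defaults could look roughly like the sketch below. This is hypothetical: `Value`, `ColumnDef`, and `reconcile_row` are stand-ins I invented (only the `version`/`columns` field names mirror what the handler touches), and it assumes additive-only evolution plus a per-row record of the version the row was written under.

```rust
// Hypothetical sketch only -- not NodeDB's actual types. The idea: pad rows
// written under an older schema version with each new column's DEFAULT (or
// NULL), the way Postgres evaluates `attmissingval` virtually at read time.

#[derive(Clone, Debug, PartialEq)]
enum Value {
    Null,
    Str(String),
}

struct ColumnDef {
    name: &'static str,
    default: Option<Value>,
}

struct StrictSchema {
    version: u32,
    columns: Vec<ColumnDef>,
}

/// Assumes additive-only evolution: columns written under `written_version`
/// are a strict prefix of the current column list. `written_version` would
/// have to be persisted per row -- the piece that appears to be missing.
fn reconcile_row(
    schema: &StrictSchema,
    written_version: u32,
    mut values: Vec<Value>, // positional values, in the old column order
) -> Vec<Value> {
    if written_version == schema.version {
        return values; // no drift, fast path
    }
    while values.len() < schema.columns.len() {
        let col = &schema.columns[values.len()];
        values.push(col.default.clone().unwrap_or(Value::Null));
    }
    values
}

fn main() {
    let schema = StrictSchema {
        version: 2,
        columns: vec![
            ColumnDef { name: "id", default: None },
            ColumnDef { name: "name", default: None },
            ColumnDef { name: "note", default: Some(Value::Str("n/a".into())) },
        ],
    };
    // Row persisted under schema v1, before `note` existed.
    let old = vec![Value::Str("a".into()), Value::Str("alice".into())];
    assert_eq!(
        reconcile_row(&schema, 1, old),
        vec![
            Value::Str("a".into()),
            Value::Str("alice".into()),
            Value::Str("n/a".into()),
        ]
    );
}
```

The catch is the last assumption: such a shim only works if the write path persists the schema version alongside each row, which is exactly the piece that does not appear to exist today.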
## Why this matters
- Standard additive migration (`ALTER COLLECTION t ADD COLUMN new_col …`) is the documented way to evolve a schema. Users running it against a collection with data will silently corrupt reads until they realise every `SELECT` returns null-everywhere rows.
- All read paths (`/query`, native client) are affected.
- The only recovery surfaced downstream is `DROP COLLECTION` + `CREATE COLLECTION` + re-ingest: data loss.
- `schema.version = schema.version.saturating_add(1)` at line 66 suggests the data model expected versioning to be implemented, but the read path was never wired up; this is a half-built feature that in its current state silently loses data. A sketch of the missing write-side half follows this list.
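Finishing that half-built feature would at minimum require the write path to persist the schema version each row was serialized under. A minimal sketch of that write-side half, assuming a simple version-prefix envelope (invented here for illustration; NodeDB's real row encoding may look nothing like this):

```rust
// Illustrative only: prefix each row payload with the schema version it was
// written under, so the read path can detect drift before decoding the
// positional column bytes.

fn encode_row(schema_version: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + payload.len());
    out.extend_from_slice(&schema_version.to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Returns (written-under version, raw positional payload).
fn decode_row(bytes: &[u8]) -> Option<(u32, &[u8])> {
    if bytes.len() < 4 {
        return None;
    }
    let version = u32::from_le_bytes(bytes[..4].try_into().ok()?);
    Some((version, &bytes[4..]))
}

fn main() {
    let row = encode_row(2, b"...positional column bytes...");
    let (written_under, payload) = decode_row(&row).unwrap();
    assert_eq!(written_under, 2);
    assert_eq!(payload, b"...positional column bytes...");
}
```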
## Repro
```sql
CREATE COLLECTION t TYPE DOCUMENT STRICT (
  id STRING PRIMARY KEY,
  name STRING
);

INSERT INTO t (id, name) VALUES ('a', 'alice');

SELECT id, name FROM t;
-- expected: [{ id: 'a', name: 'alice' }]
-- observed: [{ id: 'a', name: 'alice' }] ✓

ALTER COLLECTION t ADD COLUMN note STRING DEFAULT 'n/a';

SELECT id, name, note FROM t;
-- expected: [{ id: 'a', name: 'alice', note: 'n/a' }]
-- observed (known bad): [{ id: null, name: null, note: null }]
-- or: [{}] / row returns but all columns absent
```
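For a self-contained illustration of the failure class, the toy decoder below (not NodeDB's actual row format, just strictly positional decoding with the column count taken from the current catalog schema) shows how a 2-column row read under a 3-column schema degrades to null-everywhere:

```rust
// Toy illustration: rows store length-prefixed strings positionally; the
// reader derives the expected column count from the *current* schema. A row
// written with 2 columns, read as if it had 3, fails mid-decode, and a strict
// positional reader can't tell "old row" from "corrupt row", so it gives up.

fn encode(values: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for v in values {
        out.extend_from_slice(&(v.len() as u32).to_le_bytes());
        out.extend_from_slice(v.as_bytes());
    }
    out
}

fn decode(bytes: &[u8], column_count: usize) -> Vec<Option<String>> {
    let mut vals = Vec::new();
    let mut pos = 0;
    for _ in 0..column_count {
        let Some(len_bytes) = bytes.get(pos..pos + 4) else {
            return vec![None; column_count]; // null-everywhere
        };
        let len = u32::from_le_bytes(len_bytes.try_into().unwrap()) as usize;
        let Some(v) = bytes.get(pos + 4..pos + 4 + len) else {
            return vec![None; column_count];
        };
        vals.push(Some(String::from_utf8_lossy(v).into_owned()));
        pos += 4 + len;
    }
    vals
}

fn main() {
    // Written under the 2-column schema (id, name).
    let row = encode(&["a", "alice"]);
    // Read back under the 3-column schema (id, name, note).
    println!("{:?}", decode(&row, 3)); // [None, None, None]
}
```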
## Notes
- Found while auditing DDL handlers and reading the schema-evolution path end-to-end.
- The only known workaround is to pre-declare all columns at `CREATE COLLECTION` time and never run `ALTER ADD COLUMN` on a collection that has data, which severely limits live-migration patterns.