rtia: reshape iot_data to Telegraf line-protocol form #1

Open: srmadscience wants to merge 2 commits into main from rtia-iot-data-telegraf-shape

@srmadscience
Collaborator

Summary

Restructures rtia.iot_data from flat columns to the Telegraf outputs.cratedb line-protocol shape (hash_id / timestamp / name / tags / fields), so the same table can be populated either by COPY FROM or by Telegraf's crate/postgres output plugin. That plugin can't write a GEO_POINT, so the geo coordinates ride along as plain doubles in fields (geo_lon / geo_lat) and geo_location is a GEO_POINT GENERATED from them; day is GENERATED from timestamp.
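
As a rough illustration of the reshape described above, the sketch below turns one flat record into the hash_id / timestamp / name / tags / fields form and then derives day and geo_location the way the generated columns are described. The flat column names (`device_id`, `plant_id`, `ts`, `temperature`, `longitude`, `latitude`), the choice of which columns count as tags, and the epoch-millisecond timestamp are assumptions made for the example and are not taken from the PR. `hash_id` is passed in rather than computed, because the PR does not say how it is produced. The [lon, lat] order follows CrateDB's array form for GEO_POINT values.

```python
MS_PER_DAY = 86_400_000  # milliseconds in one UTC day

# Hypothetical split of the old flat columns; the real mapping lives in the PR's SQL.
TAG_KEYS = ("device_id", "plant_id")
GEO_KEYS = ("longitude", "latitude")


def to_line_protocol_row(flat, hash_id, name="iot_data"):
    """Reshape one flat record into the Telegraf-style row layout."""
    # Tag values are strings in Telegraf's data model.
    tags = {key: str(flat[key]) for key in TAG_KEYS}
    # Everything that is not a tag, the timestamp, or a coordinate becomes a field.
    skip = set(TAG_KEYS) | set(GEO_KEYS) | {"ts"}
    fields = {key: value for key, value in flat.items() if key not in skip}
    # Coordinates travel as plain doubles inside fields.
    fields["geo_lon"] = float(flat["longitude"])
    fields["geo_lat"] = float(flat["latitude"])
    return {
        "hash_id": hash_id,
        "timestamp": flat["ts"],
        "name": name,
        "tags": tags,
        "fields": fields,
    }


def derived_columns(row):
    """Emulate what the generated columns are described as computing."""
    ts = row["timestamp"]
    return {
        # Truncate the epoch-millisecond timestamp to the start of its UTC day.
        "day": ts - ts % MS_PER_DAY,
        # GEO_POINT in array form: longitude first, then latitude.
        "geo_location": [row["fields"]["geo_lon"], row["fields"]["geo_lat"]],
    }


flat_record = {
    "device_id": "d-1",
    "plant_id": 7,
    "ts": 1_750_000_000_000,
    "temperature": 21.5,
    "longitude": 8.5,
    "latitude": 47.4,
}
row = to_line_protocol_row(flat_record, hash_id=42)
derived = derived_columns(row)
```

In the actual table these two derived values are computed by CrateDB itself, so neither COPY FROM nor Telegraf has to supply them.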

Changes

  • sql/rtia_schema_create.sql — new iot_data DDL: tags/fields OBJECTs, generated day + geo_location, PK (hash_id, "timestamp", day), partitioned by day.
  • sql/rtia_first_queries.sql, sql/rtia_advanced_queries.sql — rewrite iot_data column refs to tags['…'] / fields['…']. plants / devices / maintenance_log columns are unchanged.
  • grafana/rtia.json — remap 22 panel queries to the bracket-notation access pattern (string-level edit; rest of the file byte-identical).
  • README.md — document the line-protocol shape and the dual COPY FROM / Telegraf ingestion path.
  • .gitignore — exclude the 500k-row iot_demo_dataset.json (240 MB); its canonical copy lives on S3 where the COPY FROM statements read it.
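
The column-reference rewrite listed above can be pictured as a string-level substitution. The following is a minimal sketch, assuming the iot_data table is always referenced through a known alias (`i` here) and that the sets of tag and field columns are known up front. The column names and the helper `remap_column_refs` are hypothetical, and the PR does not state that the edits were made this way.

```python
import re

# Hypothetical classification of the old flat columns.
TAG_COLUMNS = {"device_id", "plant_id"}
FIELD_COLUMNS = {"temperature", "vibration"}


def remap_column_refs(sql, alias="i"):
    """Rewrite alias.column references to alias.tags['column'] or alias.fields['column']."""
    pattern = re.compile(rf"\b{re.escape(alias)}\.(\w+)\b")

    def replace(match):
        column = match.group(1)
        if column in TAG_COLUMNS:
            return f"{alias}.tags['{column}']"
        if column in FIELD_COLUMNS:
            return f"{alias}.fields['{column}']"
        # Unknown columns (for example timestamp) are left untouched.
        return match.group(0)

    return pattern.sub(replace, sql)


old_sql = (
    "SELECT i.temperature FROM rtia.iot_data i "
    "JOIN rtia.plants p ON i.plant_id = p.plant_id"
)
new_sql = remap_column_refs(old_sql)
```

Only references qualified with the iot_data alias are touched, which matches the note that plants, devices, and maintenance_log columns stay as they were.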

Data

The rewritten iot_demo_dataset.json is not in this PR (240 MB). The matching new-shape file is already published to the S3 bucket the schema script loads from, so COPY FROM works as-is.

Verification

Validated end-to-end against a CrateDB 6.3.2 cluster:

  • Dropped + reloaded all 6 rtia tables; iot_data loaded 500,000 rows, with geo_location / day correctly materialized from fields.
  • Ran all 37 rtia dashboard queries (Grafana macros substituted) with zero errors. This run surfaced an ORDER BY alias-reference regression in two device-type panels, which is fixed in this PR.
  • Smoke-tested the rewritten SQL-script query patterns (tags['…'] / fields['…'] access, tags['metadata_*'], the i.tags['plant_id'] join, WITHIN(geo_location, …)).
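
Running dashboard queries outside Grafana requires replacing its macros with literal SQL first. The sketch below shows one way that substitution might look, assuming the panels use the `$__timeFilter(column)`, `$__timeFrom()`, and `$__timeTo()` macros from Grafana's SQL data sources. Which macros the rtia dashboard actually uses is not stated in the PR, and the helper `substitute_macros` is hypothetical.

```python
import re


def substitute_macros(sql, time_from, time_to):
    """Replace a few Grafana time macros with fixed literal bounds."""

    def expand_time_filter(match):
        column = match.group(1).strip()
        return f"{column} BETWEEN '{time_from}' AND '{time_to}'"

    # $__timeFilter(col) becomes: col BETWEEN 'from' AND 'to'
    sql = re.sub(r"\$__timeFilter\(([^)]*)\)", expand_time_filter, sql)
    # $__timeFrom() and $__timeTo() become quoted literals.
    sql = sql.replace("$__timeFrom()", f"'{time_from}'")
    sql = sql.replace("$__timeTo()", f"'{time_to}'")
    return sql


panel_sql = 'SELECT count(*) FROM rtia.iot_data WHERE $__timeFilter("timestamp")'
runnable_sql = substitute_macros(
    panel_sql, "2026-06-01T00:00:00Z", "2026-06-12T00:00:00Z"
)
```

The resulting statement can then be sent to the cluster with any SQL client to confirm that it executes without error.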

🤖 Generated with Claude Code

dwrolfe and others added 2 commits June 12, 2026 14:47
Restructure rtia.iot_data from flat columns to the Telegraf
outputs.cratedb shape (hash_id / timestamp / name / tags / fields),
so the same table can be loaded by COPY FROM or by Telegraf's
crate/postgres plugin. That plugin can't write a GEO_POINT, so the
geo coords ride along as doubles in fields (geo_lon/geo_lat) and
geo_location is GENERATED from them; day is GENERATED from timestamp.

- sql/rtia_schema_create.sql: new iot_data DDL (tags/fields OBJECTs,
  generated day + geo_location, PK (hash_id, timestamp, day),
  partitioned by day).
- sql/rtia_first_queries.sql, sql/rtia_advanced_queries.sql: rewrite
  iot_data column refs to tags['...'] / fields['...']; plants/devices/
  maintenance_log columns unchanged.
- grafana/rtia.json: remap 22 panel queries to the bracket-notation
  access pattern.
- README.md: document the line-protocol shape and dual ingestion path.

The 500k-row iot_demo_dataset.json (240 MB) is not committed; the
canonical copy lives on S3 where the COPY FROM statements read it.

Verified end-to-end against a CrateDB 6.3.2 cluster: drop + reload of
all 6 rtia tables, and all rtia dashboard/SQL queries execute clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

iot_demo_dataset.json is loaded by COPY FROM from its canonical S3
copy, so it doesn't belong in git.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2 participants