
feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus #287

Open

eggrollofchaos wants to merge 4 commits into IBM:main from HPML6998-S26-Team13:codex/aob-upstream-smart-grid-cut

feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus#287
eggrollofchaos wants to merge 4 commits into
IBM:mainfrom
HPML6998-S26-Team13:codex/aob-upstream-smart-grid-cut

Conversation


@eggrollofchaos eggrollofchaos commented May 7, 2026

Summary

Ports the Smart Grid transformer-maintenance domain from the SmartGridBench source project (Columbia University, 2026) into AssetOpsBench as a focused upstream contribution.

This PR adds:

  • Smart Grid MCP servers under src/servers/smart_grid/ for IoT, FMSR/DGA, TSFM/RUL, and work-order workflows.
  • A direct adapter exposing the Smart Grid tools as plain Python callables.
  • 36 canonical Smart Grid scenarios plus 5 negative-check fixtures in the AOB local scenario array convention, with extended evaluator metadata documented in docs/smart_grid_data_provenance.md.
  • SG_DATA_DIR data-provenance documentation and a no-CSV-port policy, so no raw or processed source-project CSV datasets are shipped in this repository.
  • Console script entry points for the four Smart Grid MCP servers.
  • Unit tests for the direct adapter, IEC 60599 DGA classification, JSON-safe divergent ratios, and scenario shape/uniqueness.

Design rationale: nested vs. extended

src/servers/smart_grid/{iot,fmsr,tsfm,wo} shares directory names with the existing top-level src/servers/{iot,fmsr,tsfm,wo} servers. The existing servers are domain-general with backend-specific assumptions:

  • src/servers/iot/main.py — CouchDB-backed telemetry (COUCHDB_URL, IOT_DBNAME).
  • src/servers/fmsr/main.py — hardcoded chillers/AHUs asset list with an LLM fallback.
  • src/servers/tsfm/main.py — IBM Granite TSFM, dataset directory via PATH_TO_DATASETS_DIR.
  • src/servers/wo/main.py — FastMCP server reading from WO_DATA_DIR.

The Smart Grid versions are transformer-dataset-specific with CSV-backed loaders driven by SG_DATA_DIR, IEC 60599 DGA classification helpers, and transformer asset-type fixtures. Extending the existing servers in place would conflate two different backend/asset-type assumptions inside a single module. Sub-namespacing under src/servers/smart_grid/ keeps the existing servers untouched and the transformer-specific code locally coherent. Open to splitting these into separate top-level server names (smart_grid_iot, etc.) or merging into the existing servers if maintainers prefer.

Why this is scoped narrowly

This is the first maintainer-readable upstream cut. It intentionally excludes the source-project fork's evaluation-adapter parity work, orchestration runners, batch-mode runner changes, generated-scenario expansion beyond the validated 36, benchmark result artifacts, and course planning/report/deck materials. Those can be proposed separately if useful after the domain/scenario port is reviewed.

Size and split offer

This PR is +3102/-3 lines, larger than the "<300 changed lines" preference in CONTRIBUTING.md. ~950 lines are scenario JSON fixtures (src/scenarios/local/smart_grid.json + smart_grid_negative_checks.json), not code. If a smaller first review is preferred, this can be split into:

  • PR1: scenarios + docs/smart_grid_data_provenance.md + README touch (data-only, ~1100 lines).
  • PR2: src/servers/smart_grid/ MCP servers + direct adapter + tests + pyproject.toml console-script entries (~2000 lines).

Happy to do that if a maintainer asks.

Data policy

No raw or processed SmartGridBench CSV files are included. The servers read synthetic CSVs from SG_DATA_DIR at runtime. The provenance doc explains the expected file layout, the generator source, the Smart Grid scenario metadata fields, and why the source-project processed CSVs are not copied into AssetOpsBench.

Validation

  • uv run pytest src/servers/smart_grid/ — 25 passed.
  • uv run ruff format --check src/servers/smart_grid/ — clean.
  • uv run ruff check src/servers/smart_grid/ — clean.
  • src/scenarios/local/smart_grid.json contains 36 unique canonical records.
  • src/scenarios/local/smart_grid_negative_checks.json contains 5 unique negative fixtures.

Acknowledgments

Source-project authors (Columbia SmartGridBench, Spring 2026): Akshat Bhandari, Aaron Fan, Tanisha Rathod, Wei Alexander Xin.

References


@eggrollofchaos eggrollofchaos left a comment


Self-review

Upstream-context pass after opening. Findings + fixes I'm applying in the next push.

High

  • DCO sign-off missing on ec00c1ac (CI ACTION_REQUIRED). Amending with --signoff.
  • PR title + commit headline don't follow Conventional Commits. Updating both to feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus.

Medium

  • src/servers/smart_grid/{fmsr,iot,tsfm,wo} shares directory names with the existing top-level src/servers/{fmsr,iot,tsfm,wo} servers. Adding design rationale to the PR body so this isn't re-derived: existing servers are domain-general with backend-specific assumptions (CouchDB-backed IoT, chillers/AHUs FMSR, etc.); these are transformer-dataset-specific with CSV-backed loaders. Sub-namespacing keeps both designs coherent. Open to splitting/renaming if maintainers prefer.
  • PR size +3102/-3 vs the "<300 changed lines" preference; ~950 of those are scenario JSON fixtures, not code. Adding a split offer to the PR body (scenarios + data provenance as PR1, servers + adapter + tests as PR2) if a smaller first review is preferred.

Low

  • Branch name codex/aob-upstream-smart-grid-cut doesn't match the <type>/<description> convention. Renaming closes the PR; flagging as accepted exception.

Verification on src/servers/smart_grid/: uv run ruff format --check clean, uv run ruff check clean, uv run pytest src/servers/smart_grid/ 25 passed. Pre-existing repo-wide ruff issues elsewhere are out of this PR's scope.

Pushing the amend + PR-title/body refresh next.

@eggrollofchaos eggrollofchaos changed the title from "Add Smart Grid transformer domain and scenarios" to "feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus" on May 10, 2026

feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus

Adds the Smart Grid transformer-maintenance domain to AssetOpsBench as a
focused upstream cut from the SmartGridBench source project (Columbia
University, 2026). New surfaces:

- Smart Grid MCP servers under `src/servers/smart_grid/` for IoT, FMSR/DGA,
  TSFM/RUL, and work-order workflows. Nested under a domain-specific
  sub-namespace to coexist with the existing domain-general
  `src/servers/{iot,fmsr,tsfm,wo}` servers (different backends, asset
  types, and data assumptions; PR body documents the design rationale).
- A direct adapter exposing the Smart Grid tools as plain Python callables.
- 36 canonical Smart Grid scenarios + 5 negative-check fixtures in the AOB
  local scenario array convention; extended evaluator metadata documented
  in `docs/smart_grid_data_provenance.md`.
- `SG_DATA_DIR` runtime data-provenance contract and a no-CSV-port policy:
  no raw or processed source-project CSV datasets are shipped.
- Console-script entry points for the four Smart Grid MCP servers.
- Unit tests for the direct adapter, IEC 60599 DGA classification,
  JSON-safe divergent ratios, and scenario shape/uniqueness.

Validation: uv run pytest src/servers/smart_grid/ -- 25 passed.
Scenario JSON contains 36 unique canonical records and 5 unique
negative-check records.

Refs: HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp#46

Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
@eggrollofchaos eggrollofchaos force-pushed the codex/aob-upstream-smart-grid-cut branch from ec00c1a to a5b35a9 on May 10, 2026 at 01:55
@eggrollofchaos

Status update — addressed at a5b35a99

All findings from the self-review above are now in:

  • DCO H1 — addressed. Commit ec00c1ac → a5b35a99 via git commit --amend --signoff (single commit, identity unchanged: Wei Alexander Xin <eggrollofchaos@gmail.com>). DCO check now SUCCESS.
  • Conventional Commits H2 — addressed. PR title is now feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus; commit headline matches.
  • Nested-vs-extended M1 — addressed. PR body now has a ## Design rationale: nested vs. extended section enumerating each existing top-level server's backend assumptions vs. the Smart Grid versions, and offers to merge or rename if maintainers prefer.
  • Size + split offer M2 — addressed. PR body has a ## Size and split offer section noting the +3102/-3 size, the JSON-fixture share of additions, and a concrete two-PR split (scenarios + provenance / servers + adapter + tests) if a smaller first review is preferred.
  • Branch-name L1 — flagged as accepted exception (renaming would close this PR per GitHub semantics).

Ready for maintainer review. mergeStateStatus=BLOCKED and reviewDecision=REVIEW_REQUIRED remain pending an approving review from a reviewer with write access to IBM/AssetOpsBench, which I cannot satisfy from this side.


@eggrollofchaos eggrollofchaos left a comment


Self-review follow-up

PR context checked: head a5b35a9; top-level comments 1; inline comments 0; review threads 0 (0 active); latest review/comment 2026-05-10T01:56:40Z.

High

  • src/servers/smart_grid/fmsr/main.py: get_dga_record() returns sample_date as a pandas Timestamp, so a valid DGA lookup fails strict JSON serialization over MCP (TypeError: Object of type Timestamp is not JSON serializable). Fix: normalize date-like values at the response boundary and add a regression test that exercises get_dga_record() against a temp SG_DATA_DIR fixture with json.dumps(..., allow_nan=False).
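The boundary normalization described in this finding can be sketched as follows. This is a minimal illustration of the idea, not the helper that actually landed in the PR; the record fields are made up for the example.

```python
# Sketch of a JSON-safe boundary normalizer for scalar record values.
# Illustrative only: the PR's actual helper may differ in name and scope.
import json
from datetime import date, datetime

import pandas as pd


def json_safe_record(record: dict) -> dict:
    """Coerce pandas/NumPy scalars so the dict survives strict JSON."""
    safe = {}
    for key, value in record.items():
        if isinstance(value, (pd.Timestamp, datetime, date)):
            safe[key] = value.isoformat()  # the original TypeError source
        elif pd.isna(value):
            safe[key] = None  # NaN/NaT become null; allow_nan=False stays satisfied
        elif hasattr(value, "item"):
            safe[key] = value.item()  # NumPy scalar -> native Python number
        else:
            safe[key] = value
    return safe


record = {"asset_id": "T-001", "sample_date": pd.Timestamp("2026-05-09"), "h2_ppm": 42}
print(json.dumps(json_safe_record(record), allow_nan=False))
```

Checking the Timestamp branch before `pd.isna` matters: `pd.isna` is False for a valid Timestamp, but NaT and NaN must fall through to the null branch rather than reach `json.dumps`.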

Medium / Low

  • No new blockers found. DCO is green, the PR title/commit are Conventional Commits, and the branch-name mismatch remains an accepted exception because renaming the open PR branch would close the PR.

Validation before the fix: uv run pytest src/servers/smart_grid/ passes, ruff passes, and a live JSON-serialization smoke test with source-project processed CSVs isolates the failure to fmsr.get_dga_record().

Normalize pandas Timestamp values from get_dga_record before returning them through MCP JSON-RPC. Without this, valid DGA lookups over parsed CSV fixtures fail strict JSON serialization even though the in-process Python call succeeds.

Adds a regression test that builds a temporary SG_DATA_DIR fixture, calls get_dga_record, and verifies json.dumps(..., allow_nan=False) succeeds.

This is a follow-up from PR IBM#287 self-review and should remain a separate review-iteration commit on the published branch.

Validation:
- SG_DATA_DIR=/Users/wax/coding/hpml-assetopsbench-smart-grid-mcp/data/processed uv run pytest src/servers/smart_grid/
- uv run ruff format --check src/servers/smart_grid/
- uv run ruff check src/servers/smart_grid/
- SG_DATA_DIR=/Users/wax/coding/hpml-assetopsbench-smart-grid-mcp/data/processed uv run python <19-tool JSON serialization smoke>

Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
@eggrollofchaos

Status update — addressed at 3fb6943

The self-review follow-up finding is now fixed on the PR branch.

  • DGA JSON serialization H1 — addressed. get_dga_record() now normalizes pandas Timestamp values before returning records through MCP JSON-RPC.
  • Added a regression test with a temporary SG_DATA_DIR fixture that calls get_dga_record() and verifies json.dumps(..., allow_nan=False) succeeds.
  • DCO check is green on the new head.

Validation after the fix:

  • SG_DATA_DIR=/Users/wax/coding/hpml-assetopsbench-smart-grid-mcp/data/processed uv run pytest src/servers/smart_grid/ — 26 passed.
  • uv run ruff format --check src/servers/smart_grid/ — clean.
  • uv run ruff check src/servers/smart_grid/ — clean.
  • 19-tool Smart Grid JSON-serialization smoke against source-project processed CSVs — passed.


@eggrollofchaos eggrollofchaos left a comment


Self-review follow-up #2

PR context checked: head 3fb6943; top-level comments 2; inline comments 0; review threads 0 (0 active); latest review/comment 2026-05-10T03:54:51Z. DCO SUCCESS. mergeStateStatus=BLOCKED, reviewDecision=REVIEW_REQUIRED (gated on IBM-maintainer approval).

Took another adversarial pass over the full PR diff (a928284b..3fb6943) and ran a live JSON-safety smoke across every @mcp.tool() in src/servers/smart_grid/. Codex's H1 fix in fmsr.get_dga_record is correct + complete; the regression test is hermetic (uses tmp_path + monkeypatch against the verified module-level _dga_records: pd.DataFrame | None = None cache).

Verification at 3fb6943

  • uv run ruff format --check src/servers/smart_grid/ — clean.
  • uv run ruff check src/servers/smart_grid/ — clean.
  • SG_DATA_DIR=... uv run pytest src/servers/smart_grid/ — 26 passed.
  • Live json.dumps(result, allow_nan=False) smoke against all 14 read-path/create-path tools (iot.{list_assets, get_asset_metadata, list_sensors, get_sensor_readings}, fmsr.{get_dga_record, list_failure_modes, analyze_dga}, tsfm.{get_rul, forecast_rul, detect_anomalies, trend_analysis}, wo.{list_fault_records, get_fault_record, create_work_order}) against the source-project processed CSVs — every tool serializes strictly. Confirms the Codex H1 fix and verifies no sister bug at the boundary today.

Critical: 0 / High: 0 / Medium: 0 / Low: 2

Low

  • L1 — src/servers/smart_grid/wo/main.py:53-54 _normalize_record is the pre-fix pd.isna-only pattern. It's safe at this commit because base.load_fault_records doesn't pass parse_dates, so no pd.Timestamp ever reaches the dict. It's the same pattern that broke fmsr.get_dga_record once parse_dates=["sample_date"] was set. A future change adding parse_dates to fault records (e.g., a report_date column) would silently break JSON-RPC the same way. Defense-in-depth fix: promote fmsr._json_safe_record to src/servers/smart_grid/base.py as a public helper and replace wo._normalize_record so all four servers share one canonical boundary normalizer. Not a blocker; flagging for forward safety.
  • L2 — Project-wide JSON-safety smoke is missing. Codex's regression test covers get_dga_record only. Cheap insurance: a parametrized test that walks every @mcp.tool()-decorated callable in servers.smart_grid.*.main and asserts json.dumps(result, allow_nan=False) succeeds against a small fixture SG_DATA_DIR. Same tmp_path + monkeypatch.setattr(_module_cache, None) pattern Codex's new test established. Catches any future regression of the L1 class without per-tool test boilerplate.
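The L2 suggestion can be sketched without the FastMCP machinery. In this hedged illustration a plain registry stands in for walking @mcp.tool()-decorated callables, and the two sample tools are made-up stand-ins, not the real Smart Grid tools.

```python
# Sketch of a project-wide JSON-safety smoke. TOOL_REGISTRY and the two
# sample tools are stand-ins; a real suite would discover @mcp.tool()
# callables from servers.smart_grid.*.main and parametrize over them.
import json

TOOL_REGISTRY = []


def tool(fn):
    """Stand-in decorator mimicking @mcp.tool() registration."""
    TOOL_REGISTRY.append(fn)
    return fn


@tool
def list_assets():
    return [{"asset_id": "T-001", "status": "in_service"}]


@tool
def get_rul():
    return {"asset_id": "T-001", "rul_days": 180.0}


def check_all_tools_json_safe():
    """Fail if any registered tool's return value breaks strict JSON."""
    for fn in TOOL_REGISTRY:
        # allow_nan=False also rejects NaN/Infinity, matching the PR's check.
        json.dumps(fn(), allow_nan=False)


check_all_tools_json_safe()
```

In pytest this loop would become a @pytest.mark.parametrize over the registry, yielding one test case per tool without per-tool boilerplate.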

Verdict: LGTM. Merge blocker: none. L1 + L2 are defensive-design suggestions; both can ride a follow-up PR if the maintainer prefers a tightly-scoped first review here.

Move the JSON-safe record normalizer from `fmsr/main.py` (where it was added in
`fix(smart_grid): serialize DGA sample dates`) up to `base.py` as the
public canonical helper `json_safe_record`. Replace the latent pre-fix
`_normalize_record` in `wo/main.py` (which only handled `pd.isna`, not
`pd.Timestamp`) with the canonical helper.

`wo._normalize_record` was correct in behavior at the time it ran because
`load_fault_records` does not currently pass `parse_dates`, so no
`pd.Timestamp` ever leaked through. Adding `parse_dates=["report_date"]` (or
similar) later would have silently broken JSON-RPC the same way the DGA path
broke before its fix. Centralizing the boundary normalizer prevents that
regression class.

Verification: `uv run pytest src/servers/smart_grid/` -- 42 passed.
`uv run ruff format --check src/servers/smart_grid/` clean.
`uv run ruff check src/servers/smart_grid/` clean.

Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
Add `tests/test_json_safety.py` that walks every `@mcp.tool()`-decorated
callable across `iot`, `fmsr`, `tsfm`, and `wo` and asserts
`json.dumps(result, allow_nan=False)` succeeds against a hermetic
`SG_DATA_DIR` fixture. Catches the boundary-contract bug class fixed in
`fmsr.get_dga_record` for any current or future Smart Grid tool, without
per-tool test boilerplate.

The fixture writes minimal CSVs for all six processed-data files, sets
`SG_DATA_DIR` to a `tmp_path`, and resets module-level dataframe caches
across all four servers so each test gets a clean read path.

16 parametrized cases land 42 total tests passing (was 26).

Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
@eggrollofchaos

Status update — L1 + L2 addressed at c5067b9

Self-review follow-up #2's two Low findings are now in. PR head: c5067b9. DCO check SUCCESS.

  • L1 — wo._normalize_record latent landmine — addressed in e8b3ab0 (refactor(smart_grid): unify JSON-safe boundary helper in base.py). The Codex-introduced _json_safe_record is promoted from fmsr/main.py to base.py as the public json_safe_record helper. Both fmsr and wo now route through it. A future change adding parse_dates to load_fault_records no longer carries silent JSON-RPC failure risk.
  • L2 — Project-wide JSON-safety smoke — addressed in c5067b9 (test(smart_grid): add tool-level JSON-safety parametrized smoke). New tests/test_json_safety.py walks every @mcp.tool() across all four Smart Grid servers (16 parametrized cases) against a hermetic SG_DATA_DIR fixture and asserts json.dumps(result, allow_nan=False) succeeds. Boundary-contract regression detector for any current or future tool.

Verification at c5067b9

  • uv run pytest src/servers/smart_grid/ — 42 passed (was 26; +16 parametrized cases).
  • uv run ruff format --check src/servers/smart_grid/ — clean.
  • uv run ruff check src/servers/smart_grid/ — clean.

The PR now stacks feat → fix → refactor → test, all Conventional Commits, all signed off; squash-merge collapses to the PR-title commit on main. Ready for IBM maintainer review.


@eggrollofchaos eggrollofchaos left a comment


Self-review follow-up #3

PR context checked at head c5067b910ff6b333ecf345cd917676ca39cdef1f; top-level comments 3 (status updates a5b35a99, 3fb6943, c5067b9); review records 3 prior (Self-review at ec00c1ac, Self-review follow-up at a5b35a9, Self-review follow-up #2 at 3fb6943); inline comments 0; review threads 0; DCO SUCCESS at 2026-05-10T04:50:46Z; all 4 commits signed off; mergeStateStatus: BLOCKED, reviewDecision: REVIEW_REQUIRED (IBM-maintainer gates, expected).

Final-confirmation pass at c5067b9 to close the loop after the L1 + L2 fix commits (e8b3ab0 + c5067b9).

Self-review follow-up #2 closure

  • L1 wo._normalize_record latent landmine → closed at e8b3ab0. wo/main.py no longer defines the local _normalize_record; both list_fault_records (wo/main.py:122) and get_fault_record (wo/main.py:139) now call json_safe_record from base.py. fmsr/main.py:280 (get_dga_record) also routes through the canonical helper. The base.json_safe_record function (base.py:161) is the single public boundary normalizer for all four servers; any future parse_dates=[...] addition to load_fault_records or load_dga_records cannot regress JSON-RPC serialization the way the original DGA path did.
  • L2 no project-wide JSON-safety smoke → closed at c5067b9. New tests/test_json_safety.py (5945 bytes) walks every @mcp.tool() decorated callable across iot, fmsr, tsfm, and wo and asserts json.dumps(result, allow_nan=False) succeeds against a hermetic SG_DATA_DIR fixture. Catches the boundary-contract bug class for any current or future Smart Grid tool, without per-tool boilerplate. 16 parametrized cases; total smart_grid tests went 26 → 42.

Probed and ruled out

  • json_safe_record is now the only public boundary normalizer; no shadow copies of _normalize_record / _json_safe_record remain anywhere under src/servers/smart_grid/. Verified by grep -n "_normalize_record\|json_safe_record" across base.py, fmsr/main.py, wo/main.py.
  • The test_json_safety.py fixture resets module-level dataframe caches across iot._metadata, iot._readings, fmsr._failure_modes, fmsr._dga_records, tsfm._rul, tsfm._readings, wo._fault_records, wo._asset_metadata before each test. Hermetic — no SG_DATA_DIR pollution between cases.
  • No new inline comments, review threads, or third-party/bot findings since follow-up #2.
  • Commit chain unchanged from follow-up #2 plus e8b3ab0 + c5067b9: a5b35a9 → 3fb6943 → e8b3ab0 → c5067b9. All signed off.
  • PR body unchanged; design rationale + size/split offer + acknowledgments paragraph still present.
  • DCO check stayed SUCCESS across both new commits.

Verification at c5067b9

  • SG_DATA_DIR=… uv run pytest src/servers/smart_grid/ -q → 42 passed.
  • uv run ruff format --check src/servers/smart_grid/ → 16 files already formatted.
  • uv run ruff check src/servers/smart_grid/ → all checks passed.

Summary counts

  • Critical: 0
  • High: 0
  • Medium: 0
  • Low: 0
  • Nit: 0 (L1 + L2 both closed)

Verdict

LGTM — final-confirmation clean at head c5067b9. All review findings across the four passes closed in-PR: H1 DCO sign-off (v1 → a5b35a9), H1 Conventional Commits title (v1 → a5b35a9), M1 nested-vs-extended design rationale (v1 → PR body), H1 DGA Timestamp serialization (v2 → 3fb6943), L1 wo._normalize_record latent landmine (v3 → e8b3ab0), L2 missing project-wide JSON-safety smoke (v3 → c5067b9). Remaining gate is purely IBM-maintainer external review.

@DhavalRepo18 DhavalRepo18 self-requested a review May 11, 2026 22:22