feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus#287
feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus#287eggrollofchaos wants to merge 4 commits into
Conversation
There was a problem hiding this comment.
Self-review
Upstream-context pass after opening. Findings + fixes I'm applying in the next push.
High
- DCO sign-off missing on
ec00c1ac(CIACTION_REQUIRED). Amending with--signoff. - PR title + commit headline don't follow Conventional Commits. Updating both to
feat(smart_grid): add Smart Grid transformer MCP servers and 36-scenario corpus.
Medium
src/servers/smart_grid/{fmsr,iot,tsfm,wo}shares directory names with the existing top-levelsrc/servers/{fmsr,iot,tsfm,wo}servers. Adding design rationale to the PR body so this isn't re-derived: existing servers are domain-general with backend-specific assumptions (CouchDB-backed IoT, chillers/AHUs FMSR, etc.); these are transformer-dataset-specific with CSV-backed loaders. Sub-namespacing keeps both designs coherent. Open to splitting/renaming if maintainers prefer.- PR size +3102/-3 vs the "<300 changed lines" preference; ~950 of those are scenario JSON fixtures, not code. Adding a split offer to the PR body (scenarios + data provenance as PR1, servers + adapter + tests as PR2) if a smaller first review is preferred.
Low
- Branch name
codex/aob-upstream-smart-grid-cutdoesn't match the<type>/<description>convention. Renaming closes the PR; flagging as accepted exception.
Verification on src/servers/smart_grid/: uv run ruff format --check clean, uv run ruff check clean, uv run pytest src/servers/smart_grid/ 25 passed. Pre-existing repo-wide ruff issues elsewhere are out of this PR's scope.
Pushing the amend + PR-title/body refresh next.
…rio corpus
Adds the Smart Grid transformer-maintenance domain to AssetOpsBench as a
focused upstream cut from the SmartGridBench source project (Columbia
University, 2026). New surfaces:
- Smart Grid MCP servers under `src/servers/smart_grid/` for IoT, FMSR/DGA,
TSFM/RUL, and work-order workflows. Nested under a domain-specific
sub-namespace to coexist with the existing domain-general
`src/servers/{iot,fmsr,tsfm,wo}` servers (different backends, asset
types, and data assumptions; PR body documents the design rationale).
- A direct adapter exposing the Smart Grid tools as plain Python callables.
- 36 canonical Smart Grid scenarios + 5 negative-check fixtures in the AOB
local scenario array convention; extended evaluator metadata documented
in `docs/smart_grid_data_provenance.md`.
- `SG_DATA_DIR` runtime data-provenance contract and a no-CSV-port policy:
no raw or processed source-project CSV datasets are shipped.
- Console-script entry points for the four Smart Grid MCP servers.
- Unit tests for the direct adapter, IEC 60599 DGA classification,
JSON-safe divergent ratios, and scenario shape/uniqueness.
Validation: uv run pytest src/servers/smart_grid/ -- 25 passed.
Scenario JSON contains 36 unique canonical records and 5 unique
negative-check records.
Refs: HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp#46
Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
ec00c1a to
a5b35a9
Compare
Status update — addressed at
|
eggrollofchaos
left a comment
There was a problem hiding this comment.
Self-review follow-up
PR context checked: head a5b35a9; top-level comments 1; inline comments 0; review threads 0 (0 active); latest review/comment 2026-05-10T01:56:40Z.
High
src/servers/smart_grid/fmsr/main.py:get_dga_record()returnssample_dateas a pandasTimestamp, so a valid DGA lookup fails strict JSON serialization over MCP (TypeError: Object of type Timestamp is not JSON serializable). Fix: normalize date-like values at the response boundary and add a regression test that exercisesget_dga_record()against a tempSG_DATA_DIRfixture withjson.dumps(..., allow_nan=False).
Medium / Low
- No new blockers found. DCO is green, the PR title/commit are Conventional Commits, and the branch-name mismatch remains an accepted exception because renaming the open PR branch would close the PR.
Validation before the fix: uv run pytest src/servers/smart_grid/ passes, ruff passes, and a live JSON-serialization smoke test with source-project processed CSVs isolates the failure to fmsr.get_dga_record().
Normalize pandas Timestamp values from get_dga_record before returning them through MCP JSON-RPC. Without this, valid DGA lookups over parsed CSV fixtures fail strict JSON serialization even though the in-process Python call succeeds. Adds a regression test that builds a temporary SG_DATA_DIR fixture, calls get_dga_record, and verifies json.dumps(..., allow_nan=False) succeeds. This is a follow-up from PR IBM#287 self-review and should remain a separate review-iteration commit on the published branch. Validation: - SG_DATA_DIR=/Users/wax/coding/hpml-assetopsbench-smart-grid-mcp/data/processed uv run pytest src/servers/smart_grid/ - uv run ruff format --check src/servers/smart_grid/ - uv run ruff check src/servers/smart_grid/ - SG_DATA_DIR=/Users/wax/coding/hpml-assetopsbench-smart-grid-mcp/data/processed uv run python <19-tool JSON serialization smoke> Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
Status update — addressed at
|
eggrollofchaos
left a comment
There was a problem hiding this comment.
Self-review follow-up #2
PR context checked: head 3fb6943; top-level comments 2; inline comments 0; review threads 0 (0 active); latest review/comment 2026-05-10T03:54:51Z. DCO SUCCESS. mergeStateStatus=BLOCKED, reviewDecision=REVIEW_REQUIRED (gated on IBM-maintainer approval).
Took another adversarial pass over the full PR diff (a928284b..3fb6943) and ran a live JSON-safety smoke across every @mcp.tool() in src/servers/smart_grid/. Codex's H1 fix in fmsr.get_dga_record is correct + complete; the regression test is hermetic (uses tmp_path + monkeypatch against the verified module-level _dga_records: pd.DataFrame | None = None cache).
Verification at 3fb6943
uv run ruff format --check src/servers/smart_grid/— clean.uv run ruff check src/servers/smart_grid/— clean.SG_DATA_DIR=... uv run pytest src/servers/smart_grid/— 26 passed.- Live
json.dumps(result, allow_nan=False)smoke against all 11 read-path/create-path tools (iot.{list_assets, get_asset_metadata, list_sensors, get_sensor_readings},fmsr.{get_dga_record, list_failure_modes, analyze_dga},tsfm.{get_rul, forecast_rul, detect_anomalies, trend_analysis},wo.{list_fault_records, get_fault_record, create_work_order}) against the source-project processed CSVs — every tool serializes strictly. Confirms the Codex H1 fix and verifies no sister bug at the boundary today.
Critical: 0 / High: 0 / Medium: 0 / Low: 2
Low
- L1 —
src/servers/smart_grid/wo/main.py:53-54_normalize_recordis the pre-fixpd.isna-only pattern. It's safe at this commit becausebase.load_fault_recordsdoesn't passparse_dates, so nopd.Timestampever reaches the dict. It's the same pattern that brokefmsr.get_dga_recordonceparse_dates=["sample_date"]was set. A future change addingparse_datesto fault records (e.g., areport_datecolumn) would silently break JSON-RPC the same way. Defense-in-depth fix: promotefmsr._json_safe_recordtosrc/servers/smart_grid/base.pyas a public helper and replacewo._normalize_recordso all four servers share one canonical boundary normalizer. Not a blocker; flagging for forward safety. - L2 — Project-wide JSON-safety smoke is missing. Codex's regression test covers
get_dga_recordonly. Cheap insurance: a parametrized test that walks every@mcp.tool()-decorated callable inservers.smart_grid.*.mainand assertsjson.dumps(result, allow_nan=False)succeeds against a small fixtureSG_DATA_DIR. Sametmp_path+monkeypatch.setattr(_module_cache, None)pattern Codex's new test established. Catches any future regression of the L1 class without per-tool test boilerplate.
Verdict: LGTM. Merge blocker: none. L1 + L2 are defensive-design suggestions; both can ride a follow-up PR if the maintainer prefers a tightly-scoped first review here.
Move the JSON-safe record normalizer from `fmsr/main.py` (where it was added in `fix(smart_grid): serialize DGA sample dates`) up to `base.py` as the public canonical helper `json_safe_record`. Replace the latent pre-fix `_normalize_record` in `wo/main.py` (which only handled `pd.isna`, not `pd.Timestamp`) with the canonical helper. `wo._normalize_record` was correct in behavior at the time it ran because `load_fault_records` does not currently pass `parse_dates`, so no `pd.Timestamp` ever leaked through. Adding `parse_dates=["report_date"]` (or similar) later would have silently broken JSON-RPC the same way the DGA path broke before its fix. Centralizing the boundary normalizer prevents that regression class. Verification: `uv run pytest src/servers/smart_grid/` -- 42 passed. `uv run ruff format --check src/servers/smart_grid/` clean. `uv run ruff check src/servers/smart_grid/` clean. Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
Add `tests/test_json_safety.py` that walks every `@mcp.tool()`-decorated callable across `iot`, `fmsr`, `tsfm`, and `wo` and asserts `json.dumps(result, allow_nan=False)` succeeds against a hermetic `SG_DATA_DIR` fixture. Catches the boundary-contract bug class fixed in `fmsr.get_dga_record` for any current or future Smart Grid tool, without per-tool test boilerplate. The fixture writes minimal CSVs for all six processed-data files, sets `SG_DATA_DIR` to a `tmp_path`, and resets module-level dataframe caches across all four servers so each test gets a clean read path. 16 parametrized cases land 42 total tests passing (was 26). Signed-off-by: Wei Alexander Xin <eggrollofchaos@gmail.com>
Status update — L1 + L2 addressed at
|
eggrollofchaos
left a comment
There was a problem hiding this comment.
Self-review follow-up #3
PR context checked at head c5067b910ff6b333ecf345cd917676ca39cdef1f; top-level comments 3 (status updates a5b35a99, 3fb6943, c5067b9); review records 3 prior (Self-review at ec00c1ac, Self-review follow-up at a5b35a9, Self-review follow-up #2 at 3fb6943); inline comments 0; review threads 0; DCO SUCCESS at 2026-05-10T04:50:46Z; all 4 commits signed off; mergeStateStatus: BLOCKED, reviewDecision: REVIEW_REQUIRED (IBM-maintainer gates, expected).
Final-confirmation pass at c5067b9 to close the loop after the L1 + L2 fix commits (e8b3ab0 + c5067b9).
Self-review follow-up #2 closure
- L1
wo._normalize_recordlatent landmine → closed ate8b3ab0.wo/main.pyno longer defines the local_normalize_record; bothlist_fault_records(wo/main.py:122) andget_fault_record(wo/main.py:139) now calljson_safe_recordfrombase.py.fmsr/main.py:280(get_dga_record) also routes through the canonical helper. Thebase.json_safe_recordfunction (base.py:161) is the single public boundary normalizer for all four servers; any futureparse_dates=[...]addition toload_fault_recordsorload_dga_recordscannot regress JSON-RPC serialization the way the original DGA path did. - L2 no project-wide JSON-safety smoke → closed at
c5067b9. Newtests/test_json_safety.py(5945 bytes) walks every@mcp.tool()decorated callable acrossiot,fmsr,tsfm, andwoand assertsjson.dumps(result, allow_nan=False)succeeds against a hermeticSG_DATA_DIRfixture. Catches the boundary-contract bug class for any current or future Smart Grid tool, without per-tool boilerplate. 16 parametrized cases; total smart_grid tests went 26 → 42.
Probed and ruled out
json_safe_recordis now the only public boundary normalizer; no shadow copies of_normalize_record/_json_safe_recordremain anywhere undersrc/servers/smart_grid/. Verified bygrep -n "_normalize_record\|json_safe_record"acrossbase.py,fmsr/main.py,wo/main.py.- The
test_json_safety.pyfixture resets module-level dataframe caches acrossiot._metadata,iot._readings,fmsr._failure_modes,fmsr._dga_records,tsfm._rul,tsfm._readings,wo._fault_records,wo._asset_metadatabefore each test. Hermetic — noSG_DATA_DIRpollution between cases. - No new inline comments, review threads, or third-party/bot findings since follow-up #2.
- Commit chain unchanged from follow-up #2 plus
e8b3ab0+c5067b9:a5b35a9→3fb6943→e8b3ab0→c5067b9. All signed off. - PR body unchanged; design rationale + size/split offer + acknowledgments paragraph still present.
- DCO check stayed
SUCCESSacross both new commits.
Verification at c5067b9
SG_DATA_DIR=… uv run pytest src/servers/smart_grid/ -q→ 42 passed.uv run ruff format --check src/servers/smart_grid/→ 16 files already formatted.uv run ruff check src/servers/smart_grid/→ all checks passed.
Summary counts
- Critical: 0
- High: 0
- Medium: 0
- Low: 0
- Nit: 0 (L1 + L2 both closed)
Verdict
LGTM — final-confirmation clean at head c5067b9. All review findings across the four passes closed in-PR: H1 DCO sign-off (v1 → a5b35a9), H1 Conventional Commits title (v1 → a5b35a9), M1 nested-vs-extended design rationale (v1 → PR body), H1 DGA Timestamp serialization (v2 → 3fb6943), L1 wo._normalize_record latent landmine (v3 → e8b3ab0), L2 missing project-wide JSON-safety smoke (v3 → c5067b9). Remaining gate is purely IBM-maintainer external review.
Summary
Ports the Smart Grid transformer-maintenance domain from the SmartGridBench source project (Columbia University, 2026) into AssetOpsBench as a focused upstream contribution.
This PR adds:
src/servers/smart_grid/for IoT, FMSR/DGA, TSFM/RUL, and work-order workflows.docs/smart_grid_data_provenance.md.SG_DATA_DIRdata-provenance documentation and a no-CSV-port policy, so no raw or processed source-project CSV datasets are shipped in this repository.Design rationale: nested vs. extended
src/servers/smart_grid/{iot,fmsr,tsfm,wo}shares directory names with the existing top-levelsrc/servers/{iot,fmsr,tsfm,wo}servers. The existing servers are domain-general with backend-specific assumptions:src/servers/iot/main.py— CouchDB-backed telemetry (COUCHDB_URL,IOT_DBNAME).src/servers/fmsr/main.py— chillers/AHUs hardcoded list + LLM fallback.src/servers/tsfm/main.py— IBM Granite TSFM, dataset directory viaPATH_TO_DATASETS_DIR.src/servers/wo/main.py— FastMCP server reading fromWO_DATA_DIR.The Smart Grid versions are transformer-dataset-specific with CSV-backed loaders driven by
SG_DATA_DIR, IEC 60599 DGA classification helpers, and transformer asset-type fixtures. Extending the existing servers in place would conflate two different backend/asset-type assumptions inside a single module. Sub-namespacing undersrc/servers/smart_grid/keeps the existing servers untouched and the transformer-specific code locally coherent. Open to splitting these into separate top-level server names (smart_grid_iot, etc.) or merging into the existing servers if maintainers prefer.Why this is scoped narrowly
This is the first maintainer-readable upstream cut. It intentionally excludes the source-project fork's evaluation-adapter parity work, orchestration runners, batch-mode runner changes, generated-scenario expansion beyond the validated 36, benchmark result artifacts, and course planning/report/deck materials. Those can be proposed separately if useful after the domain/scenario port is reviewed.
Size and split offer
This PR is +3102/-3 lines, larger than the "<300 changed lines" preference in
CONTRIBUTING.md. ~950 lines are scenario JSON fixtures (src/scenarios/local/smart_grid.json+smart_grid_negative_checks.json), not code. If a smaller first review is preferred, this can be split into:docs/smart_grid_data_provenance.md+ README touch (data-only, ~1100 lines).src/servers/smart_grid/MCP servers + direct adapter + tests +pyproject.tomlconsole-script entries (~2000 lines).Happy to do that if a maintainer asks.
Data policy
No raw or processed SmartGridBench CSV files are included. The servers read synthetic CSVs from
SG_DATA_DIRat runtime. The provenance doc explains the expected file layout, the generator source, the Smart Grid scenario metadata fields, and why the source-project processed CSVs are not copied into AssetOpsBench.Validation
uv run pytest src/servers/smart_grid/— 25 passed.uv run ruff format --check src/servers/smart_grid/— clean.uv run ruff check src/servers/smart_grid/— clean.src/scenarios/local/smart_grid.jsoncontains 36 unique canonical records.src/scenarios/local/smart_grid_negative_checks.jsoncontains 5 unique negative fixtures.Acknowledgments
Source-project authors (Columbia SmartGridBench, Spring 2026): Akshat Bhandari, Aaron Fan, Tanisha Rathod, Wei Alexander Xin.
References