diff --git a/README.md b/README.md
index b4867f425..7ed3b955b 100644
--- a/README.md
+++ b/README.md
@@ -112,7 +112,7 @@ We are expanding **AssetOpsBench** to cover a broader range of industrial challe
AssetOpsBench is a **unified framework for developing, orchestrating, and evaluating domain-specific AI agents** in industrial asset operations and maintenance. It provides:
-- 4 **domain-specific agents**
+- 4 **domain-specific agents** (IoT, FMSR, TSFM, WO), plus an optional **Smart Grid 7th-domain** add-on covering transformer health, DGA fault classification, RUL forecasting, and work-order management
- 2 **multi-agent orchestration frameworks**
Designed for **maintenance engineers, reliability specialists, and facility planners**, it allows reproducible evaluation of multi-step workflows in simulated industrial environments.
@@ -145,6 +145,9 @@ Explore all scenarios [HF-Dataset](https://huggingface.co/datasets/ibm-research/
- **[MetaAgent](https://github.com/IBM/AssetOpsBench/tree/main/src/meta_agent)**: reAct-based single-agent-as-tool orchestration
- **[AgentHive](https://github.com/IBM/AssetOpsBench/tree/main/src/agent_hive)**: plan-and-execute sequential workflow
+### Smart Grid 7th-domain add-on
+`src/servers/smart_grid/` adds a Smart Grid transformer-operations domain that pairs with the original four general-purpose servers. It follows the same MCP transport contract; its Smart-Grid-specific tools cover DGA Rogers Ratio analysis, transformer RUL forecasting, fault records, and work-order workflow. Originally developed in the [SmartGridBench source project](https://github.com/HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp) (Columbia University, 2026) and ported here so that SmartGridBench is a first-class AOB domain. Set `SG_DATA_DIR` to point at the processed CSV directory; scenarios live in `src/scenarios/local/smart_grid.json` (36 records covering single-domain probes and multi-step end-to-end workflows).
+
### MCP Environment
The `src/` directory contains MCP servers and a plan-execute runner built on the [Model Context Protocol](https://modelcontextprotocol.io/). See **[INSTRUCTIONS.md](./INSTRUCTIONS.md)** for setup, usage, and testing.
@@ -336,4 +339,3 @@ Thanks goes to these wonderful people ✨
---
-
diff --git a/docs/smart_grid_data_provenance.md b/docs/smart_grid_data_provenance.md
new file mode 100644
index 000000000..e58784fdd
--- /dev/null
+++ b/docs/smart_grid_data_provenance.md
@@ -0,0 +1,142 @@
+# Smart Grid Data Provenance
+
+*Created: 2026-05-01*
+
+## Overview
+
+The Smart Grid 7th-domain MCP servers in this package operate over **synthetic
+data only**. No proprietary or course-restricted data is shipped with the AOB
+codebase. Runtime data location is configured via the `SG_DATA_DIR`
+environment variable.
+
+The source project for this port is
+[`HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp`](https://github.com/HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp).
+
+## What `SG_DATA_DIR` is
+
+`SG_DATA_DIR` is an environment variable pointing at the directory containing
+the synthetic Smart Grid CSV datasets the servers read at runtime.
+
+**Default path:** `./data/sg_processed/` relative to the current working
+directory (wherever the server process is launched from).
+
+Resolution order in [`src/servers/smart_grid/base.py`](../src/servers/smart_grid/base.py):
+
+1. `SG_DATA_DIR` environment variable — absolute or cwd-relative path.
+2. `./data/sg_processed/` relative to cwd (fallback if the variable is unset).
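+
+A minimal shell sketch of both resolution paths (the `/srv/aob/sg_processed`
+path below is illustrative, not a required location):
+
+```bash
+# Path 1: explicit override. Point the servers at any processed-CSV directory.
+export SG_DATA_DIR=/srv/aob/sg_processed
+sg-iot-mcp-server
+
+# Path 2: fallback. Unset the variable and launch from a working directory
+# that contains ./data/sg_processed/.
+unset SG_DATA_DIR
+sg-iot-mcp-server
+```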
+ +The path is not required to exist at import time; existence is enforced on the +first data-loading call, which raises a clear `FileNotFoundError` with +remediation instructions if the path is missing. + +## What's in `SG_DATA_DIR` + +Six synthetic CSV files, one per logical data slice: + +| File | Server(s) | Description | +|---|---|---| +| `asset_metadata.csv` | IoT | Static nameplate data per transformer (`transformer_id`, `name`, `manufacturer`, `location`, `voltage_class`, `rating_kva`, `install_date`, `age_years`, `health_status`, `fdd_category`, `rul_days`, `in_service`) | +| `sensor_readings.csv` | IoT, TSFM | Time-series sensor readings (load current, winding temp, oil temp, voltage) | +| `failure_modes.csv` | FMSR | Failure mode catalogue with severity, IEC code, and recommended action | +| `dga_records.csv` | FMSR | Dissolved Gas Analysis (DGA) records per transformer, per sample date | +| `rul_labels.csv` | TSFM | Remaining-useful-life labels and health index per transformer | +| `fault_records.csv` | WO | Historical fault and maintenance event records | + +All values are synthetic. Gas concentrations in `dga_records.csv` are derived +from the source project's data pipeline; the IEC 60599:2022 Rogers Ratio +fault-table boundaries used for DGA classification are encoded in that +project's `data/knowledge/transformer_standards.json`. + +## No-CSV-port policy + +The source project's processed CSVs (under `data/processed/`) are **not** +copied into AssetOpsBench. Reasons: + +1. **Licensing** — three of the five Kaggle source datasets are CC0; two + (Transformer Health Index — ODbL, and Current & Voltage Monitoring — author + copyright) have redistribution restrictions and are treated as local-only in + the source pipeline. No processed outputs derived from restricted sources are + ported to AOB, which avoids a licensing audit for upstream reviewers. +2. **Reproducibility** — the synthetic data can be regenerated from + `data/generate_synthetic.py` in the source project. Any downstream user with + the generator and the IEC encoding can produce equivalent datasets without + needing the source project's processed CSVs. +3. **AOB cleanliness** — no course-internal preprocessing outputs in the package + simplifies upstream review scope. + +**For a reviewer or downstream user needing Smart Grid data:** + +```bash +git clone https://github.com/HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp.git +cd hpml-assetopsbench-smart-grid-mcp +pip install -r requirements.txt +python data/generate_synthetic.py # produces data/processed/*.csv +export SG_DATA_DIR=$(pwd)/data/processed +``` + +## Scenario schema and identifiers + +`src/scenarios/local/smart_grid.json` follows the AOB local scenario array +convention: each file is a JSON array and each record has an `id`, `type`, +`text`, `category`, and `characteristic_form`. + +Smart Grid records also carry evaluator-facing metadata: + +| Field | Purpose | +|---|---| +| `asset_id` | Transformer identifier used by the scenario, when applicable. | +| `difficulty` | Coarse difficulty label (`easy`, `medium`, or `hard`). | +| `domain_tags` | Smart Grid domains exercised by the prompt. | +| `expected_tools` | Intended tool path, using `iot.*`, `fmsr.*`, `tsfm.*`, and `wo.*` names. | +| `ground_truth` | Lightweight grading hints such as required concepts, thresholds, or intermediate values. | + +These extended fields are advisory metadata for evaluators and are safe for +scenario consumers to ignore if they only need the core AOB prompt fields. 
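+
+A trimmed record showing the shape (abridged from `SGT-005` in
+`src/scenarios/local/smart_grid.json`; the `...` elisions are ours):
+
+```json
+{
+  "id": "SGT-005",
+  "type": "TSFM",
+  "text": "Forecast remaining useful life for transformer T-016 ...",
+  "category": "Remaining Useful Life",
+  "characteristic_form": "The answer should report an RUL estimate in days ...",
+  "asset_id": "T-016",
+  "difficulty": "easy",
+  "domain_tags": ["TSFM"],
+  "expected_tools": ["tsfm.get_rul", "tsfm.forecast_rul"],
+  "ground_truth": {
+    "decision_window_days": 180,
+    "must_include": ["rul_days_estimate", "go/no-go recommendation"]
+  }
+}
+```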
+ +Identifier prefixes are intentional: + +- `AOB-FMSR-*` records are domain-level catalogue probes that do not depend on a + specific synthetic transformer. +- `SGT-*` records are transformer-grounded Smart Grid task scenarios. +- `SG-NEG-*` records are negative fixtures used to test validation behavior, not + main benchmark prompts. + +## Source datasets + +The source pipeline draws from five Kaggle datasets; licensing varies: + +| Dataset | License | Domain servers | +|---|---|---| +| Power Transformers FDD & RUL | CC0 | IoT, TSFM | +| DGA Fault Classification | CC0 | FMSR | +| Smart Grid Fault Records | CC0 | WO | +| Transformer Health Index | ODbL (redistribution restricted; local-only) | FMSR (supplemental) | +| Current & Voltage Monitoring | Author copyright (redistribution restricted; local-only) | IoT, TSFM (supplemental) | + +Dataset licensing details and row counts are documented in +`docs/hpml_datasets.pdf` in the source project. + +## IEC / IEEE standards encoding + +DGA-related ground truth (fault codes, condition tiers, gas thresholds) is +encoded in `data/knowledge/transformer_standards.json` in the source project. +That artifact reflects: + +- **IEC 60599:2022** (4th ed., publication 66491) — Rogers Ratio method, + fault-table boundaries, representative gas profiles +- **IEEE C57.104-2019** — condition framework (C1–C4) and gas threshold values + +The FMSR server's `analyze_dga` tool implements the Rogers Ratio method +using the fault-table boundaries from that artifact. Note: the AOB fork +server encodes the table directly in `src/servers/smart_grid/fmsr/main.py` +rather than reading the JSON at runtime. Downstream users regenerating DGA +records should verify that generated gas values round-trip correctly through +`analyze_dga` for their intended fault labels before using them as benchmark +ground truth. + +## Citation + +SmartGridBench: A Smart Grid transformer maintenance benchmark for MCP-enabled +LLM agents. Columbia University, 2026. +*Citation will be updated when the NeurIPS 2026 Datasets & Benchmarks +submission is finalized.* diff --git a/pyproject.toml b/pyproject.toml index 89c2ee43b..9836ee8f9 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -40,6 +40,10 @@ fmsr-mcp-server = "servers.fmsr.main:main" tsfm-mcp-server = "servers.tsfm.main:main" wo-mcp-server = "servers.wo.main:main" vibration-mcp-server = "servers.vibration.main:main" +sg-iot-mcp-server = "servers.smart_grid.iot.main:main" +sg-fmsr-mcp-server = "servers.smart_grid.fmsr.main:main" +sg-tsfm-mcp-server = "servers.smart_grid.tsfm.main:main" +sg-wo-mcp-server = "servers.smart_grid.wo.main:main" openai-agent = "agent.openai_agent.cli:main" deep-agent = "agent.deep_agent.cli:main" @@ -76,4 +80,3 @@ norecursedirs = ["src/tmp"] filterwarnings = [ "ignore:Core Pydantic V1 functionality:UserWarning", ] - diff --git a/src/scenarios/local/smart_grid.json b/src/scenarios/local/smart_grid.json new file mode 100644 index 000000000..40fa6270a --- /dev/null +++ b/src/scenarios/local/smart_grid.json @@ -0,0 +1,950 @@ +[ + { + "id": "AOB-FMSR-001", + "type": "FMSR", + "text": "List all known failure modes in the transformer dataset. 
For each, provide the severity level and recommended maintenance action.", + "category": "Failure Mode Catalog", + "characteristic_form": "The answer should enumerate all failure modes from the catalogue with each entry including at minimum: a name or identifier, a severity level (low/medium/high/critical), and a recommended maintenance action.", + "expected_tools": [ + "fmsr.list_failure_modes" + ], + "ground_truth": { + "must_include": [ + "failure mode", + "severity", + "recommended" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-003", + "type": "FMSR", + "text": "Use the latest DGA record for transformer T-012 to identify the most likely fault mode and explain the top two candidate failure mechanisms.", + "category": "Fault Diagnosis", + "characteristic_form": "The answer should map DGA evidence to a primary fault mode and provide at least one alternative candidate with rationale grounded in gas patterns or known fault mapping.", + "asset_id": "T-012", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga", + "fmsr.search_failure_modes" + ], + "ground_truth": { + "must_include": [ + "primary fault mode", + "secondary candidate", + "evidence-based justification" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-004", + "type": "FMSR", + "text": "Transformer T-018 has a sudden hydrogen increase relative to baseline. Determine the likely root cause class and recommend one confirmatory measurement.", + "category": "Root Cause Analysis", + "characteristic_form": "The answer should classify the likely failure/root-cause class from gas behavior and propose one concrete confirmatory measurement or inspection step.", + "asset_id": "T-018", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga", + "fmsr.search_failure_modes" + ], + "ground_truth": { + "must_include": [ + "root cause class", + "one confirmatory test" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-013", + "type": "FMSR", + "text": "Field technicians on T-003 reported intermittent partial discharge symptoms during their last inspection. Identify the failure mode that best matches this symptom and list the sensor readings most useful for confirming and tracking its progression.", + "category": "Symptom-Driven Failure Mode Lookup", + "characteristic_form": "The answer should identify a specific failure mode matching the symptom, cite the match rationale, and list the sensors most correlated with tracking that mode.", + "asset_id": "T-003", + "expected_tools": [ + "fmsr.search_failure_modes", + "fmsr.get_sensor_correlation" + ], + "ground_truth": { + "must_include": [ + "matched failure mode", + "rationale", + "correlated sensors" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-014", + "type": "FMSR", + "text": "T-014's most recent oil sample has returned elevated gas readings flagged during laboratory review. 
Diagnose the probable fault mode, identify the closest matching failure mechanism in the catalogue, and specify which sensors should be prioritized for ongoing monitoring.", + "category": "Full DGA Diagnostic Chain", + "characteristic_form": "The answer should state the DGA-derived fault mode, match it to a catalogue failure mechanism with rationale, and recommend a prioritized sensor monitoring list.", + "asset_id": "T-014", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga", + "fmsr.search_failure_modes", + "fmsr.get_sensor_correlation" + ], + "ground_truth": { + "must_include": [ + "fault mode from DGA", + "matched failure mechanism", + "sensors for ongoing monitoring" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-023", + "type": "FMSR", + "text": "Pull the dissolved gas analysis record on file for T-013 and explain what the gas profile suggests about the unit. Identify the dominant gases, then look up which failure mode in the catalogue best matches that gas signature.", + "category": "DGA Record Interpretation", + "characteristic_form": "The answer should report the stored DGA gas concentrations for T-013, identify which gases dominate the profile, and name the matching failure mode from the catalogue with its severity tier and recommended action.", + "asset_id": "T-013", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.search_failure_modes" + ], + "ground_truth": { + "must_include": [ + "DGA gas values", + "dominant gases", + "matching failure mode", + "severity tier", + "recommended action" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-024", + "type": "FMSR", + "text": "Operations is preparing a spare-parts and inspection plan focused on worst-case scenarios. From the failure mode catalogue, surface only the modes flagged at high or critical severity and pair each with its recommended maintenance action.", + "category": "Severity-Filtered Failure Mode Review", + "characteristic_form": "The answer should list every failure mode whose severity is either high or critical, give the descriptive label or IEC code for each, and pair it with the recommended maintenance action drawn from the catalogue.", + "expected_tools": [ + "fmsr.list_failure_modes" + ], + "ground_truth": { + "must_include": [ + "high severity modes", + "critical severity modes", + "recommended action per mode" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "FMSR" + ] + }, + { + "id": "SGT-031", + "type": "FMSR", + "text": "Transformer T-005 shows stable gas levels during routine monitoring. 
Confirm whether any fault is indicated and recommend the next monitoring step.", + "category": "Fault Diagnosis", + "characteristic_form": "A normal/no-fault confirmation with a routine-monitoring recommendation.", + "asset_id": "T-005", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga" + ], + "domain_tags": [ + "FMSR" + ], + "difficulty": "medium", + "ground_truth": { + "ideal_tool_sequence": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga" + ], + "decisive_intermediate_values": { + "iec_code": "N", + "final_sample_gases_ppm": { + "H2": 12.31, + "CH4": 8.03, + "C2H2": 0.17, + "C2H4": 2.07, + "C2H6": 0.71 + }, + "condition": "normal/background gas profile" + }, + "final_value": { + "primary_fault": "Normal/no active fault", + "recommended_action": "Continue routine monitoring; no immediate inspection beyond the scheduled interval" + }, + "acceptance_criteria": [ + "agent identifies no active fault or an N/normal DGA profile", + "agent recommends continued routine monitoring or equivalent non-urgent follow-up", + "agent uses the background gas profile to justify the non-fault diagnosis" + ], + "must_include": [ + "normal/no active fault", + "recommended monitoring action", + "background gas rationale" + ] + } + }, + { + "id": "SGT-001", + "type": "IoT", + "text": "List all available sensors for transformer T-011 and include each sensor's latest timestamped value.", + "category": "Asset Discovery", + "characteristic_form": "The answer should identify asset T-011, enumerate available sensor names, and provide latest value plus timestamp for each sensor using IoT retrieval tools.", + "asset_id": "T-011", + "expected_tools": [ + "iot.get_asset_metadata", + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "ground_truth": { + "must_include": [ + "T-011", + "sensor list", + "latest value", + "timestamp" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-002", + "type": "IoT", + "text": "For transformer T-014, inspect the last 24 hours of available voltage telemetry and report whether either voltage signal deviated more than 3% from its recent mean.", + "category": "Condition Monitoring", + "characteristic_form": "The answer should compute or reference phase-voltage imbalance over the requested window, state whether any event exceeded 3%, and include when the peak imbalance occurred.", + "asset_id": "T-014", + "expected_tools": [ + "iot.get_sensor_readings" + ], + "ground_truth": { + "threshold_percent": 3, + "must_include": [ + "max deviation", + "time of exceedance or no exceedance", + "clear yes/no decision" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-011", + "type": "IoT", + "text": "Identify all transformers in the fleet currently showing degraded or poor health status. For each flagged asset, retrieve its nameplate data and summarize which units require prioritized attention.", + "category": "Fleet Health Assessment", + "characteristic_form": "The answer should enumerate degraded assets by ID, include nameplate metadata for each, and rank or flag the units most in need of follow-up.", + "expected_tools": [ + "iot.list_assets", + "iot.get_asset_metadata" + ], + "ground_truth": { + "must_include": [ + "list of degraded asset IDs", + "nameplate data per asset", + "prioritization or ranking" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-012", + "type": "IoT", + "text": "T-005 has been logging elevated load current readings over the past week. 
Retrieve the recent sensor data and determine whether current readings are abnormally high relative to the recent baseline, and flag the peak recorded value and when it occurred.", + "category": "Load Current Monitoring", + "characteristic_form": "The answer should retrieve load current readings for T-005, compute or reference a recent baseline, state whether readings are elevated above that baseline, and report the peak value and timestamp.", + "asset_id": "T-005", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "ground_truth": { + "must_include": [ + "load current readings", + "baseline or mean reference", + "peak value and timestamp", + "elevated vs normal assessment" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-021", + "type": "IoT", + "text": "T-007 is one of the high-voltage transmission units in the fleet. List its asset metadata and the full set of sensors currently installed, and flag whether the available instrumentation covers both electrical and thermal monitoring needs.", + "category": "Sensor Coverage Audit", + "characteristic_form": "The answer should report the transformer's voltage class and rating, list all installed sensor IDs, and summarize whether thermal channels (e.g. winding/oil temperature) and electrical channels (e.g. load current, voltage) are both represented.", + "asset_id": "T-007", + "expected_tools": [ + "iot.get_asset_metadata", + "iot.list_sensors" + ], + "ground_truth": { + "must_include": [ + "voltage class", + "rating", + "sensor list", + "thermal sensor coverage", + "electrical sensor coverage" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-022", + "type": "IoT", + "text": "Provide a quick fleet-status snapshot. Enumerate the transformers currently in service, then for one representative healthy unit (T-005) and one degraded unit (T-013) surface the available sensor channels and a recent telemetry sample for each channel so a comparative read of operational state can be made.", + "category": "Fleet Status Snapshot", + "characteristic_form": "The answer should list the in-service transformer IDs, then for each of T-005 and T-013 surface the available sensor channels and a recent telemetry sample for each channel, and call out any qualitative difference between the two units.", + "expected_tools": [ + "iot.list_assets", + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "ground_truth": { + "must_include": [ + "in-service transformer list", + "T-005 sensor channels", + "T-005 sample readings", + "T-013 sensor channels", + "T-013 sample readings", + "comparative observation" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "IoT" + ] + }, + { + "id": "SGT-032", + "type": "IoT", + "text": "Transformer T-005 is operating at 98% capacity with no spare available. 
Retrieve the recent oil-temperature readings, report the recent baseline (mean) and peak value with timestamp, and state whether the peak is elevated relative to the baseline.", + "category": "Sensor Analysis", + "characteristic_form": "The answer should retrieve oil_temp_c readings for T-005 from iot.get_sensor_readings, compute or reference a recent baseline (mean), report the peak value and its timestamp, and assess whether the peak is elevated relative to baseline \u2014 all from actual tool output rather than a pre-stated absolute threshold.", + "asset_id": "T-005", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "domain_tags": [ + "IoT" + ], + "difficulty": "easy", + "ground_truth": { + "ideal_tool_sequence": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "decisive_intermediate_values": { + "sensor_id": "oil_temp_c" + }, + "must_include": [ + "oil_temp_c readings", + "baseline or mean reference", + "peak value and timestamp", + "elevated vs normal assessment" + ] + } + }, + { + "id": "SGT-033", + "type": "IoT", + "text": "Transformer T-005 is operating near peak load. Retrieve the recent winding-temperature readings, identify the peak value with timestamp, and determine whether it exceeds the documented 95\u00b0C operating limit (per docs/knowledge/scenario_generation_support.json).", + "category": "Sensor Analysis", + "characteristic_form": "The answer should retrieve winding_temp_top_c readings for T-005 from iot.get_sensor_readings, report the peak value with its timestamp, compare against the 95\u00b0C operating limit named in the prompt, and state whether the limit was breached.", + "asset_id": "T-005", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "domain_tags": [ + "IoT" + ], + "difficulty": "easy", + "ground_truth": { + "ideal_tool_sequence": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "decisive_intermediate_values": { + "sensor_id": "winding_temp_top_c", + "operating_limit_c": 95, + "operating_limit_source": "docs/knowledge/scenario_generation_support.json -> alarm_correlation_examples (winding_temp_sensor)" + }, + "must_include": [ + "winding_temp_top_c peak value and timestamp", + "comparison against 95\u00b0C operating limit", + "limit-breach status" + ] + } + }, + { + "id": "SGT-035", + "type": "IoT", + "text": "Transformer T-004 is operating near peak capacity. 
Retrieve the recent oil-temperature readings and assess the current operating range \u2014 report the minimum, mean, and maximum oil_temp_c observed in the recent window with timestamps for the extremes, and characterize the operating profile.", + "category": "Sensor Analysis", + "characteristic_form": "The answer should retrieve oil_temp_c readings for T-004 from iot.get_sensor_readings and report the min/mean/max with timestamps for the extremes, characterizing the recent operating range \u2014 all from actual tool output rather than a pre-stated absolute threshold.", + "asset_id": "T-004", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "domain_tags": [ + "IoT" + ], + "difficulty": "easy", + "ground_truth": { + "ideal_tool_sequence": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "decisive_intermediate_values": { + "sensor_id": "oil_temp_c" + }, + "must_include": [ + "oil_temp_c min, mean, and max values with units", + "timestamps for the min and max readings", + "operating-range characterization" + ] + } + }, + { + "id": "SGT-009", + "type": "Multi", + "text": "Transformer T-015 shows rising load and intermittent over-temperature alerts. Investigate recent sensor behavior, infer probable fault mode, estimate short-term risk over 30 days, and issue a maintenance work order recommendation.", + "category": "End-to-End Incident Response", + "characteristic_form": "The answer should show a multi-step workflow: summarize relevant IoT evidence, map to likely fault mode, provide a 30-day risk or forecast statement, and conclude with a concrete work-order recommendation.", + "asset_id": "T-015", + "expected_tools": [ + "iot.get_sensor_readings", + "fmsr.analyze_dga", + "tsfm.forecast_rul", + "wo.create_work_order" + ], + "ground_truth": { + "must_include": [ + "sensor evidence", + "fault hypothesis", + "30-day risk outlook", + "work-order recommendation" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "IoT", + "FMSR", + "TSFM", + "WO" + ] + }, + { + "id": "SGT-010", + "type": "Multi", + "text": "For transformer T-020, combine DGA diagnostics with thermal anomaly trends to decide whether to schedule immediate outage maintenance or defer with heightened monitoring.", + "category": "Cross-Domain Planning", + "characteristic_form": "The answer should integrate DGA-based diagnosis and trend/anomaly evidence, make a binary maintenance decision (immediate outage vs deferred monitoring), and provide an actionable work-order or monitoring plan.", + "asset_id": "T-020", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga", + "tsfm.detect_anomalies", + "wo.create_work_order" + ], + "ground_truth": { + "must_include": [ + "diagnostic evidence", + "trend evidence", + "final decision", + "action plan" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "FMSR", + "TSFM", + "WO" + ] + }, + { + "id": "SGT-016", + "type": "Multi", + "text": "T-001 is a critical substation transformer with no spare unit available. 
Assess its remaining useful life, identify which sensors are available and check them for anomalies over the past 30 days, and determine whether the current condition supports continued operation through the next planned 90-day maintenance window.", + "category": "Comprehensive Health Assessment", + "characteristic_form": "The answer should retrieve available sensors, report an RUL estimate, identify any anomalies in the past 30 days, and give a clear go/no-go recommendation for the 90-day operating window, noting the no-spare constraint.", + "asset_id": "T-001", + "expected_tools": [ + "iot.list_sensors", + "tsfm.get_rul", + "tsfm.forecast_rul", + "tsfm.detect_anomalies", + "tsfm.trend_analysis" + ], + "ground_truth": { + "must_include": [ + "sensor list retrieved", + "RUL estimate", + "anomaly findings", + "90-day go/no-go recommendation", + "no-spare constraint acknowledged" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "IoT", + "TSFM" + ] + }, + { + "id": "SGT-019", + "type": "Multi", + "text": "T-004 has been carrying near-peak load for several weeks. Identify the available sensors, review recent telemetry and any thermal trends, and assess whether the current condition poses a risk to sustained operation over the next 90 days.", + "category": "Thermal Risk Assessment", + "characteristic_form": "The answer should list available sensors for T-004, summarize relevant IoT sensor readings, report the thermal trend direction and rate, flag any anomalies, and give a 90-day operational risk assessment.", + "asset_id": "T-004", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings", + "tsfm.trend_analysis", + "tsfm.detect_anomalies", + "tsfm.forecast_rul" + ], + "ground_truth": { + "must_include": [ + "sensor list retrieved", + "sensor telemetry summary", + "thermal trend finding", + "anomaly findings", + "90-day operational risk assessment" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "IoT", + "TSFM" + ] + }, + { + "id": "SGT-020", + "type": "Multi", + "text": "T-016 tripped on differential protection overnight. Review its DGA data and fault record history to determine what the available evidence supports, then issue a corrective work order with a priority level appropriate to the overall situation.", + "category": "Protection Trip Response", + "characteristic_form": "The answer should retrieve and analyze DGA data (noting whether the result is diagnostic or inconclusive), reference prior fault history, and produce a corrective work order with a priority level justified by the combined evidence from the protection trip event, fault history, and component health.", + "asset_id": "T-016", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.analyze_dga", + "wo.list_fault_records", + "wo.create_work_order" + ], + "ground_truth": { + "must_include": [ + "DGA retrieved and result noted", + "fault history cited", + "corrective work order issued", + "WO priority justified by protection trip and fault history" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "FMSR", + "WO" + ] + }, + { + "id": "SGT-025", + "type": "Multi", + "text": "Operators flagged a short window of unusual top-of-winding temperature readings on T-014 over the last few days. 
First identify the sensor channels installed on this asset, then run an anomaly check and trend analysis on the winding temperature channel (winding_temp_top_c) to determine whether the deviations form a single burst or a sustained shift.", + "category": "Anomaly Burst vs Trend Disambiguation", + "characteristic_form": "The answer should first surface the available sensor channels for T-014, then for the winding_temp_top_c channel report the anomaly findings and recent trend direction, and conclude whether the disturbance is an isolated burst or a sustained drift in the underlying signal.", + "asset_id": "T-014", + "expected_tools": [ + "iot.list_sensors", + "tsfm.detect_anomalies", + "tsfm.trend_analysis" + ], + "ground_truth": { + "must_include": [ + "sensor channels available", + "winding_temp_top_c anomaly findings", + "winding_temp_top_c trend direction", + "burst vs sustained classification" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "IoT", + "TSFM" + ] + }, + { + "id": "SGT-029", + "type": "Multi", + "text": "T-019 has just had a fresh oil sample analysed and the DGA record is now on file. Review that gas profile, identify the most likely failure mode it points to, and convert that finding into a corrective inspection work order whose priority reflects the implied severity.", + "category": "DGA Finding to Inspection Work Order", + "characteristic_form": "The answer should report the stored DGA values for T-019, name the matching failure mode and its severity tier, and create a work order whose priority and scope are explicitly justified by the DGA-derived severity.", + "asset_id": "T-019", + "expected_tools": [ + "fmsr.get_dga_record", + "fmsr.search_failure_modes", + "wo.create_work_order" + ], + "ground_truth": { + "must_include": [ + "DGA gas values", + "matching failure mode", + "severity tier", + "work order created", + "priority justification" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "FMSR", + "WO" + ] + }, + { + "id": "SGT-030", + "type": "Multi", + "text": "End-of-quarter fleet review has identified T-020 (critical-tier) as a candidate for accelerated maintenance. Pull the RUL forecast over a 90-day horizon, then create a work order whose scope and priority are explicitly justified by both the current RUL and the projected RUL at the end of that horizon.", + "category": "RUL Forecast-Driven Work Order", + "characteristic_form": "The answer should report the RUL forecast for T-020 \u2014 including the current RUL and the projected RUL at the 90-day horizon \u2014 and create a work order whose priority and scope are explicitly justified by the magnitude of that change (e.g. 
emergency vs short-term corrective).", + "asset_id": "T-020", + "expected_tools": [ + "tsfm.forecast_rul", + "wo.create_work_order" + ], + "ground_truth": { + "must_include": [ + "current RUL value", + "projected RUL at 90-day horizon", + "work order created", + "priority justification" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "TSFM", + "WO" + ] + }, + { + "id": "SGT-005", + "type": "TSFM", + "text": "Forecast remaining useful life for transformer T-016 and state whether it can safely operate for the next 180 days without major maintenance.", + "category": "Remaining Useful Life", + "characteristic_form": "The answer should report an RUL estimate in days, compare it against the 180-day horizon, and provide a clear maintenance recommendation.", + "asset_id": "T-016", + "expected_tools": [ + "tsfm.get_rul", + "tsfm.forecast_rul" + ], + "ground_truth": { + "decision_window_days": 180, + "must_include": [ + "rul_days_estimate", + "go/no-go recommendation" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "TSFM" + ] + }, + { + "id": "SGT-006", + "type": "TSFM", + "text": "Run anomaly detection on transformer T-019 winding temperature for the past 7 days and summarize any persistent abnormal periods.", + "category": "Anomaly Detection", + "characteristic_form": "The answer should identify abnormal intervals (if any), quantify severity or count of anomalies, and provide a concise interpretation of persistence.", + "asset_id": "T-019", + "expected_tools": [ + "tsfm.detect_anomalies" + ], + "ground_truth": { + "must_include": [ + "anomaly window(s)", + "severity or count", + "summary interpretation" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "TSFM" + ] + }, + { + "id": "SGT-015", + "type": "TSFM", + "text": "T-002's winding temperature sensor (winding_temp_top_c) has shown irregular readings over the past month. Analyze the recent trend and report whether readings are increasing, decreasing, or stable, along with the rate of change and how well the trend fits the data.", + "category": "Sensor Trend Analysis", + "characteristic_form": "The answer should report the trend direction (increasing, decreasing, or stable), the slope rate per day, and the R-squared fit quality for T-002's winding_temp_top_c sensor.", + "asset_id": "T-002", + "expected_tools": [ + "tsfm.trend_analysis" + ], + "ground_truth": { + "must_include": [ + "trend direction", + "slope rate", + "R-squared or fit quality" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "TSFM" + ] + }, + { + "id": "SGT-026", + "type": "TSFM", + "text": "Capital planning needs a defensible RUL view for T-011 ahead of the next budget cycle. 
Pull the current RUL estimate and a 180-day forecast, compare the two, and state whether the comparison supports deferring replacement or argues for advancing it.", + "category": "RUL Current vs Forecast Comparison", + "characteristic_form": "The answer should report the current RUL estimate for T-011, the forecasted RUL at the 180-day horizon, and a recommendation on whether the comparison supports deferring or advancing replacement, citing the magnitude of any change.", + "asset_id": "T-011", + "expected_tools": [ + "tsfm.get_rul", + "tsfm.forecast_rul" + ], + "ground_truth": { + "must_include": [ + "current RUL value", + "forecast RUL value", + "comparison", + "deferral or advance recommendation" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "TSFM" + ] + }, + { + "id": "SGT-007", + "type": "WO", + "text": "Create a preventive inspection work order for transformer T-013 for the next maintenance cycle, including priority, required skills, and estimated downtime.", + "category": "Work Order Generation", + "characteristic_form": "The answer should produce a structured work order containing asset ID, task summary, priority, required technician skill set, and estimated downtime.", + "asset_id": "T-013", + "expected_tools": [ + "wo.create_work_order", + "wo.estimate_downtime" + ], + "ground_truth": { + "must_include": [ + "work order payload", + "priority", + "required skills", + "downtime estimate" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-008", + "type": "WO", + "text": "Given repeated fault indicators on transformer T-017, generate a corrective work order with emergency priority if safety risk is high; otherwise set high priority and justify.", + "category": "Maintenance Decision", + "characteristic_form": "The answer should include a risk-aware priority decision (emergency or high), rationale tied to fault indicators, and a complete corrective work order description.", + "asset_id": "T-017", + "expected_tools": [ + "wo.create_work_order", + "wo.estimate_downtime" + ], + "ground_truth": { + "must_include": [ + "priority decision", + "justification", + "corrective tasks" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-017", + "type": "WO", + "text": "T-009 is approaching its next scheduled maintenance cycle. Review its recent fault history and any open work orders to determine whether anything requires priority escalation before the maintenance window begins.", + "category": "Pre-Maintenance Review", + "characteristic_form": "The answer should list recent fault records for T-009, summarize open work orders, and state whether any item requires escalation with a brief justification.", + "asset_id": "T-009", + "expected_tools": [ + "wo.list_fault_records", + "wo.list_work_orders" + ], + "ground_truth": { + "must_include": [ + "fault record summary", + "open work order summary", + "escalation decision with justification" + ] + }, + "difficulty": "easy", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-018", + "type": "WO", + "text": "A fault event was recently logged on T-006. 
Retrieve the fault details, estimate the expected repair downtime, open a corrective work order, and then mark it as in-progress with the estimated duration recorded.", + "category": "Work Order Lifecycle", + "characteristic_form": "The answer should retrieve the fault record, estimate downtime, create a corrective work order, and update its status to in-progress with the downtime noted.", + "asset_id": "T-006", + "expected_tools": [ + "wo.list_fault_records", + "wo.get_fault_record", + "wo.estimate_downtime", + "wo.create_work_order", + "wo.update_work_order" + ], + "ground_truth": { + "must_include": [ + "fault record details", + "downtime estimate", + "corrective work order created", + "work order updated to in-progress" + ] + }, + "difficulty": "hard", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-027", + "type": "WO", + "text": "An inspection on T-009 has just been completed in the field with no abnormal condition observed. Open a work order documenting that the inspection took place, then close it out with a note recording the clean inspection result.", + "category": "Work Order Lifecycle (Open and Close)", + "characteristic_form": "The answer should report the newly created work order identifier and its initial status, then confirm a follow-up update that transitions the work order to a closed state with a closure note referencing the clean inspection finding.", + "asset_id": "T-009", + "expected_tools": [ + "wo.create_work_order", + "wo.update_work_order" + ], + "ground_truth": { + "must_include": [ + "work order identifier", + "initial open status", + "status updated to closed", + "closure note recorded" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-028", + "type": "WO", + "text": "Planning needs a downtime estimate for T-018 ahead of an upcoming high-severity corrective outage. Review the recent fault history to characterize the typical event class for this unit, then produce a downtime estimate at the high severity tier with assumptions justified by that history.", + "category": "Downtime Estimation for Planned Outage", + "characteristic_form": "The answer should summarize the recent fault record entries for T-018, identify the dominant fault class or repair pattern, and report a downtime estimate in hours at the high severity tier along with the assumptions that justify it.", + "asset_id": "T-018", + "expected_tools": [ + "wo.list_fault_records", + "wo.estimate_downtime" + ], + "ground_truth": { + "must_include": [ + "fault history summary", + "dominant fault class", + "downtime estimate hours", + "high severity tier", + "estimation assumptions" + ] + }, + "difficulty": "medium", + "domain_tags": [ + "WO" + ] + }, + { + "id": "SGT-034", + "type": "WO", + "text": "T-003 substation main transformer temperature has exceeded threshold (severity: high) with spare procurement pending. 
Decide on maintenance and create a corrective work order, using historical fault context for the downtime estimate.", + "category": "Work Order Creation", + "characteristic_form": "A decision on work order creation with a derived priority and minimum playbook fields, using fault history to estimate downtime under a high-severity precondition.", + "asset_id": "T-003", + "expected_tools": [ + "wo.list_fault_records", + "wo.estimate_downtime", + "wo.create_work_order" + ], + "domain_tags": [ + "WO" + ], + "difficulty": "medium", + "ground_truth": { + "ideal_tool_sequence": [ + "wo.list_fault_records", + "wo.estimate_downtime", + "wo.create_work_order" + ], + "decisive_intermediate_values": { + "fault_type": "thermal", + "spare_status": "on_order", + "severity": "high" + }, + "final_value": { + "work_order_type": "corrective", + "priority": "high", + "minimum_fields_from_playbook": true + }, + "acceptance_criteria": [ + "agent looks up T-003's historical fault records before estimating downtime", + "agent passes the prompt-stated severity ('high') into wo.estimate_downtime rather than guessing", + "agent creates a corrective work order at high priority with the minimum playbook fields" + ], + "must_include": [ + "work_order_type", + "priority", + "minimum_fields_from_playbook" + ] + } + } +] diff --git a/src/scenarios/local/smart_grid_negative_checks.json b/src/scenarios/local/smart_grid_negative_checks.json new file mode 100644 index 000000000..21e1150a3 --- /dev/null +++ b/src/scenarios/local/smart_grid_negative_checks.json @@ -0,0 +1,84 @@ +[ + { + "id": "SG-NEG-003", + "type": "IoT", + "domain_tags": [ + "IoT" + ], + "category": "single-hop", + "text": "Invalid fixture: single-domain IoT scenario must not request an FMSR tool.", + "characteristic_form": "This fixture should fail because expected_tools includes a cross-domain tool.", + "asset_id": "T-001", + "expected_tools": [ + "iot.list_sensors", + "fmsr.analyze_dga" + ], + "success_criteria": [ + "validator rejects cross-domain tool usage" + ] + }, + { + "id": "SG-NEG-001", + "type": "Multi", + "text": "Negative fixture: declared multi-domain scenario with mismatched tool coverage.", + "category": "Negative Validation Check", + "characteristic_form": "Should fail validator because domain tags and expected tools do not agree.", + "asset_id": "T-001", + "expected_tools": [ + "iot.get_sensor_readings" + ], + "domain_tags": [ + "IoT", + "TSFM" + ] + }, + { + "id": "SG-NEG-005", + "type": "Multi", + "domain_tags": [ + "IoT" + ], + "category": "multi-hop", + "text": "Invalid fixture: Multi scenario must cover at least two domains.", + "characteristic_form": "This fixture should fail because Multi scenarios need multiple domain tags.", + "asset_id": "T-001", + "expected_tools": [ + "iot.list_sensors", + "iot.get_sensor_readings" + ], + "success_criteria": [ + "validator rejects Multi scenarios with only one domain tag" + ] + }, + { + "id": "SG-NEG-004", + "type": "FMSR", + "domain_tags": [ + "IoT" + ], + "category": "single-hop", + "text": "Invalid fixture: FMSR scenario must use matching domain_tags.", + "characteristic_form": "This fixture should fail because type and domain_tags disagree.", + "asset_id": "T-001", + "expected_tools": [ + "fmsr.get_dga_record" + ], + "success_criteria": [ + "validator rejects mismatched type and domain_tags" + ] + }, + { + "id": "SG-NEG-002", + "type": "WO", + "text": "Negative fixture: references a nonexistent WO tool.", + "category": "Negative Validation Check", + "characteristic_form": "Should fail 
validator because expected_tools contains a non-canonical tool name.", + "asset_id": "T-001", + "expected_tools": [ + "wo.generate_work_order" + ], + "domain_tags": [ + "WO" + ] + } +] diff --git a/src/servers/smart_grid/__init__.py b/src/servers/smart_grid/__init__.py new file mode 100644 index 000000000..c62a03476 --- /dev/null +++ b/src/servers/smart_grid/__init__.py @@ -0,0 +1,18 @@ +"""Smart Grid 7th-domain MCP servers. + +This namespace hosts four MCP servers that surface a Smart Grid transformer +operations dataset to AOB agents: + +- :mod:`servers.smart_grid.iot` — sensor telemetry and asset metadata +- :mod:`servers.smart_grid.fmsr` — failure-mode-to-sensor relations + DGA analysis +- :mod:`servers.smart_grid.tsfm` — time-series forecasting (RUL + anomaly detection) +- :mod:`servers.smart_grid.wo` — work-order creation and management + +The shared CSV-loading helpers live in :mod:`servers.smart_grid.base`. Set +``SG_DATA_DIR`` in the environment to point at the directory holding the +processed CSVs (defaults to ``data/sg_processed/`` relative to cwd). + +Originally developed in the SmartGridBench source project +(``HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp``) and extracted into +AOB to make Smart Grid Bench a first-class AOB domain. +""" diff --git a/src/servers/smart_grid/base.py b/src/servers/smart_grid/base.py new file mode 100644 index 000000000..467bb0ccd --- /dev/null +++ b/src/servers/smart_grid/base.py @@ -0,0 +1,197 @@ +"""Shared data-loading helpers for the Smart Grid MCP servers. + +All four Smart Grid servers (iot, fmsr, tsfm, wo) read processed CSVs from a +common directory. The directory is configurable via the ``SG_DATA_DIR`` +environment variable, with a default of ``./data/sg_processed/`` relative to +the current working directory. + +Dataset → server mapping: + Power Transformers FDD & RUL → IoT, TSFM + DGA Fault Classification → FMSR + Smart Grid Fault Records → WO + Transformer Health Index → FMSR (supplemental) + Current & Voltage Monitoring → IoT, TSFM (supplemental) + +The expected CSV layout is documented in each loader's docstring below. The +source-project data pipeline that produces these CSVs lives under ``data/`` in +``HPML6998-S26-Team13/hpml-assetopsbench-smart-grid-mcp``. +""" + +from __future__ import annotations + +import os +from pathlib import Path + +import pandas as pd + + +def _resolve_data_dir() -> Path: + """Return the configured Smart Grid data directory. + + Resolution order: + 1. ``SG_DATA_DIR`` environment variable (absolute or cwd-relative). + 2. ``./data/sg_processed/`` relative to current working directory. + + The path is *not* required to exist at import time. Existence is enforced + on first ``load_*`` call by :func:`_require`. + """ + env_path = os.environ.get("SG_DATA_DIR") + if env_path: + return Path(env_path).expanduser().resolve() + return Path.cwd() / "data" / "sg_processed" + + +def _data_dir() -> Path: + """Resolve ``SG_DATA_DIR`` lazily so env-var changes mid-process take effect.""" + return _resolve_data_dir() + + +# --------------------------------------------------------------------------- +# IoT domain +# --------------------------------------------------------------------------- + + +def load_asset_metadata() -> pd.DataFrame: + """Load static asset metadata. + + Source CSV: ``$SG_DATA_DIR/asset_metadata.csv`` + Synthesized from: Power Transformers FDD & RUL dataset. 
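+
+    Expected columns (mirroring the ``asset_metadata.csv`` row in
+    ``docs/smart_grid_data_provenance.md``):
+        transformer_id, name, manufacturer, location, voltage_class,
+        rating_kva, install_date, age_years, health_status, fdd_category,
+        rul_days, in_service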
+ """ + path = _data_dir() / "asset_metadata.csv" + _require(path) + return pd.read_csv(path) + + +def load_sensor_readings() -> pd.DataFrame: + """Load time-series sensor readings indexed by (transformer_id, timestamp). + + Source CSV: ``$SG_DATA_DIR/sensor_readings.csv`` + Synthesized from: Power Transformers FDD & RUL + Current & Voltage + Monitoring datasets. + + Expected columns: + transformer_id, timestamp, sensor_id, value, unit, source + """ + path = _data_dir() / "sensor_readings.csv" + _require(path) + return pd.read_csv(path, parse_dates=["timestamp"]) + + +# --------------------------------------------------------------------------- +# FMSR domain +# --------------------------------------------------------------------------- + + +def load_failure_modes() -> pd.DataFrame: + """Load failure mode descriptions and their associated sensor signatures. + + Source CSV: ``$SG_DATA_DIR/failure_modes.csv`` + Synthesized from: DGA Fault Classification + Transformer Health Index. + + Expected columns: + failure_mode_id, name, dga_label, description, severity, iec_code, + key_gases, recommended_action + """ + path = _data_dir() / "failure_modes.csv" + _require(path) + return pd.read_csv(path) + + +def load_dga_records() -> pd.DataFrame: + """Load dissolved gas analysis (DGA) records used for fault classification. + + Source CSV: ``$SG_DATA_DIR/dga_records.csv`` + Synthesized from: DGA Fault Classification dataset. + + Expected columns: + transformer_id, sample_date, dissolved_h2_ppm, dissolved_ch4_ppm, + dissolved_c2h2_ppm, dissolved_c2h4_ppm, dissolved_c2h6_ppm, + dissolved_co_ppm, dissolved_co2_ppm, fault_label, source_dataset + """ + path = _data_dir() / "dga_records.csv" + _require(path) + return pd.read_csv(path, parse_dates=["sample_date"]) + + +# --------------------------------------------------------------------------- +# TSFM domain +# --------------------------------------------------------------------------- + + +def load_rul_labels() -> pd.DataFrame: + """Load remaining-useful-life (RUL) ground-truth labels per transformer. + + Source CSV: ``$SG_DATA_DIR/rul_labels.csv`` + Synthesized from: Power Transformers FDD & RUL dataset. + + Expected columns: + transformer_id, timestamp, rul_days, health_index, fdd_category + """ + path = _data_dir() / "rul_labels.csv" + _require(path) + return pd.read_csv(path, parse_dates=["timestamp"]) + + +# --------------------------------------------------------------------------- +# WO domain +# --------------------------------------------------------------------------- + + +def load_fault_records() -> pd.DataFrame: + """Load historical fault / maintenance event records. + + Source CSV: ``$SG_DATA_DIR/fault_records.csv`` + Synthesized from: Smart Grid Fault Records dataset. + + Expected columns: + transformer_id, fault_id, fault_type, location, voltage_v, current_a, + power_load_mw, temperature_c, wind_speed_kmh, weather_condition, + maintenance_status, component_health, duration_hrs, downtime_hrs + """ + path = _data_dir() / "fault_records.csv" + _require(path) + return pd.read_csv(path) + + +# --------------------------------------------------------------------------- +# Boundary helpers +# --------------------------------------------------------------------------- + + +def json_safe_record(record: dict) -> dict: + """Normalize a row dict so it serializes under ``json.dumps(allow_nan=False)``. + + Smart Grid tools commonly return a single CSV row via ``df.iloc[0].to_dict()`` + or a list of records via ``df.to_dict(orient="records")``. 
Both paths can
+    leak ``pandas.Timestamp`` (from ``parse_dates``) and NaN floats through to
+    the MCP JSON-RPC response, where the strict serializer rejects them.
+
+    This helper is the canonical boundary normalizer: NaN-likes (including
+    ``pd.NaT``) become ``None``, ``pandas.Timestamp`` becomes its ISO date
+    string, and all other values pass through unchanged.
+    """
+    normalized: dict = {}
+    for key, value in record.items():
+        if pd.isna(value):
+            normalized[key] = None
+        elif isinstance(value, pd.Timestamp):
+            normalized[key] = value.date().isoformat()
+        else:
+            normalized[key] = value
+    return normalized
+
+
+# ---------------------------------------------------------------------------
+# Internal helpers
+# ---------------------------------------------------------------------------
+
+
+def _require(path: Path) -> None:
+    """Raise a clear error if a processed data file isn't present."""
+    if not path.exists():
+        raise FileNotFoundError(
+            f"Smart Grid processed data file not found: {path}\n"
+            "Set SG_DATA_DIR to the directory holding the processed CSVs, "
+            "or run the team-side data pipeline (HPML Smart Grid MCP repo "
+            "data/ tree) and copy the outputs to the configured location."
+        )
diff --git a/src/servers/smart_grid/direct_adapter.py b/src/servers/smart_grid/direct_adapter.py
new file mode 100644
index 000000000..d9f1e9072
--- /dev/null
+++ b/src/servers/smart_grid/direct_adapter.py
@@ -0,0 +1,221 @@
+"""In-process direct-call adapter for the Smart Grid MCP servers.
+
+Purpose
+-------
+SmartGridBench Experiment 1 measures the latency overhead of the MCP
+JSON-RPC transport layer by running the same ReAct agent through three cells
+that share the same tool set:
+
+- **Cell A (Direct)** — ReAct calls the plain Python function in-process. Zero
+  transport overhead. This module is that entry point.
+- **Cell B (MCP baseline)** — ReAct speaks MCP JSON-RPC over stdio to the
+  server processes. Transport cost = (B − A).
+- **Cell C (MCP optimized)** — same as B with batching / connection reuse.
+
+Using the same underlying functions across all three cells keeps the
+comparison honest: any delta is transport, not algorithmic.
+
+What this module does NOT do
+----------------------------
+- No ReAct loop. That's the Cell A runner's job.
+- No serialization layer. Arguments go in as Python types, results come out
+  as Python types.
+- No schema validation. The MCP path already validates via :class:`FastMCP`.
+- No logging. Latency instrumentation lives in the capture wrappers used by
+  the runner so Cells A / B / C share it.
+
+Layout
+------
+:func:`get_tools` — ordered list of :class:`ToolSpec` entries, one per tool.
+:func:`get_tool` — ``ToolSpec`` lookup by qualified name like
+  ``iot.get_sensor_readings``.
+:func:`list_tool_specs_for_llm` — compact JSON-schema-ish list suitable for
+  prompting an LLM (name, description, parameters).
+
+The canonical tool set matches the ``@mcp.tool()``-decorated functions defined
+in each of the four server modules :mod:`servers.smart_grid.{iot,fmsr,tsfm,wo}.main`.
+"""
+
+from __future__ import annotations
+
+import dataclasses
+import inspect
+import typing
+from collections.abc import Callable
+from typing import Any
+
+
+@dataclasses.dataclass(frozen=True)
+class ToolSpec:
+    """A single tool in the direct-call registry."""
+
+    name: str  # qualified name, e.g.
"iot.get_sensor_readings" + domain: str # "iot" | "fmsr" | "tsfm" | "wo" + bare_name: str # "get_sensor_readings" + fn: Callable[..., Any] # the underlying Python function + doc: str # description extracted from the function's docstring + + def __call__(self, *args, **kwargs): + return self.fn(*args, **kwargs) + + def parameters(self) -> dict[str, dict[str, Any]]: + """Return a minimal JSON-schema-ish parameter spec for LLM prompting.""" + sig = inspect.signature(self.fn) + params: dict[str, dict[str, Any]] = {} + for pname, p in sig.parameters.items(): + if p.kind in ( + inspect.Parameter.VAR_POSITIONAL, + inspect.Parameter.VAR_KEYWORD, + ): + continue + entry: dict[str, Any] = {} + if p.annotation is not inspect.Parameter.empty: + entry["type"] = _type_to_json_name(p.annotation) + if p.default is not inspect.Parameter.empty: + entry["default"] = _safe_json_value(p.default) + else: + entry["required"] = True + params[pname] = entry + return params + + +def _type_to_json_name(tp: Any) -> str: + """Best-effort conversion from Python type hints to a JSON schema type.""" + origin = typing.get_origin(tp) + if origin is typing.Union: + args = [a for a in typing.get_args(tp) if a is not type(None)] + if len(args) == 1: + return _type_to_json_name(args[0]) + return "any" + if tp is int: + return "integer" + if tp is float: + return "number" + if tp is bool: + return "boolean" + if tp is str: + return "string" + if tp in (list, dict): + return tp.__name__ + if origin in (list, dict): + return origin.__name__ + return "any" + + +def _safe_json_value(value: Any) -> Any: + if value is None or isinstance(value, (str, int, float, bool)): + return value + return str(value) + + +def _extract_doc(fn: Callable[..., Any]) -> str: + doc = inspect.getdoc(fn) or "" + # First paragraph only — keeps the prompt tight. + for chunk in doc.split("\n\n"): + chunk = chunk.strip() + if chunk: + return chunk + return "" + + +def _build_registry() -> tuple[list[ToolSpec], dict[str, ToolSpec]]: + """Import each Smart Grid server module and collect its + ``@mcp.tool()``-decorated functions into the ToolSpec registry. + + FastMCP's ``@mcp.tool()`` decorator preserves the underlying Python + function — module-level names remain callable in-process — so we just + import them and call them directly. The ``mcp`` package itself still + needs to be importable (for ``FastMCP()`` at module load); callers that + run outside the serving venv should skip this module. + """ + from servers.smart_grid.fmsr import main as fmsr + from servers.smart_grid.iot import main as iot + from servers.smart_grid.tsfm import main as tsfm + from servers.smart_grid.wo import main as wo + + # Canonical tool set per domain, in the same order as the server files. + # Keep this list in sync when servers add/remove tools. 
+ catalog: list[tuple[str, str, Callable[..., Any]]] = [ + ("iot", "list_assets", iot.list_assets), + ("iot", "get_asset_metadata", iot.get_asset_metadata), + ("iot", "list_sensors", iot.list_sensors), + ("iot", "get_sensor_readings", iot.get_sensor_readings), + ("fmsr", "list_failure_modes", fmsr.list_failure_modes), + ("fmsr", "search_failure_modes", fmsr.search_failure_modes), + ("fmsr", "get_sensor_correlation", fmsr.get_sensor_correlation), + ("fmsr", "get_dga_record", fmsr.get_dga_record), + ("fmsr", "analyze_dga", fmsr.analyze_dga), + ("tsfm", "get_rul", tsfm.get_rul), + ("tsfm", "forecast_rul", tsfm.forecast_rul), + ("tsfm", "detect_anomalies", tsfm.detect_anomalies), + ("tsfm", "trend_analysis", tsfm.trend_analysis), + ("wo", "list_fault_records", wo.list_fault_records), + ("wo", "get_fault_record", wo.get_fault_record), + ("wo", "create_work_order", wo.create_work_order), + ("wo", "list_work_orders", wo.list_work_orders), + ("wo", "update_work_order", wo.update_work_order), + ("wo", "estimate_downtime", wo.estimate_downtime), + ] + + specs: list[ToolSpec] = [] + for domain, bare, fn in catalog: + if not callable(fn): + raise RuntimeError( + f"Expected {domain}.{bare} to be callable after FastMCP " + f"decoration; got {type(fn).__name__}. FastMCP may have " + f"changed its decorator contract; adjust _build_registry." + ) + specs.append( + ToolSpec( + name=f"{domain}.{bare}", + domain=domain, + bare_name=bare, + fn=fn, + doc=_extract_doc(fn), + ) + ) + index = {s.name: s for s in specs} + return specs, index + + +# Lazily initialized so that ``import servers.smart_grid.direct_adapter`` is +# cheap for callers that only want the types. Build on first use. +_TOOLS: list[ToolSpec] | None = None +_TOOLS_BY_NAME: dict[str, ToolSpec] | None = None + + +def _ensure_registry() -> None: + global _TOOLS, _TOOLS_BY_NAME + if _TOOLS is None: + _TOOLS, _TOOLS_BY_NAME = _build_registry() + + +def get_tools() -> list[ToolSpec]: + """Return the full ordered list of ToolSpec entries.""" + _ensure_registry() + assert _TOOLS is not None + return list(_TOOLS) + + +def get_tool(name: str) -> ToolSpec: + """Lookup a ToolSpec by qualified name, e.g. ``iot.get_sensor_readings``.""" + _ensure_registry() + assert _TOOLS_BY_NAME is not None + if name not in _TOOLS_BY_NAME: + raise KeyError(f"unknown tool {name!r}; available: {sorted(_TOOLS_BY_NAME)}") + return _TOOLS_BY_NAME[name] + + +def list_tool_specs_for_llm() -> list[dict[str, Any]]: + """Return a compact, JSON-serializable list of tool descriptors suitable + for prompting an LLM. Intentionally minimal — a ReAct runner can enrich + it further with few-shot examples if needed. + """ + return [ + { + "name": s.name, + "description": s.doc, + "parameters": s.parameters(), + } + for s in get_tools() + ] diff --git a/src/servers/smart_grid/fmsr/__init__.py b/src/servers/smart_grid/fmsr/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/servers/smart_grid/fmsr/main.py b/src/servers/smart_grid/fmsr/main.py new file mode 100644 index 000000000..e9f9c060e --- /dev/null +++ b/src/servers/smart_grid/fmsr/main.py @@ -0,0 +1,371 @@ +"""FMSR MCP server — Failure Mode to Sensor Relation mapping for Smart Grid transformers. + +FMSR = Failure Mode Sensor Relation. Given sensor readings (especially dissolved +gas concentrations), this server helps an agent diagnose which fault type is most +likely and understand which sensors are elevated for each failure mode. 
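+
+A typical (illustrative) diagnostic flow: pull the latest gas snapshot with
+get_dga_record, pass its ppm columns to analyze_dga, then map the returned
+IEC code to a catalogue entry via search_failure_modes.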
+ +Tools exposed to the LLM agent: + list_failure_modes — catalogue of all known fault types + search_failure_modes — find fault types matching a keyword or gas name + get_sensor_correlation — which gases/sensors indicate a specific fault + get_dga_record — retrieve a transformer's most recent DGA snapshot + analyze_dga — classify a set of gas concentrations into a fault type + using the IEC 60599 Rogers Ratio method + +Data source: ``$SG_DATA_DIR/failure_modes.csv``, ``dga_records.csv``. +""" + +from __future__ import annotations + +import math + +import pandas as pd +from mcp.server.fastmcp import FastMCP + +from servers.smart_grid.base import ( + json_safe_record, + load_dga_records, + load_failure_modes, +) + +mcp = FastMCP("smart-grid-fmsr") + +_failure_modes: pd.DataFrame | None = None +_dga_records: pd.DataFrame | None = None + + +def _get_failure_modes() -> pd.DataFrame: + global _failure_modes + if _failure_modes is None: + _failure_modes = load_failure_modes() + return _failure_modes + + +def _get_dga_records() -> pd.DataFrame: + global _dga_records + if _dga_records is None: + _dga_records = load_dga_records() + return _dga_records + + +# --------------------------------------------------------------------------- +# Rogers Ratio method (IEC 60599:2022 Table 1) +# --------------------------------------------------------------------------- +# A classic DGA interpretation algorithm. It computes three gas ratios and +# maps them to a fault code via a lookup table. +# +# Ratios (using the JSON's R-numbering convention): +# R1 = CH4 / H2 (IEC Table 1 middle column) +# R2 = C2H2 / C2H4 (IEC Table 1 first column) +# R3 = C2H4 / C2H6 (IEC Table 1 third column) +# +# The lookup table below follows IEC 60599:2022 Table 1 (4th edition, p.13). +# "NS" in the standard ("non-significant whatever the value") is encoded as +# (0, None) on R2 since gas ratios are non-negative. +# +# Order: most-severe first so first-match-wins resolves overlap toward the +# more severe code (e.g., D2 wins over D1 in the overlap region +# R1 ∈ [0.1, 0.5), R2 ∈ [1.0, 2.5), R3 ≥ 2.0). Boundary phrasing here mirrors +# the encoded ranges (min-inclusive, max-exclusive). + +_ROGERS_TABLE = [ + # (R1_range, R2_range, R3_range, code, description) + # Each range is (min_inclusive, max_exclusive); None = no bound. + ((0.1, 1.0), (0.6, 2.5), (2.0, None), "D2", "Discharges of high energy (arcing)"), + ((0.1, 0.5), (1.0, None), (1.0, None), "D1", "Discharges of low energy"), + ((1.0, None), (0, 0.2), (4.0, None), "T3", "Thermal fault, t > 700 °C"), + ((1.0, None), (0, 0.1), (1.0, 4.0), "T2", "Thermal fault, 300 °C < t < 700 °C"), + ((1.0, None), (0, None), (0, 1.0), "T1", "Thermal fault, t < 300 °C"), + ((0, 0.1), (0, None), (0, 0.2), "PD", "Partial discharges"), +] + + +def _in_range(value: float, lo, hi) -> bool: + if lo is not None and value < lo: + return False + if hi is not None and value >= hi: + return False + return True + + +def _ratio(numerator: float, denominator: float) -> float: + """Compute a gas ratio with explicit zero-denominator handling. 
+ + - denominator > 0: numerator / denominator (finite) + - denominator == 0 and numerator > 0: math.inf (a real ratio that diverges) + - denominator == 0 and numerator == 0: 0.0 (genuinely no signal) + + Returning math.inf for a divergent ratio is critical: collapsing it to 0.0 + would silently drop samples into the wrong fault class (e.g., a sample with + nonzero CH4 and C2H4 but zero C2H6 has R3 = +inf and should match D2 if the + other ratios fit, not fall through to N). + """ + if denominator > 0: + return numerator / denominator + return math.inf if numerator > 0 else 0.0 + + +def _ratio_field(value: float) -> tuple: + """Normalize a ratio for JSON-safe outbound serialization. + + Returns (json_safe_value, is_divergent). math.inf becomes None + True, + so `json.dumps(result, allow_nan=False)` succeeds on the public output. + Internal table matching keeps the raw float (including math.inf) — this + helper runs only at the dict-construction boundary. + """ + if math.isinf(value): + return None, True + return round(value, 4), False + + +def _rogers_ratio(h2: float, ch4: float, c2h2: float, c2h4: float, c2h6: float) -> dict: + """Apply Rogers Ratio method; return IEC code and description. + + Output ratio fields are JSON-safe: a divergent ratio (zero denominator, + nonzero numerator) is reported as `null` with a sibling `r{1,2,3}_divergent: true` + flag rather than `inf`. Internal table matching uses the true infinity so + classification is correct. + """ + r1 = _ratio(ch4, h2) + r2 = _ratio(c2h2, c2h4) + r3 = _ratio(c2h4, c2h6) + + # All-zero gases → N. IEC 60599 Table 1 does not address the no-detectable-gas + # case explicitly; PD's R1/R3 ranges include 0 and would otherwise spuriously + # match. Operationally, no measurable gas means no fault, not partial discharge. + if h2 == 0 and ch4 == 0 and c2h2 == 0 and c2h4 == 0 and c2h6 == 0: + return { + "iec_code": "N", + "diagnosis": "Normal / Inconclusive", + "r1_ch4_h2": 0.0, + "r2_c2h2_c2h4": 0.0, + "r3_c2h4_c2h6": 0.0, + } + + for r1_range, r2_range, r3_range, code, description in _ROGERS_TABLE: + if ( + _in_range(r1, *r1_range) + and _in_range(r2, *r2_range) + and _in_range(r3, *r3_range) + ): + return _build_result(code, description, r1, r2, r3) + + return _build_result("N", "Normal / Inconclusive", r1, r2, r3) + + +def _build_result(code: str, description: str, r1: float, r2: float, r3: float) -> dict: + """Build the public analyze_dga result dict with JSON-safe ratio fields.""" + r1_val, r1_div = _ratio_field(r1) + r2_val, r2_div = _ratio_field(r2) + r3_val, r3_div = _ratio_field(r3) + result = { + "iec_code": code, + "diagnosis": description, + "r1_ch4_h2": r1_val, + "r2_c2h2_c2h4": r2_val, + "r3_c2h4_c2h6": r3_val, + } + if r1_div: + result["r1_divergent"] = True + if r2_div: + result["r2_divergent"] = True + if r3_div: + result["r3_divergent"] = True + return result + + +# --------------------------------------------------------------------------- +# Tools +# --------------------------------------------------------------------------- + + +@mcp.tool() +def list_failure_modes() -> list[dict]: + """Return the full catalogue of known transformer failure modes. + + Returns: + List of dicts with keys: failure_mode_id, name, severity, iec_code, + key_gases, recommended_action. 
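+
+    Example (illustrative): an agent can filter the result client-side,
+    e.g. [m for m in list_failure_modes() if m["severity"] == "critical"].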
+ """ + df = _get_failure_modes() + return df[ + [ + "failure_mode_id", + "name", + "severity", + "iec_code", + "key_gases", + "recommended_action", + ] + ].to_dict(orient="records") + + +@mcp.tool() +def search_failure_modes(query: str) -> list[dict]: + """Search failure modes by keyword (name, description, gas, or IEC code). + + Args: + query: Free-text search string, e.g. "arc", "H2", "PD", "overheating". + + Returns: + List of matching failure mode dicts (same schema as list_failure_modes). + Empty list if no matches. + """ + df = _get_failure_modes() + q = query.lower() + mask = ( + df["name"].str.lower().str.contains(q, na=False) + | df["description"].str.lower().str.contains(q, na=False) + | df["key_gases"].str.lower().str.contains(q, na=False) + | df["iec_code"].str.lower().str.contains(q, na=False) + | df["dga_label"].str.lower().str.contains(q, na=False) + ) + return df[mask][ + [ + "failure_mode_id", + "name", + "severity", + "iec_code", + "key_gases", + "recommended_action", + ] + ].to_dict(orient="records") + + +@mcp.tool() +def get_sensor_correlation(failure_mode_id: str) -> dict: + """Return the sensors and gases most strongly associated with a failure mode. + + Args: + failure_mode_id: e.g. "FM-006" (use list_failure_modes to find IDs). + + Returns: + Dict with keys: failure_mode_id, name, key_gases (list), description, + iec_code, recommended_action. + Returns an error dict if the ID is not found. + """ + df = _get_failure_modes() + row = df[df["failure_mode_id"] == failure_mode_id] + if row.empty: + return {"error": f"Failure mode '{failure_mode_id}' not found."} + r = row.iloc[0].to_dict() + r["key_gases"] = [g.strip() for g in r["key_gases"].split(",") if g.strip()] + return r + + +@mcp.tool() +def get_dga_record(transformer_id: str) -> dict: + """Retrieve the most recent dissolved gas analysis (DGA) record for a transformer. + + Args: + transformer_id: Asset identifier, e.g. "T-016". + + Returns: + Dict with gas concentrations (ppm) and the recorded fault label: + transformer_id, sample_date, dissolved_h2_ppm, dissolved_ch4_ppm, + dissolved_c2h2_ppm, dissolved_c2h4_ppm, dissolved_c2h6_ppm, + dissolved_co_ppm, dissolved_co2_ppm, fault_label. + Returns an error dict if not found. + """ + df = _get_dga_records() + row = ( + df[df["transformer_id"] == transformer_id] + # sample_date is stored as ISO YYYY-MM-DD, so lexicographic descending + # order is chronological. + .sort_values("sample_date", ascending=False) + ) + if row.empty: + return {"error": f"No DGA record found for '{transformer_id}'."} + record = row.iloc[0].to_dict() + return json_safe_record(record) + + +@mcp.tool() +def analyze_dga( + h2: float, + ch4: float, + c2h2: float, + c2h4: float, + c2h6: float, + transformer_id: str | None = None, +) -> dict: + """Classify a set of dissolved gas concentrations into a fault type using + the IEC 60599 Rogers Ratio method. + + Given raw gas readings (in ppm), returns the most likely fault classification + and the three diagnostic ratios. + + Args: + h2: Hydrogen concentration (ppm). + ch4: Methane concentration (ppm). + c2h2: Acetylene concentration (ppm). + c2h4: Ethylene concentration (ppm). + c2h6: Ethane concentration (ppm). + transformer_id: Optional — if provided, included in the result for + traceability. + + Returns: + Dict with keys: + transformer_id (if provided), iec_code, diagnosis, + r1_ch4_h2, r2_c2h2_c2h4, r3_c2h4_c2h6, + input_gases (echo of inputs). 
+ + Ratio fields are always JSON-safe: a divergent ratio (zero + denominator with nonzero numerator) is reported as `null` plus a + sibling `r{1,2,3}_divergent: true` flag, never as a non-finite float. + Finite-ratio results omit the `*_divergent` keys entirely. + """ + # Coerce to float: LLMs sometimes pass numeric args as strings even when + # the tool schema declares "type": "number". + try: + h2, ch4, c2h2, c2h4, c2h6 = ( + float(h2), + float(ch4), + float(c2h2), + float(c2h4), + float(c2h6), + ) + except (TypeError, ValueError) as exc: + return {"error": f"Gas values must be numeric: {exc}"} + inputs = { + "h2_ppm": h2, + "ch4_ppm": ch4, + "c2h2_ppm": c2h2, + "c2h4_ppm": c2h4, + "c2h6_ppm": c2h6, + } + negative_inputs = {name: value for name, value in inputs.items() if value < 0} + if negative_inputs: + return { + "error": "Gas concentrations must be non-negative.", + "invalid_inputs": negative_inputs, + } + + invalid_number_inputs = { + name: value for name, value in inputs.items() if not math.isfinite(value) + } + if invalid_number_inputs: + return { + "error": "Gas concentrations must be finite numbers.", + "invalid_inputs": invalid_number_inputs, + } + + result = _rogers_ratio(h2, ch4, c2h2, c2h4, c2h6) + result["input_gases"] = inputs + if transformer_id: + result["transformer_id"] = transformer_id + return result + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + + +def main() -> None: + """CLI entry point — runs the FastMCP stdio JSON-RPC loop.""" + mcp.run() + + +if __name__ == "__main__": + main() diff --git a/src/servers/smart_grid/iot/__init__.py b/src/servers/smart_grid/iot/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/servers/smart_grid/iot/main.py b/src/servers/smart_grid/iot/main.py new file mode 100644 index 000000000..e1311edab --- /dev/null +++ b/src/servers/smart_grid/iot/main.py @@ -0,0 +1,184 @@ +"""IoT MCP server — sensor telemetry and asset metadata for Smart Grid transformers. + +Tools exposed to the LLM agent: + list_assets — list all transformer assets (optionally filter by status) + get_asset_metadata — static nameplate info for one transformer + list_sensors — which sensor IDs exist for a transformer + get_sensor_readings — time-series readings for one sensor + +Data source: ``$SG_DATA_DIR/asset_metadata.csv``, ``sensor_readings.csv``. +""" + +from __future__ import annotations + +import pandas as pd +from mcp.server.fastmcp import FastMCP + +from servers.smart_grid.base import load_asset_metadata, load_sensor_readings + +mcp = FastMCP("smart-grid-iot") + +# Module-level data cache. Loaded once at first tool-call, then reused. +_metadata: pd.DataFrame | None = None +_readings: pd.DataFrame | None = None + + +def _get_metadata() -> pd.DataFrame: + global _metadata + if _metadata is None: + _metadata = load_asset_metadata() + return _metadata + + +def _get_readings() -> pd.DataFrame: + global _readings + if _readings is None: + _readings = load_sensor_readings() + return _readings + + +# --------------------------------------------------------------------------- +# Tools +# --------------------------------------------------------------------------- + + +@mcp.tool() +def list_assets(health_status: int | None = None) -> list[dict]: + """List all Smart Grid transformer assets. + + Args: + health_status: Optional filter. 0 = healthy, 1 = degraded, 2 = critical. + Omit to return all assets. 
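+            Example (illustrative): list_assets(health_status=2) returns
+            only the critical units.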
+ + Returns: + List of dicts, each with keys: + transformer_id, name, location, health_status, rul_days, in_service + """ + df = _get_metadata() + if health_status is not None: + df = df[df["health_status"] == health_status] + + return df[ + [ + "transformer_id", + "name", + "location", + "health_status", + "rul_days", + "in_service", + ] + ].to_dict(orient="records") + + +@mcp.tool() +def get_asset_metadata(transformer_id: str) -> dict: + """Return full nameplate and status metadata for a single transformer. + + Args: + transformer_id: Asset identifier, e.g. "T-001". + + Returns: + Dict with keys: transformer_id, name, manufacturer, location, + voltage_class, rating_kva, install_date, age_years, health_status, + fdd_category, rul_days, in_service. + Returns an error dict if the ID is not found. + """ + df = _get_metadata() + row = df[df["transformer_id"] == transformer_id] + if row.empty: + return {"error": f"Transformer '{transformer_id}' not found."} + return row.iloc[0].to_dict() + + +@mcp.tool() +def list_sensors(transformer_id: str) -> list[dict]: + """List all sensor IDs available for a given transformer. + + Args: + transformer_id: Asset identifier, e.g. "T-001". + + Returns: + List of dicts with keys: sensor_id, unit, num_readings. + Returns an error dict ({"error": ...}) if the transformer ID is not found. + """ + df = _get_readings() + subset = df[df["transformer_id"] == transformer_id] + if subset.empty: + return {"error": f"No sensor data found for '{transformer_id}'."} + + summary = ( + subset.groupby(["sensor_id", "unit"], dropna=False) + .size() + .reset_index(name="num_readings") + ) + summary["unit"] = summary["unit"].fillna("") + return summary.to_dict(orient="records") + + +@mcp.tool() +def get_sensor_readings( + transformer_id: str, + sensor_id: str, + start_time: str | None = None, + end_time: str | None = None, + limit: int = 100, +) -> list[dict]: + """Return time-series readings for one sensor on one transformer. + + Args: + transformer_id: Asset identifier, e.g. "T-001". + sensor_id: Sensor name, e.g. "dga_h2_ppm" or "winding_temp_c". + Use list_sensors() to discover valid sensor IDs. + start_time: ISO-8601 datetime string (inclusive). Optional. + end_time: ISO-8601 datetime string (inclusive). Optional. + limit: Maximum number of rows to return (default 100, max 1000). + + Returns: + List of dicts with keys: timestamp, value, unit. + Sorted ascending by timestamp. + Returns an error list if the transformer or sensor is not found. + """ + df = _get_readings() + subset = df[ + (df["transformer_id"] == transformer_id) & (df["sensor_id"] == sensor_id) + ].copy() + + if subset.empty: + return [ + { + "error": f"No readings found for transformer='{transformer_id}' " + f"sensor='{sensor_id}'." 
+ } + ] + + subset["timestamp"] = pd.to_datetime(subset["timestamp"]) + + if start_time: + subset = subset[subset["timestamp"] >= pd.to_datetime(start_time)] + if end_time: + subset = subset[subset["timestamp"] <= pd.to_datetime(end_time)] + + subset = subset.sort_values("timestamp").head(min(limit, 1000)) + + timestamps = subset["timestamp"].map( + lambda ts: None if pd.isna(ts) else ts.isoformat() + ) + return ( + subset[["timestamp", "value", "unit"]] + .assign(timestamp=timestamps) + .to_dict(orient="records") + ) + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + + +def main() -> None: + """CLI entry point — runs the FastMCP stdio JSON-RPC loop.""" + mcp.run() + + +if __name__ == "__main__": + main() diff --git a/src/servers/smart_grid/tests/__init__.py b/src/servers/smart_grid/tests/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/servers/smart_grid/tests/test_direct_adapter.py b/src/servers/smart_grid/tests/test_direct_adapter.py new file mode 100644 index 000000000..dfcab70f2 --- /dev/null +++ b/src/servers/smart_grid/tests/test_direct_adapter.py @@ -0,0 +1,77 @@ +"""Direct-adapter contract tests. + +These tests don't touch CSVs (data files may not exist on the AOB-side dev +box). They only exercise the adapter's tool-registration, parameter +introspection, and JSON-schema-ish output, all of which are pure-Python +operations on the imported server modules. +""" + +from __future__ import annotations + +import pytest + +from servers.smart_grid.direct_adapter import ( + ToolSpec, + get_tool, + get_tools, + list_tool_specs_for_llm, +) + + +def test_get_tools_returns_19_tools(): + tools = get_tools() + assert len(tools) == 19 + + +def test_get_tools_returns_toolspec_instances(): + for t in get_tools(): + assert isinstance(t, ToolSpec) + assert t.name and "." in t.name + assert t.domain in {"iot", "fmsr", "tsfm", "wo"} + assert t.bare_name + assert callable(t.fn) + + +def test_tools_split_by_domain(): + by_domain = {} + for t in get_tools(): + by_domain.setdefault(t.domain, []).append(t.name) + assert sorted(by_domain) == ["fmsr", "iot", "tsfm", "wo"] + assert len(by_domain["iot"]) == 4 + assert len(by_domain["fmsr"]) == 5 + assert len(by_domain["tsfm"]) == 4 + assert len(by_domain["wo"]) == 6 + + +def test_get_tool_by_name(): + t = get_tool("iot.list_assets") + assert t.domain == "iot" + assert t.bare_name == "list_assets" + + +def test_get_tool_unknown_raises_keyerror(): + with pytest.raises(KeyError): + get_tool("nonexistent.tool") + + +def test_parameters_extract_signature_correctly(): + t = get_tool("iot.get_sensor_readings") + params = t.parameters() + assert "transformer_id" in params + assert params["transformer_id"].get("required") is True + assert "limit" in params + assert params["limit"].get("default") == 100 + + +def test_list_tool_specs_for_llm_shape(): + specs = list_tool_specs_for_llm() + assert len(specs) == 19 + for s in specs: + assert set(s.keys()) >= {"name", "description", "parameters"} + + +def test_doc_first_paragraph_extracted(): + t = get_tool("iot.list_assets") + assert t.doc.startswith("List all Smart Grid") + # First paragraph only — should NOT contain "Args:" subsection. 
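+    # (inspect.getdoc() normalizes indentation, so this check is not
+    # sensitive to how the source docstring is indented.)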
+ assert "Args:" not in t.doc diff --git a/src/servers/smart_grid/tests/test_fmsr.py b/src/servers/smart_grid/tests/test_fmsr.py new file mode 100644 index 000000000..aeaa380f2 --- /dev/null +++ b/src/servers/smart_grid/tests/test_fmsr.py @@ -0,0 +1,149 @@ +"""IEC 60599:2022 + JSON-safe divergent-ratio regression tests for the FMSR port. + +These tests target only the gas-ratio classification surface (`analyze_dga`), +which has no dependency on `$SG_DATA_DIR` CSV fixtures. Tests that exercise +`list_failure_modes`, `search_failure_modes`, `get_sensor_correlation`, and +`get_dga_record` are intentionally not ported here because the processed-CSV +data port is deferred (see aob-extraction_deferred.md D2) — they will be +added when SG_DATA_DIR is populated. + +DGA contract assumptions (mirrored from the SmartGridBench source project): + - analyze_dga is fully deterministic. + - IEC code "N" / "Normal / Inconclusive" is a valid output, not an error. + - Rogers table follows IEC 60599:2022 Table 1 strictly. D2 ("Discharges of + high energy / arcing") requires R2 ∈ [0.6, 2.5) AND R3 ≥ 2.0 AND + R1 ∈ [0.1, 1.0). Samples with R2 ≥ 2.5 fall outside D2 and (if + R1 ∈ [0.1, 0.5) and R2 ≥ 1.0 and R3 ≥ 1.0) classify as D1 instead. + - Divergent ratios (zero denominator with nonzero numerator) are reported as + `null` + sibling `r{1,2,3}_divergent: true`, never as a non-finite float. +""" + +from __future__ import annotations + +import json as _json + +from servers.smart_grid.fmsr import main as fmsr +from servers.smart_grid.fmsr.main import analyze_dga + +# T-018 representative profile from the source project's processed DGA records: +# R1 = 6/35 = 0.17, R2 = 482/26 = 18.5, R3 = 26/3 = 8.67. +_T018_GASES = dict(h2=35.0, ch4=6.0, c2h2=482.0, c2h4=26.0, c2h6=3.0) + + +def test_analyze_dga_returns_required_fields(): + result = analyze_dga(**_T018_GASES, transformer_id="T-018") + for key in ( + "iec_code", + "diagnosis", + "r1_ch4_h2", + "r2_c2h2_c2h4", + "r3_c2h4_c2h6", + "input_gases", + ): + assert key in result, f"Missing field: {key}" + assert result["transformer_id"] == "T-018" + + +def test_analyze_dga_echoes_inputs(): + result = analyze_dga(**_T018_GASES) + gases = result["input_gases"] + assert gases["h2_ppm"] == _T018_GASES["h2"] + assert gases["c2h2_ppm"] == _T018_GASES["c2h2"] + + +def test_analyze_dga_deterministic(): + r1 = analyze_dga(**_T018_GASES) + r2 = analyze_dga(**_T018_GASES) + assert r1["iec_code"] == r2["iec_code"] + assert r1["diagnosis"] == r2["diagnosis"] + + +def test_analyze_dga_high_c2h2_ratio_is_d1_per_iec_strict(): + # T-018 profile (R1=0.17, R2=18.5, R3=8.67) classifies as D1 under + # IEC 60599:2022 Table 1: R2=18.5 falls outside D2's [0.6, 2.5) cap. + result = analyze_dga(**_T018_GASES) + assert result["iec_code"] == "D1" + + +def test_analyze_dga_all_zeros_no_crash(): + result = analyze_dga(h2=0, ch4=0, c2h2=0, c2h4=0, c2h6=0) + assert "iec_code" in result + assert result["iec_code"] == "N" + + +def test_analyze_dga_zero_c2h6_diverges_r3(): + # Regression: zero denominator must produce a divergent ratio internally + # (so classification is correct), but the public output normalizes inf → + # null + r3_divergent: True for JSON safety. 
+ result = analyze_dga(h2=500, ch4=200, c2h2=120, c2h4=100, c2h6=0) + assert result["iec_code"] == "D2" + assert result["r3_c2h4_c2h6"] is None + assert result.get("r3_divergent") is True + _json.dumps(result, allow_nan=False) + + +def test_analyze_dga_zero_c2h4_diverges_r2(): + # c2h4=0, c2h2>0 → R2 diverges; R3 collapses to 0.0 (c2h4=0, c2h6>0). + # No fault row matches → N. Public output: r2_c2h2_c2h4 → null + flag. + result = analyze_dga(h2=500, ch4=200, c2h2=120, c2h4=0, c2h6=30) + assert result["iec_code"] == "N" + assert result["r2_c2h2_c2h4"] is None + assert result.get("r2_divergent") is True + _json.dumps(result, allow_nan=False) + + +def test_analyze_dga_zero_h2_diverges_r1(): + # h2=0, ch4>0 → R1 diverges. R2=0.025, R3=0.667 → T1. + result = analyze_dga(h2=0, ch4=200, c2h2=2, c2h4=80, c2h6=120) + assert result["iec_code"] == "T1" + assert result["r1_ch4_h2"] is None + assert result.get("r1_divergent") is True + _json.dumps(result, allow_nan=False) + + +def test_analyze_dga_finite_ratios_have_no_divergent_flags(): + # Non-regression: finite-ratio results must NOT carry r{1,2,3}_divergent + # keys at all. + result = analyze_dga(**_T018_GASES) + assert "r1_divergent" not in result + assert "r2_divergent" not in result + assert "r3_divergent" not in result + _json.dumps(result, allow_nan=False) + + +def test_analyze_dga_without_transformer_id(): + result = analyze_dga(**_T018_GASES) + assert "transformer_id" not in result + + +def test_analyze_dga_negative_input_rejected(): + result = analyze_dga(h2=-1, ch4=200, c2h2=2, c2h4=80, c2h6=120) + assert "error" in result + assert "invalid_inputs" in result + + +def test_analyze_dga_string_inputs_coerced(): + # LLM tool clients sometimes stringify numeric args. + result = analyze_dga(h2="35", ch4="6", c2h2="482", c2h4="26", c2h6="3") + assert result["iec_code"] == "D1" + + +def test_get_dga_record_sample_date_is_json_safe(tmp_path, monkeypatch): + (tmp_path / "dga_records.csv").write_text( + "\n".join( + [ + "transformer_id,sample_date,dissolved_h2_ppm,dissolved_ch4_ppm," + "dissolved_c2h2_ppm,dissolved_c2h4_ppm,dissolved_c2h6_ppm," + "dissolved_co_ppm,dissolved_co2_ppm,fault_label,source_dataset", + "T-001,2026-01-02,10,20,1,30,40,100,200,T1,unit-test", + ] + ), + encoding="utf-8", + ) + monkeypatch.setenv("SG_DATA_DIR", str(tmp_path)) + monkeypatch.setattr(fmsr, "_dga_records", None) + + result = fmsr.get_dga_record("T-001") + + assert result["sample_date"] == "2026-01-02" + _json.dumps(result, allow_nan=False) diff --git a/src/servers/smart_grid/tests/test_json_safety.py b/src/servers/smart_grid/tests/test_json_safety.py new file mode 100644 index 000000000..c84d4d719 --- /dev/null +++ b/src/servers/smart_grid/tests/test_json_safety.py @@ -0,0 +1,147 @@ +"""Tool-level JSON-safety smoke for all Smart Grid MCP servers. + +Every ``@mcp.tool()``-decorated callable returns over MCP's JSON-RPC +transport, so its output must serialize under +``json.dumps(..., allow_nan=False)``. This test exercises every tool against +a hermetic ``SG_DATA_DIR`` fixture and asserts strict JSON serialization +succeeds. Catches the class of bug fixed in ``fmsr.get_dga_record`` (pandas +``Timestamp`` leaking through ``to_dict()``) across all current and future +Smart Grid tools. 
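+
+A one-line illustration of the contract: ``json.dumps({"x": float("nan")},
+allow_nan=False)`` raises ``ValueError``, which is exactly the failure mode
+these smoke checks guard against.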
+""" + +from __future__ import annotations + +import json + +import pytest + +from servers.smart_grid.fmsr import main as fmsr +from servers.smart_grid.iot import main as iot +from servers.smart_grid.tsfm import main as tsfm +from servers.smart_grid.wo import main as wo + +_ASSET_METADATA_CSV = "\n".join( + [ + "transformer_id,name,manufacturer,location,voltage_class,rating_kva,install_date,age_years,health_status,fdd_category,rul_days,in_service", + "T-001,Unit 1,Acme,Site A,138kV,50000,2018-03-15,8,healthy,normal,2400,True", + "T-002,Unit 2,Acme,Site B,138kV,50000,2017-06-01,9,degraded,attention,1200,True", + ] +) + +_SENSOR_READINGS_CSV = "\n".join( + [ + "transformer_id,timestamp,sensor_id,value,unit,source", + "T-001,2026-01-01T00:00:00,winding_temp_c,55.2,celsius,sim", + "T-001,2026-01-02T00:00:00,winding_temp_c,56.1,celsius,sim", + "T-001,2026-01-03T00:00:00,winding_temp_c,57.0,celsius,sim", + ] +) + +_FAILURE_MODES_CSV = "\n".join( + [ + "failure_mode_id,name,dga_label,description,severity,iec_code,key_gases,recommended_action", + "FM-001,Thermal Fault T1,T1,Low temperature thermal fault,low,IEC-60599-T1,CH4|C2H6,monitor", + "FM-002,Arc Discharge,D2,High energy arc discharge,critical,IEC-60599-D2,C2H2|H2,immediate inspection", + ] +) + +_DGA_RECORDS_CSV = "\n".join( + [ + "transformer_id,sample_date,dissolved_h2_ppm,dissolved_ch4_ppm,dissolved_c2h2_ppm,dissolved_c2h4_ppm,dissolved_c2h6_ppm,dissolved_co_ppm,dissolved_co2_ppm,fault_label,source_dataset", + "T-001,2026-01-02,10,20,1,30,40,100,200,T1,unit-test", + ] +) + +_RUL_LABELS_CSV = "\n".join( + [ + "transformer_id,timestamp,rul_days,health_index,fdd_category", + "T-001,2026-01-01T00:00:00,2400,0.92,0", + "T-001,2026-01-15T00:00:00,2390,0.91,0", + ] +) + +_FAULT_RECORDS_CSV = "\n".join( + [ + "transformer_id,fault_id,fault_type,location,voltage_v,current_a,power_load_mw,temperature_c,wind_speed_kmh,weather_condition,maintenance_status,component_health,duration_hrs,downtime_hrs", + "T-001,F001,Thermal Fault,Site A,138000,200,30,55,5,clear,Pending,degraded,4,2", + ] +) + + +@pytest.fixture +def sg_data_dir(tmp_path, monkeypatch): + """Hermetic SG_DATA_DIR with minimal CSVs + module cache reset across all four servers.""" + (tmp_path / "asset_metadata.csv").write_text(_ASSET_METADATA_CSV, encoding="utf-8") + (tmp_path / "sensor_readings.csv").write_text( + _SENSOR_READINGS_CSV, encoding="utf-8" + ) + (tmp_path / "failure_modes.csv").write_text(_FAILURE_MODES_CSV, encoding="utf-8") + (tmp_path / "dga_records.csv").write_text(_DGA_RECORDS_CSV, encoding="utf-8") + (tmp_path / "rul_labels.csv").write_text(_RUL_LABELS_CSV, encoding="utf-8") + (tmp_path / "fault_records.csv").write_text(_FAULT_RECORDS_CSV, encoding="utf-8") + monkeypatch.setenv("SG_DATA_DIR", str(tmp_path)) + monkeypatch.setattr(iot, "_metadata", None) + monkeypatch.setattr(iot, "_readings", None) + monkeypatch.setattr(fmsr, "_failure_modes", None) + monkeypatch.setattr(fmsr, "_dga_records", None) + monkeypatch.setattr(tsfm, "_rul", None) + monkeypatch.setattr(tsfm, "_readings", None) + monkeypatch.setattr(wo, "_fault_records", None) + monkeypatch.setattr(wo, "_asset_metadata", None) + return tmp_path + + +@pytest.mark.parametrize( + "label, call", + [ + ("iot.list_assets", lambda: iot.list_assets()), + ("iot.get_asset_metadata", lambda: iot.get_asset_metadata("T-001")), + ("iot.list_sensors", lambda: iot.list_sensors("T-001")), + ( + "iot.get_sensor_readings", + lambda: iot.get_sensor_readings("T-001", "winding_temp_c", limit=10), + ), + ("fmsr.list_failure_modes", 
lambda: fmsr.list_failure_modes()), + ("fmsr.search_failure_modes", lambda: fmsr.search_failure_modes("thermal")), + ( + "fmsr.get_sensor_correlation", + lambda: fmsr.get_sensor_correlation("FM-001"), + ), + ("fmsr.get_dga_record", lambda: fmsr.get_dga_record("T-001")), + ( + "fmsr.analyze_dga", + lambda: fmsr.analyze_dga( + h2=10, ch4=20, c2h2=1, c2h4=30, c2h6=40, transformer_id="T-001" + ), + ), + ("tsfm.get_rul", lambda: tsfm.get_rul("T-001")), + ( + "tsfm.forecast_rul", + lambda: tsfm.forecast_rul("T-001", horizon_days=30), + ), + ( + "tsfm.detect_anomalies", + lambda: tsfm.detect_anomalies("T-001", "winding_temp_c"), + ), + ( + "tsfm.trend_analysis", + lambda: tsfm.trend_analysis("T-001", "winding_temp_c"), + ), + ("wo.list_fault_records", lambda: wo.list_fault_records(limit=5)), + ("wo.get_fault_record", lambda: wo.get_fault_record("F001")), + ( + "wo.create_work_order", + lambda: wo.create_work_order("T-001", "test issue"), + ), + ], +) +def test_tool_output_is_strict_json_safe(sg_data_dir, label, call): + """Every Smart Grid tool's return value must pass strict JSON serialization. + + ``json.dumps(..., allow_nan=False)`` is the contract the FastMCP JSON-RPC + transport enforces on tool responses. This test catches the class of bug + fixed in ``fmsr.get_dga_record`` (pandas ``Timestamp`` leaking through + ``to_dict()``) for any tool, current or future. + """ + result = call() + json.dumps(result, allow_nan=False) diff --git a/src/servers/smart_grid/tests/test_scenarios.py b/src/servers/smart_grid/tests/test_scenarios.py new file mode 100644 index 000000000..8c4c7fd8a --- /dev/null +++ b/src/servers/smart_grid/tests/test_scenarios.py @@ -0,0 +1,82 @@ +"""Smart Grid scenario JSON shape tests.""" + +from __future__ import annotations + +import json +from pathlib import Path + +_SCENARIOS_DIR = Path(__file__).resolve().parents[3] / "scenarios" / "local" +_REQUIRED_FIELDS = { + "id", + "type", + "text", + "category", + "characteristic_form", + "expected_tools", + "ground_truth", + "difficulty", + "domain_tags", +} +_NEGATIVE_REQUIRED_FIELDS = { + "id", + "type", + "text", + "category", + "characteristic_form", + "expected_tools", + "domain_tags", +} +_VALID_TYPES = {"FMSR", "IoT", "Multi", "TSFM", "WO"} +_VALID_DIFFICULTIES = {"easy", "medium", "hard"} +_VALID_TOOL_PREFIXES = ("fmsr.", "iot.", "tsfm.", "wo.") + + +def _load(filename: str) -> list[dict]: + path = _SCENARIOS_DIR / filename + raw = json.loads(path.read_text(encoding="utf-8")) + assert isinstance(raw, list), f"{filename} must be a JSON array" + return raw + + +def test_smart_grid_scenarios_count(): + records = _load("smart_grid.json") + # AOB-FMSR-001 is the original AOB-style scenario. The other 35 + # records are SGT-NNN ports from the HPML Smart Grid MCP scenario corpus. 
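+    # 1 AOB-style record + 35 SGT ports = 36; bump this count alongside
+    # any corpus additions.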
+ assert len(records) == 36 + + +def test_smart_grid_negative_checks_count(): + records = _load("smart_grid_negative_checks.json") + assert len(records) == 5 + + +def test_smart_grid_scenarios_have_expected_shape(): + for raw in _load("smart_grid.json"): + missing = _REQUIRED_FIELDS - set(raw) + assert not missing, f"{raw.get('id', '')} missing {sorted(missing)}" + assert raw["type"] in _VALID_TYPES + assert raw["difficulty"] in _VALID_DIFFICULTIES + assert isinstance(raw["text"], str) and raw["text"].strip() + assert ( + isinstance(raw["characteristic_form"], str) + and raw["characteristic_form"].strip() + ) + assert isinstance(raw["expected_tools"], list) and raw["expected_tools"] + assert isinstance(raw["ground_truth"], dict) and raw["ground_truth"] + assert isinstance(raw["domain_tags"], list) and raw["domain_tags"] + for tool_name in raw["expected_tools"]: + assert tool_name.startswith(_VALID_TOOL_PREFIXES), tool_name + + +def test_smart_grid_negative_checks_have_expected_shape(): + for raw in _load("smart_grid_negative_checks.json"): + missing = _NEGATIVE_REQUIRED_FIELDS - set(raw) + assert not missing, f"{raw.get('id', '')} missing {sorted(missing)}" + assert raw["id"].startswith("SG-NEG-") + + +def test_smart_grid_scenario_ids_unique(): + main = _load("smart_grid.json") + neg = _load("smart_grid_negative_checks.json") + ids = [r["id"] for r in main + neg] + assert len(ids) == len(set(ids)), f"duplicate scenario IDs detected: {ids}" diff --git a/src/servers/smart_grid/tsfm/__init__.py b/src/servers/smart_grid/tsfm/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/servers/smart_grid/tsfm/main.py b/src/servers/smart_grid/tsfm/main.py new file mode 100644 index 000000000..dc98ac62a --- /dev/null +++ b/src/servers/smart_grid/tsfm/main.py @@ -0,0 +1,327 @@ +"""TSFM MCP server — Time-Series Forecasting and anomaly detection for Smart Grid transformers. + +TSFM = Time-Series Foundation Model. In the full project this server will call +a fine-tuned TSFM (e.g., Chronos or a Llama-based forecaster served via vLLM). +This skeleton implements lightweight statistical baselines so the server is +functional end-to-end for scenario testing before the model is integrated. + +Baseline methods used here: + - RUL forecast: returns the label from rul_labels.csv + a linear projection + - Anomaly detection: z-score over a rolling window + - Trend analysis: linear regression slope over a requested time period + +Tools exposed to the LLM agent: + get_rul — current remaining useful life estimate for a transformer + forecast_rul — project RUL N days into the future + detect_anomalies — flag sensor readings that exceed a z-score threshold + trend_analysis — slope and direction of a sensor over a time window + +Data source: ``$SG_DATA_DIR/rul_labels.csv``, ``sensor_readings.csv``. +""" + +from __future__ import annotations + +import numpy as np +import pandas as pd +from mcp.server.fastmcp import FastMCP + +from servers.smart_grid.base import load_rul_labels, load_sensor_readings + +# Health-index saturation point (~3 years; HI = 1.0 above this RUL). Inlined +# here from the team-repo's ``data/constants.py`` to keep this server self- +# contained. 
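+# Used by forecast_rul below as:
+#   projected_health_index = min(1.0, projected_rul_days / HI_FULL_HEALTH_DAYS)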
+HI_FULL_HEALTH_DAYS = 1093.0 + +mcp = FastMCP("smart-grid-tsfm") + +_rul: pd.DataFrame | None = None +_readings: pd.DataFrame | None = None + + +def _confidence_from_history(num_points: int, horizon_days: int = 0) -> float: + """Bounded baseline confidence for deterministic synthetic labels.""" + history_factor = min(0.99, 0.55 + num_points / 120) + horizon_penalty = min(0.45, max(0, horizon_days) / 900) + return round(max(0.1, history_factor - horizon_penalty), 3) + + +def _get_rul() -> pd.DataFrame: + global _rul + if _rul is None: + _rul = load_rul_labels() + return _rul + + +def _get_readings() -> pd.DataFrame: + global _readings + if _readings is None: + _readings = load_sensor_readings() + return _readings + + +# --------------------------------------------------------------------------- +# Tools +# --------------------------------------------------------------------------- + + +@mcp.tool() +def get_rul(transformer_id: str) -> dict: + """Return the most recent remaining useful life (RUL) estimate for a transformer. + + RUL is sourced from the FDD & RUL dataset labels and represents the number + of days of useful operation remaining before the transformer is expected to + require major maintenance or replacement. + + Args: + transformer_id: Asset identifier, e.g. "T-018". + + Returns: + Dict with keys: transformer_id, as_of_date, rul_days, health_index, + fdd_category, interpretation. + Returns an error dict if not found. + """ + df = _get_rul() + subset = df[df["transformer_id"] == transformer_id].sort_values("timestamp") + if subset.empty: + return {"error": f"No RUL data found for '{transformer_id}'."} + + latest = subset.iloc[-1] + rul_days = int(latest["rul_days"]) + fdd_category = ( + int(latest["fdd_category"]) if pd.notna(latest["fdd_category"]) else None + ) + as_of_ts = latest["timestamp"] + + if rul_days >= 730: + interpretation = "Healthy — no immediate action required." + elif rul_days >= 180: + interpretation = "Aging — schedule routine inspection within 6 months." + elif rul_days >= 30: + interpretation = "Degraded — maintenance recommended within 30 days." + else: + interpretation = "Critical — immediate inspection required." + + return { + "transformer_id": transformer_id, + "as_of_date": str(as_of_ts.date()) if pd.notna(as_of_ts) else None, + "rul_days": rul_days, + "health_index": round(float(latest["health_index"]), 4), + "fdd_category": fdd_category, + "confidence": _confidence_from_history(len(subset)), + "interpretation": interpretation, + } + + +@mcp.tool() +def forecast_rul(transformer_id: str, horizon_days: int = 30) -> dict: + """Project the RUL forward by ``horizon_days`` using a linear degradation model. + + This is a statistical baseline. The full project may replace this with a + TSFM (Time-Series Foundation Model) inference call via vLLM. + + Args: + transformer_id: Asset identifier, e.g. "T-007". + horizon_days: Number of days to project forward (default 30, max 365). + + Returns: + Dict with keys: transformer_id, current_rul_days, forecast_date, + projected_rul_days, projected_health_index, confidence, method. 
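+
+        Worked example (illustrative): current_rul_days=1200 with
+        horizon_days=30 projects 1170 RUL-days and a health index of
+        min(1.0, 1170 / 1093.0) = 1.0.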
+ """ + df = _get_rul() + subset = df[df["transformer_id"] == transformer_id].sort_values("timestamp") + if subset.empty: + return {"error": f"No RUL data found for '{transformer_id}'."} + + if horizon_days < 0 or horizon_days > 365: + return { + "error": "horizon_days must be between 0 and 365.", + "provided_horizon_days": horizon_days, + } + latest = subset.iloc[-1] + current_rul = int(latest["rul_days"]) + latest_ts = latest["timestamp"] + + # Linear model: assume 1 RUL-day consumed per calendar day. + projected_rul = max(0, current_rul - horizon_days) + forecast_date = ( + pd.to_datetime(latest_ts) + pd.Timedelta(days=horizon_days) + if pd.notna(latest_ts) + else None + ) + projected_hi = ( + 0.0 + if current_rul <= 0 + else round(min(1.0, projected_rul / HI_FULL_HEALTH_DAYS), 4) + ) + + return { + "transformer_id": transformer_id, + "current_rul_days": current_rul, + "forecast_date": ( + str(forecast_date.date()) if forecast_date is not None else None + ), + "projected_rul_days": projected_rul, + "projected_health_index": projected_hi, + "confidence": _confidence_from_history(len(subset), horizon_days), + "method": "linear_degradation_baseline", + } + + +@mcp.tool() +def detect_anomalies( + transformer_id: str, + sensor_id: str, + window_size: int = 24, + z_threshold: float = 3.0, +) -> dict: + """Detect anomalous sensor readings using a rolling z-score method. + + A reading is flagged as anomalous when it deviates more than ``z_threshold`` + standard deviations from the rolling mean over ``window_size`` readings. + + Args: + transformer_id: Asset identifier, e.g. "T-016". + sensor_id: Sensor to analyse, e.g. "dga_h2_ppm". + window_size: Rolling window size in readings (default 24). + z_threshold: Number of standard deviations to flag (default 3.0). + + Returns: + Dict with keys: transformer_id, sensor_id, total_readings, + anomaly_count, anomaly_rate_pct, anomalies (list of flagged readings + with timestamp, value, z_score). + """ + df = _get_readings() + subset = ( + df[(df["transformer_id"] == transformer_id) & (df["sensor_id"] == sensor_id)] + .copy() + .sort_values("timestamp") + ) + + if subset.empty: + return { + "error": f"No readings for transformer='{transformer_id}' " + f"sensor='{sensor_id}'." + } + + vals = subset["value"].astype(float) + rolling_mean = vals.rolling(window_size, min_periods=1).mean() + rolling_std = vals.rolling(window_size, min_periods=1).std() + rolling_std = rolling_std.where( + rolling_std > 0, other=rolling_mean.abs() * 0.001 + 1e-3 + ) + z_scores = ((vals - rolling_mean) / rolling_std).abs() + + anomaly_mask = z_scores > z_threshold + anomalies_df = subset[anomaly_mask][["timestamp", "value"]].copy() + anomalies_df["z_score"] = z_scores[anomaly_mask].round(3).values + + anomalies = anomalies_df.head(50).to_dict(orient="records") + for row in anomalies: + row["timestamp"] = str(row["timestamp"]) + + return { + "transformer_id": transformer_id, + "sensor_id": sensor_id, + "window_size": window_size, + "z_threshold": z_threshold, + "total_readings": len(subset), + "anomaly_count": int(anomaly_mask.sum()), + "anomaly_rate_pct": round(100 * anomaly_mask.mean(), 2), + "anomalies": anomalies, + } + + +@mcp.tool() +def trend_analysis( + transformer_id: str, + sensor_id: str, + start_time: str | None = None, + end_time: str | None = None, +) -> dict: + """Compute the trend (slope) of a sensor's readings over a time window. + + Uses ordinary least squares linear regression over the selected readings. 
+ A positive slope means the sensor value is increasing over time. + + Args: + transformer_id: Asset identifier, e.g. "T-012". + sensor_id: Sensor to analyse, e.g. "dga_c2h2_ppm". + start_time: ISO-8601 start of window (optional). + end_time: ISO-8601 end of window (optional). + + Returns: + Dict with keys: transformer_id, sensor_id, num_readings, + start_time, end_time, mean_value, min_value, max_value, + slope_per_day, direction ("increasing"/"decreasing"/"stable"), + r_squared. + """ + df = _get_readings() + subset = df[ + (df["transformer_id"] == transformer_id) & (df["sensor_id"] == sensor_id) + ].copy() + + if subset.empty: + return { + "error": f"No readings for transformer='{transformer_id}' " + f"sensor='{sensor_id}'." + } + + subset["timestamp"] = pd.to_datetime(subset["timestamp"]) + if start_time: + subset = subset[subset["timestamp"] >= pd.to_datetime(start_time)] + if end_time: + subset = subset[subset["timestamp"] <= pd.to_datetime(end_time)] + + subset = subset.sort_values("timestamp") + if len(subset) < 2: + return { + "error": "Not enough readings in the specified window for trend analysis." + } + + t0 = subset["timestamp"].iloc[0] + days = (subset["timestamp"] - t0).dt.total_seconds() / 86400 + vals = subset["value"].astype(float) + + coeffs = np.polyfit(days, vals, 1) + slope = float(coeffs[0]) + y_hat = np.polyval(coeffs, days) + ss_res = float(np.sum((vals - y_hat) ** 2)) + ss_tot = float(np.sum((vals - vals.mean()) ** 2)) + r2 = 1 - ss_res / ss_tot if ss_tot > 0 else 0.0 + + mean_val = float(vals.mean()) + rel_slope = abs(slope) / (mean_val + 1e-9) + if rel_slope < 0.01: + direction = "stable" + elif slope > 0: + direction = "increasing" + else: + direction = "decreasing" + + return { + "transformer_id": transformer_id, + "sensor_id": sensor_id, + "num_readings": len(subset), + "start_time": str(subset["timestamp"].iloc[0]), + "end_time": str(subset["timestamp"].iloc[-1]), + "mean_value": round(mean_val, 4), + "min_value": round(float(vals.min()), 4), + "max_value": round(float(vals.max()), 4), + "slope_per_day": round(slope, 6), + "direction": direction, + "r_squared": round(r2, 4), + } + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + + +def main() -> None: + """CLI entry point — runs the FastMCP stdio JSON-RPC loop.""" + mcp.run() + + +if __name__ == "__main__": + main() diff --git a/src/servers/smart_grid/wo/__init__.py b/src/servers/smart_grid/wo/__init__.py new file mode 100644 index 000000000..e69de29bb diff --git a/src/servers/smart_grid/wo/main.py b/src/servers/smart_grid/wo/main.py new file mode 100644 index 000000000..00a8641b5 --- /dev/null +++ b/src/servers/smart_grid/wo/main.py @@ -0,0 +1,346 @@ +"""WO MCP server — Work Order creation and management for Smart Grid transformers. + +WO = Work Order. After an agent diagnoses a fault (FMSR) and forecasts +remaining life (TSFM), it creates a work order here to schedule maintenance. + +Work orders are stored in-memory during a session. In production this would +write to a CMMS (Computerised Maintenance Management System) database. +Historical fault records from the dataset are pre-loaded as a read-only +reference; new work orders created by the agent are tracked separately. 
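+
+A typical (illustrative) end-to-end flow: diagnose with fmsr.analyze_dga,
+check remaining life with tsfm.get_rul, open a ticket with create_work_order
+at a priority matching the severity, then size the outage window with
+estimate_downtime.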
+ +Tools exposed to the LLM agent: + list_fault_records — browse historical fault events from the dataset + get_fault_record — retrieve one historical fault event + create_work_order — create a new maintenance work order + list_work_orders — list agent-created work orders (this session) + update_work_order — update priority, status, or assignee + estimate_downtime — estimate repair downtime based on fault type / severity + +Data source: ``$SG_DATA_DIR/fault_records.csv`` (read-only history) + +in-memory dict (agent-created work orders, session-scoped). +""" + +from __future__ import annotations + +import uuid +from datetime import UTC, datetime + +import pandas as pd +from mcp.server.fastmcp import FastMCP + +from servers.smart_grid.base import ( + json_safe_record, + load_asset_metadata, + load_fault_records, +) + +mcp = FastMCP("smart-grid-wo") + +_fault_records: pd.DataFrame | None = None +_asset_metadata: pd.DataFrame | None = None + +# Session-scoped work order store: {wo_id: dict} +_work_orders: dict[str, dict] = {} + +# Downtime estimates (hours) by fault severity — derived from dataset statistics. +_DOWNTIME_ESTIMATES = { + "low": {"min": 2, "max": 6, "typical": 4}, + "medium": {"min": 6, "max": 16, "typical": 8}, + "high": {"min": 16, "max": 48, "typical": 24}, + "critical": {"min": 48, "max": 120, "typical": 72}, +} + +_VALID_PRIORITIES = {"low", "medium", "high", "critical"} +_VALID_STATUSES = {"open", "in_progress", "resolved", "closed"} + + +def _normalize_priority(priority: str | None) -> str | None: + if priority is None: + return None + return priority.strip().lower() + + +def _normalize_status(status: str | None) -> str | None: + if status is None: + return None + return status.strip().lower() + + +def _get_fault_records() -> pd.DataFrame: + global _fault_records + if _fault_records is None: + _fault_records = load_fault_records() + return _fault_records + + +def _get_asset_metadata() -> pd.DataFrame: + global _asset_metadata + if _asset_metadata is None: + _asset_metadata = load_asset_metadata() + return _asset_metadata + + +# --------------------------------------------------------------------------- +# Tools +# --------------------------------------------------------------------------- + + +@mcp.tool() +def list_fault_records( + transformer_id: str | None = None, + fault_type: str | None = None, + maintenance_status: str | None = None, + limit: int = 20, +) -> list[dict]: + """Browse historical fault records from the Smart Grid dataset. + + These are read-only historical events, not agent-created work orders. + Use list_work_orders() to see work orders created in this session. + + Args: + transformer_id: Filter by asset, e.g. "T-018". Optional. + fault_type: Substring filter on fault type, e.g. "Transformer". + maintenance_status: Filter by status ("Scheduled", "Pending", "Completed"). + limit: Max records to return (default 20, max 100). + + Returns: + List of fault record dicts. + """ + df = _get_fault_records() + + if transformer_id: + df = df[df["transformer_id"] == transformer_id] + if fault_type: + df = df[df["fault_type"].str.contains(fault_type, case=False, na=False)] + if maintenance_status: + df = df[ + df["maintenance_status"].fillna("").str.lower() + == maintenance_status.lower() + ] + + records = df.head(min(limit, 100)).to_dict(orient="records") + return [json_safe_record(record) for record in records] + + +@mcp.tool() +def get_fault_record(fault_id: str) -> dict: + """Retrieve a single historical fault record by its ID. + + Args: + fault_id: e.g. "F001". 
Use list_fault_records() to discover IDs. + + Returns: + Fault record dict, or an error dict if not found. + """ + df = _get_fault_records() + row = df[df["fault_id"] == fault_id] + if row.empty: + return {"error": f"Fault record '{fault_id}' not found."} + return json_safe_record(row.iloc[0].to_dict()) + + +@mcp.tool() +def create_work_order( + transformer_id: str, + issue_description: str, + priority: str = "medium", + fault_type: str | None = None, + estimated_downtime_hours: float | None = None, +) -> dict: + """Create a new maintenance work order for a transformer. + + Args: + transformer_id: Asset requiring maintenance, e.g. "T-016". + issue_description: Plain-language description of the problem. + priority: One of "low", "medium", "high", "critical". + Defaults to "medium". + fault_type: Optional fault classification for tracking, + e.g. "Arc Discharge" or "Thermal Fault T3". + estimated_downtime_hours: Override the auto-estimated downtime. + + Returns: + Dict with keys: work_order_id, transformer_id, issue_description, + priority, fault_type, status, estimated_downtime_hours, + created_at, assigned_technician (null until assigned). + """ + priority = _normalize_priority(priority) or "medium" + + if priority not in _VALID_PRIORITIES: + return { + "error": f"Invalid priority '{priority}'. " + f"Must be one of: {sorted(_VALID_PRIORITIES)}" + } + + try: + metadata = _get_asset_metadata() + except FileNotFoundError as exc: + return {"error": str(exc)} + + if transformer_id not in set(metadata["transformer_id"]): + return { + "error": f"Unknown transformer_id '{transformer_id}'.", + "valid_transformer_id_source": "$SG_DATA_DIR/asset_metadata.csv", + } + + if estimated_downtime_hours is None: + est = _DOWNTIME_ESTIMATES.get(priority, _DOWNTIME_ESTIMATES["medium"]) + estimated_downtime_hours = est["typical"] + + wo_id = f"WO-{uuid.uuid4().hex[:8].upper()}" + wo = { + "work_order_id": wo_id, + "transformer_id": transformer_id, + "issue_description": issue_description, + "priority": priority, + "fault_type": fault_type, + "status": "open", + "estimated_downtime_hours": estimated_downtime_hours, + "created_at": datetime.now(UTC).isoformat().replace("+00:00", "Z"), + "assigned_technician": None, + "notes": [], + } + _work_orders[wo_id] = wo + return wo + + +@mcp.tool() +def list_work_orders( + transformer_id: str | None = None, + status: str | None = None, + priority: str | None = None, +) -> list[dict]: + """List work orders created by the agent in this session. + + Args: + transformer_id: Filter by asset. Optional. + status: Filter by status ("open", "in_progress", "resolved", "closed"). + priority: Filter by priority ("low", "medium", "high", "critical"). + + Returns: + List of work order dicts, newest first. + """ + wos = list(_work_orders.values()) + + if transformer_id: + wos = [w for w in wos if w["transformer_id"] == transformer_id] + status = _normalize_status(status) + priority = _normalize_priority(priority) + + if status: + wos = [w for w in wos if w["status"] == status] + if priority: + wos = [w for w in wos if w["priority"] == priority] + + return sorted(wos, key=lambda w: w["created_at"], reverse=True) + + +@mcp.tool() +def update_work_order( + work_order_id: str, + status: str | None = None, + priority: str | None = None, + assigned_technician: str | None = None, + note: str | None = None, +) -> dict: + """Update an existing work order's status, priority, assignee, or add a note. + + Args: + work_order_id: ID returned by create_work_order(), e.g. "WO-A1B2C3D4". 
+ status: New status: "open", "in_progress", "resolved", "closed". + priority: New priority: "low", "medium", "high", "critical". + assigned_technician: Technician identifier, e.g. "TEC-03". + note: Free-text note to append to the work order log. + + Returns: + Updated work order dict, or an error dict if not found / invalid. + """ + if work_order_id not in _work_orders: + return {"error": f"Work order '{work_order_id}' not found in this session."} + + wo = _work_orders[work_order_id] + + status = _normalize_status(status) + priority = _normalize_priority(priority) + + if status is not None: + if status not in _VALID_STATUSES: + return { + "error": f"Invalid status '{status}'. " + f"Must be one of: {sorted(_VALID_STATUSES)}" + } + wo["status"] = status + + if priority is not None: + if priority not in _VALID_PRIORITIES: + return { + "error": f"Invalid priority '{priority}'. " + f"Must be one of: {sorted(_VALID_PRIORITIES)}" + } + wo["priority"] = priority + + if assigned_technician is not None: + wo["assigned_technician"] = assigned_technician + + if note: + wo["notes"].append( + { + "timestamp": datetime.now(UTC).isoformat().replace("+00:00", "Z"), + "text": note, + } + ) + + return wo + + +@mcp.tool() +def estimate_downtime( + transformer_id: str, + severity: str, + fault_type: str | None = None, +) -> dict: + """Estimate the expected downtime (hours) for maintenance on a transformer. + + Estimates are derived from the Smart Grid Fault Records dataset statistics. + The full project may replace this with a learned model. + + Args: + transformer_id: Asset requiring maintenance, e.g. "T-019". + severity: Fault severity: "low", "medium", "high", or "critical". + fault_type: Optional fault type for context (not used in baseline + estimate but recorded for traceability). + + Returns: + Dict with keys: transformer_id, severity, fault_type, + estimated_min_hours, estimated_max_hours, estimated_typical_hours, + source. + """ + if severity not in _VALID_PRIORITIES: + return { + "error": f"Invalid severity '{severity}'. " + f"Must be one of: {sorted(_VALID_PRIORITIES)}" + } + + est = _DOWNTIME_ESTIMATES[severity] + return { + "transformer_id": transformer_id, + "severity": severity, + "fault_type": fault_type, + "estimated_min_hours": est["min"], + "estimated_max_hours": est["max"], + "estimated_typical_hours": est["typical"], + "source": "dataset_statistics_baseline", + } + + +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + + +def main() -> None: + """CLI entry point — runs the FastMCP stdio JSON-RPC loop.""" + mcp.run() + + +if __name__ == "__main__": + main()
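+
+# Local smoke sketch (illustrative, not part of the MCP transport): with
+# SG_DATA_DIR pointing at the synthetic CSVs, the tools can be exercised
+# in-process, e.g.:
+#
+#     wo = create_work_order("T-001", "Elevated C2H2 in latest DGA",
+#                            priority="high")
+#     update_work_order(wo["work_order_id"], status="in_progress",
+#                       note="crew dispatched")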