feat(migration-to-aws): v2 with 6-phase workflow, AI workload detection, and billing discovery#85

Open
icarthick wants to merge 8 commits into awslabs:main from icarthick:migration-to-aws-v2

Conversation

@icarthick
Contributor

@icarthick icarthick commented Mar 9, 2026

Summary

Rewrites the migration-to-aws plugin from a 4-phase to a 6-phase workflow with three parallel discovery paths.

  • 6-phase workflow: discover → clarify → design → estimate → generate → feedback (v1 had discover → clarify → estimate → execute)
  • AI workload detection: App-code discovery scans for Gemini, OpenAI, Vertex AI, and traditional ML frameworks (TensorFlow, PyTorch), with source-specific model mapping tables (ai-gemini-to-bedrock, ai-openai-to-bedrock)
  • Billing discovery path: Analyzes GCP billing exports as a fallback when no Terraform or app code is available
  • Adaptive clarify phase: Category-based questions (global, compute, database, AI, AI-only) that activate based on discover findings
  • Separate design phase: Dedicated design-infra, design-ai, and design-billing references with confidence levels
  • Generate phase: Produces Terraform configs, AI provider adapters, Bedrock setup scripts, comparison test harnesses, billing artifacts, and migration documentation
  • Feedback phase: Anonymized telemetry traces capturing phase timings and complexity metrics (no PII)
  • Pricing cache: Validated rates with confidence levels; awspricing MCP used for real-time validation

Changes

  • 52 files changed (+7,697 / -360)
  • 37 new files: phase references (clarify, design, estimate, generate, feedback), design-refs (ai-gemini-to-bedrock, ai-openai-to-bedrock), shared schemas, pricing cache
  • 15 modified files: SKILL.md, plugin.json, README, existing design-refs and discover/clustering references
  • All changes scoped to plugins/migration-to-aws/

How to test

# Install plugin locally
claude --plugin-dir ./plugins/migration-to-aws

# Test against a sample GCP Terraform project
# The skill triggers on: "migrate from GCP", "GCP to AWS", etc.

# Verify build
mise run build

Build status

  • Markdown lint: pass
  • Manifest validation: pass
  • Cross-reference validation: pass
  • Formatting (dprint): pass
  • Semgrep: informational findings only (detects "OpenAI" keyword in migration reference docs — expected)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

🤖 Generated with Claude Code

@icarthick icarthick requested review from a team as code owners March 9, 2026 19:25

@github-advanced-security github-advanced-security bot left a comment


Semgrep OSS found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@icarthick icarthick force-pushed the migration-to-aws-v2 branch 2 times, most recently from dc47820 to e4f3da0 (March 10, 2026 01:48)
…on, and billing discovery

Rewrite the migration-to-aws plugin from a 4-phase to a 6-phase workflow
(discover → clarify → design → estimate → generate → feedback) with three
parallel discovery paths: infrastructure, application code, and billing.

Key changes:

Discover phase:
- Add app-code discovery path scanning source for AI/ML frameworks
  (Gemini, Vertex AI, OpenAI, traditional ML like TensorFlow/PyTorch)
- Add billing discovery path with GCP billing export analysis
- Enhance IaC discovery with improved Terraform resource clustering
  using typed-edge strategy and classification rules

Clarify phase:
- Implement adaptive category-based questioning (global, compute,
  database, AI, AI-only) that activates based on discover findings
- Skip categories when discover already provides sufficient signal

Design phase (new):
- Separate design from discover with dedicated design-infra, design-ai,
  and design-billing reference documents
- Source-specific AI model mapping via ai-gemini-to-bedrock and
  ai-openai-to-bedrock reference tables

Estimate phase:
- Split into estimate-infra, estimate-ai, and estimate-billing
- Add pricing-cache with validated rates and confidence levels
- Use awspricing MCP server for real-time price validation

Generate phase:
- Produce Terraform configurations from templates (main.tf, variables.tf)
- Generate AI provider adapter (provider_adapter.py) for SDK migration
- Generate Bedrock setup scripts and comparison test harnesses
- Add billing artifact generation and documentation artifacts
- Structured artifact specs for infra, AI, billing, docs, and scripts

Feedback phase (new):
- Anonymized telemetry trace capturing phase timings, confidence
  scores, and migration complexity metrics
- No PII or source code in traces

Supporting changes:
- Add JSON schemas for discover-ai, discover-billing, discover-iac,
  estimate-infra, and phase-status data structures
- Update plugin.json version and README
- Enhance design-refs with confidence levels, factual corrections,
  and improved service mapping tables

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@icarthick icarthick force-pushed the migration-to-aws-v2 branch from e4f3da0 to c66870d (March 10, 2026 17:27)
@scottschreckengaust
Member

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@krokoko
Contributor

krokoko commented Mar 11, 2026

Now it's done, but for subsequent PRs, it would be great to break them down into smaller ones :)

57 files, ~7,600 net additions. This PR bundles:

  • AI workload detection (a whole new discovery path)
  • Billing discovery (another new discovery path)
  • The generate phase (Terraform code gen, scripts, AI adapters, docs)
  • Feedback/telemetry system
  • Pricing cache overhaul

Each of these is independently shippable and reviewable.

@krokoko
Contributor

krokoko commented Mar 11, 2026

Question: pricing-cache.md is hardcoded with March 2026 rates. AI model pricing changes frequently (new models every few months, price cuts, deprecations). There is no:

  • Staleness indicator beyond a manual "Last updated" header
  • CI check that flags the cache as stale after N days
  • Automated mechanism to refresh prices

Do we have a plan for this?
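For illustration, the CI staleness check suggested above could be a small script. This is only a sketch: the `pricing-cache.md` "Last updated: YYYY-MM-DD" header format and the 90-day threshold are assumptions, not part of the plugin.

```python
#!/usr/bin/env python3
"""Hypothetical CI gate that fails when pricing-cache.md is older than N days.
Assumes a 'Last updated: YYYY-MM-DD' line somewhere in the file."""
import re
import sys
from datetime import date, timedelta

MAX_AGE_DAYS = 90  # illustrative threshold
PATTERN = re.compile(r"Last updated:\s*(\d{4})-(\d{2})-(\d{2})")

def cache_is_stale(text, today=None):
    """Return True when the cache's date header is missing or too old."""
    today = today or date.today()
    m = PATTERN.search(text)
    if not m:
        return True  # no parseable date: treat as stale
    updated = date(*map(int, m.groups()))
    return today - updated > timedelta(days=MAX_AGE_DAYS)

if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as f:
        stale = cache_is_stale(f.read())
    sys.exit(1 if stale else 0)  # non-zero exit fails the CI job
```

Such a check could slot into the existing `mise run` task list until the planned MCP-based live pricing lands.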

@krokoko
Contributor

krokoko commented Mar 11, 2026

Q: The PR adds schemas defined in markdown (schema-discover-ai.md, schema-discover-billing.md, schema-discover-iac.md, schema-estimate-infra.md, schema-phase-status.md) but these are prose descriptions, not machine-validatable JSON Schemas. The repo already has schemas/ with proper .schema.json files for plugins and marketplace.

For a 6-phase workflow producing 10+ JSON artifacts, schema drift between phases is a real risk. Should we have something like JSON Schemas in schemas/ with CI validation (like the existing mise run lint:manifests)?
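For illustration, a machine-validatable counterpart to the prose schemas might look like the sketch below. The phase names come from the PR's workflow; the status values and `confidence` field are assumptions about what schema-phase-status.md describes.

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "phase-status",
  "type": "object",
  "required": ["phase", "status"],
  "properties": {
    "phase": {
      "enum": ["discover", "clarify", "design", "estimate", "generate", "feedback"]
    },
    "status": {
      "enum": ["pending", "in_progress", "done", "failed"]
    },
    "confidence": { "type": "number", "minimum": 0, "maximum": 1 }
  },
  "additionalProperties": false
}
```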

@krokoko
Contributor

krokoko commented Mar 11, 2026

Could you provide me with example inputs and expected outputs? Like 3-4 test scenarios: (1) Terraform-only, (2) AI-only, (3) billing-only, (4) all three. I would like to see the expected artifact tree for each.

When Terraform is present, billing data is supplementary — only service-level costs and AI signal detection are needed. Extract via a script to avoid reading the raw file into context.

1. Use Bash to read only the **first line** of the billing file to identify column headers.
2. Write a script to `$MIGRATION_DIR/_extract_billing.py` (or `.js` / shell — use whatever runtime is available) that:
Contributor


This path triggers when both Terraform files and billing files are present. I have some concerns here since the python script is LLM-generated. The skill instructions tell the LLM to write a script that parses the billing file. The exact code varies per invocation, it depends on:

  • The column headers the LLM reads from line 1
  • Which runtime the LLM decides to use (Python, Node, shell)
  • The LLM's interpretation of the instructions
    This means there's no fixed, reviewable, testable artifact. Two runs of the same skill on the same billing file could produce different scripts. If the script has a bug (e.g., wrong column index, float parsing error), there's no way to reproduce it because the script is deleted in step 5.

The LLM reads the first line of the CSV (column headers) and uses those headers to write the script. If the billing file has unusual or adversarial column names, the LLM might generate a script that behaves unexpectedly.

Step 3 says: "try python3 first. If not found, try python. If neither available, delete the script and fall back to loading discover-billing.md." This means the skill might generate Python code, fail to execute it, then fall back to having the LLM read the file directly (the thing it was trying to avoid). The fallback silently changes the behavior and context cost.

And as hinted above, Step 5 says "Delete the script file after successful execution." This means:

  • If the output is wrong, you can't inspect what script produced it
  • If there's a security incident, the executed code is gone
  • You can't diff the script between runs to understand behavioral changes

I would either:

  • use a versioned, fixed extraction script. For instance: add a tools/extract-billing.py (or similar) to the repo that takes a billing CSV path as input and outputs the billing-profile.json schema to stdout. The skill instructions would then say "run tools/extract-billing.py" instead of "write a script." This is reviewable, testable, and deterministic.
  • or use a simpler heuristic without code generation
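A minimal sketch of such a fixed script, assuming the standard GCP billing export column names (`service.description`, `sku.description`, `cost`) and the `gcp_service`/`ai_detection` field naming from the schema docs — the AI-signal keywords are illustrative, not the plugin's actual detection rules:

```python
#!/usr/bin/env python3
"""Hypothetical tools/extract-billing.py: deterministic, reviewable
replacement for LLM-generated billing parsers."""
import csv
import json
import sys
from collections import defaultdict

# Substrings that suggest AI/ML spend in a service or SKU description.
AI_SIGNALS = ("vertex ai", "generative ai", "gemini")

def extract(path):
    """Aggregate service-level monthly cost and flag AI workload spend."""
    totals = defaultdict(float)
    ai_spend = 0.0
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            service = row.get("service.description", "unknown")
            cost = float(row.get("cost", 0) or 0)
            totals[service] += cost
            text = (service + " " + row.get("sku.description", "")).lower()
            if any(sig in text for sig in AI_SIGNALS):
                ai_spend += cost
    return {
        "services": [
            {"gcp_service": s, "monthly_cost": round(c, 2)}
            for s, c in sorted(totals.items(), key=lambda kv: -kv[1])
        ],
        "ai_detection": {
            "has_ai_workload": ai_spend > 0,
            "ai_monthly_spend": round(ai_spend, 2),
        },
    }

if __name__ == "__main__" and len(sys.argv) > 1:
    json.dump(extract(sys.argv[1]), sys.stdout, indent=2)
```

Because the script is committed, the same billing file always yields the same billing-profile.json, and a failing run leaves an inspectable artifact.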

Contributor Author


Yes, I agree a script here would work well as opposed to instructions. For complex billing data, the LLM automatically resorts to Python script generation, regardless of whether I have instructions to create a Python script or not. Billing data invariably requires some kind of math, which I feel simple instructions might not be able to capture.

@icarthick
Contributor Author

Q: The PR adds schemas defined in markdown (schema-discover-ai.md, schema-discover-billing.md, schema-discover-iac.md, schema-estimate-infra.md, schema-phase-status.md) but these are prose descriptions, not machine-validatable JSON Schemas. The repo already has schemas/ with proper .schema.json files for plugins and marketplace.

For a 6-phase workflow producing 10+ JSON artifacts, schema drift between phases is a real risk. Should we have something like JSON Schemas in schemas/ with CI validation (like the existing mise run lint:manifests)?

This was my initial thought, but then I chose md files only because of portability. Kiro Power, for example, cannot deal with JSON files. Having it in md allows me to use the same files across Kiro Power and the Claude plugin.

@icarthick
Contributor Author

Question: pricing-cache.md is hardcoded with March 2026 rates. AI model pricing changes frequently (new models every few months, price cuts, deprecations). There is no:

  • Staleness indicator beyond a manual "Last updated" header
  • CI check that flags the cache as stale after N days
  • Automated mechanism to refresh prices

Do we have a plan for this?

We intend to create an MCP server which will handle the staleness (fetching real-time prices). This can be done in upcoming releases.

@scottschreckengaust
Member

PR #85 Review Results:

  ┌────────────┬───────┐
  │ Severity   │ Count │
  ├────────────┼───────┤
  │ Critical   │ 8     │
  ├────────────┼───────┤
  │ Important  │ 13    │
  ├────────────┼───────┤
  │ Suggestion │ 13    │
  └────────────┴───────┘

Top critical findings across both tools:

  1. Stale output-schema.md -- entire file is a v1 artifact with wrong schemas (should be deleted)
  2. estimation.json reference in SKILL.md error table (old unsplit name)
  3. "plan execution" in SKILL.md frontmatter (should be "generate")
  4. Hardcoded AWS_REGION: us-east-1 in .mcp.json (ignores user's target region)
  5. Pricing cache accuracy never disclosed to users in normal flow
  6. MCP fallback indistinguishable from normal cache usage
  7. Circular GCP baseline (AWS x 1.25 fallback)
  8. State machine incomplete for feedback resolution

The code-review skill posted a "No issues found" comment on the PR (its confidence threshold of 80 filtered out all findings), while the pr-review-toolkit agents identified the issues above at lower confidence but with solid supporting evidence.

PR-review-85-72cbc439.md


Please update this based on semantic versioning.

@icarthick
Contributor Author

Now it's done, but for subsequent PRs, it would be great to break them down into smaller ones :)

57 files, ~7,600 net additions. This PR bundles:

  • AI workload detection (a whole new discovery path)
  • Billing discovery (another new discovery path)
  • The generate phase (Terraform code gen, scripts, AI adapters, docs)
  • Feedback/telemetry system
  • Pricing cache overhaul

Each of these is independently shippable and reviewable.

Could you provide me with example inputs and expected outputs? Like 3-4 test scenarios: (1) Terraform-only, (2) AI-only, (3) billing-only, (4) all three. I would like to see the expected artifact tree for each.

Just to be clear. The plugin can work off of three different kinds of source artifacts

  • Terraform files
  • GCP billing data (json or csv)
  • Application code (py, ts or go)

The presence of each one of those artifacts will produce any combination of these files (I might have missed a few more):

  • aws-design.json (design phase output)
  • billing-profile.json (discover phase, when billing data is present)
  • estimation-infra.json (estimation phase output)
  • gcp-resource-clusters.json (discover phase output)
  • gcp-resource-inventory.json (discover phase output)
  • generation-infra.json (generate phase output)
  • preferences.json (clarify phase output)

Additionally, it also generates Terraform/scripts that represent migration artifacts needed for the AWS migration.


icarthick and others added 2 commits March 11, 2026 14:31
- C1: Delete stale output-schema.md (v1 artifact with wrong schemas)
- C2: Fix estimation.json reference → estimation-*.json in SKILL.md
- C3: Fix "plan execution" → "generate migration artifacts" in frontmatter
- C4: Document AWS_REGION env var as MCP default; pass target region per-query
- C5: Add mandatory pricing accuracy disclosure in all estimate summaries
- C6: Distinguish pricing sources: cached, live, cached_fallback, unavailable
- C7: Replace circular GCP baseline (AWS×1.25) with user prompt
- C8: Add feedback auto-close to state machine table after generate_done

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Breaking changes from v1: new phase-status schema, renamed output
files (preferences.json, estimation-*.json, generation-*.json),
added generate and feedback phases.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@icarthick
Contributor Author

icarthick commented Mar 11, 2026

plugins/migration-to-aws/.claude-plugin/plugin.json


@scottschreckengaust All the critical issues and version bump have been addressed

@scottschreckengaust
Member

Code review

Found 2 issues:

  1. generate-infra.md references non-existent "Step 4" in generate-artifacts-scripts.md. The cross-reference says "see resource detection rules in generate-artifacts-scripts.md Step 4" but that file only has Steps 1-3. The correct reference is Step 1 ("Detect Resource Categories"), which defines the has_databases, has_storage, etc. flags.

**Include this phase ONLY if `aws-design.json` contains database or storage resources
(see resource detection rules in generate-artifacts-scripts.md Step 4).**
**If no data migration is needed, compress the timeline: move Cutover to Weeks 8-9
and Validation to Week 10. Reduce `total_weeks` accordingly.**

  2. discover-billing.md and schema-discover-billing.md define conflicting schemas for billing-profile.json. The agent is told to generate output per one structure then validate against the other. Specific conflicts: (a) ai_signals is an array in discover-billing.md L98 but an object in the schema L79, (b) service field is "service" in discover-billing L84 vs "gcp_service" in schema L28, (c) SKU field is "sku" in discover-billing L89 vs "sku_description" in schema L34, (d) discover-billing includes an "ai_detection" top-level object (L106) absent from the schema. Line 114 of discover-billing.md explicitly instructs: "Load references/shared/schema-discover-billing.md and validate the output against the billing-profile.json schema" — this validation will always fail.

"services": [
{
"service": "Cloud Run",
"gcp_service_type": "google_cloud_run_service",
"monthly_cost": 450.00,
"top_skus": [
{
"sku": "Cloud Run - CPU Allocation Time",
"monthly_cost": 320.00,
"usage_amount": 1500,
"usage_unit": "vCPU-seconds"
}
],
"usage_pattern": "consistent"
}
],
"ai_signals": [
{
"pattern": "3.1",
"service_description": "Vertex AI",
"monthly_cost": 200.00,
"confidence": 0.98
}
],
"ai_detection": {
"has_ai_workload": false,
"confidence": 0,
"ai_monthly_spend": 0.00
}
}
```
Load `references/shared/schema-discover-billing.md` and validate the output against the `billing-profile.json` schema.

"services": [
{
"gcp_service": "Cloud Run",
"gcp_service_type": "google_cloud_run_service",
"monthly_cost": 450.00,
"percentage_of_total": 0.18,
"top_skus": [
{
"sku_description": "Cloud Run - CPU Allocation Time",
"monthly_cost": 300.00
},
{
"sku_description": "Cloud Run - Memory Allocation Time",
"monthly_cost": 150.00
}
],
"ai_signals": []
},
{
"gcp_service": "Cloud SQL",
"gcp_service_type": "google_sql_database_instance",
"monthly_cost": 800.00,
"percentage_of_total": 0.33,
"top_skus": [
{
"sku_description": "Cloud SQL for PostgreSQL - DB custom CORE",
"monthly_cost": 500.00
},
{
"sku_description": "Cloud SQL for PostgreSQL - DB custom RAM",
"monthly_cost": 300.00
}
],
"ai_signals": []
},
{
"gcp_service": "Vertex AI",
"gcp_service_type": "google_vertex_ai_endpoint",
"monthly_cost": 600.00,
"percentage_of_total": 0.24,
"top_skus": [
{
"sku_description": "Vertex AI Prediction - Online Prediction",
"monthly_cost": 400.00
},
{
"sku_description": "Generative AI - Gemini Pro Input Tokens",
"monthly_cost": 200.00
}
],
"ai_signals": ["vertex_ai", "generative_ai"]
}
],
"ai_signals": {
"detected": true,
"confidence": 0.85,
"services": ["Vertex AI"]
}
}

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

…example

Fix generate-infra.md Step 4 cross-reference (should be Step 1) and
update discover-billing.md output example to match the source-of-truth
schema in schema-discover-billing.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@icarthick
Contributor Author

@scottschreckengaust Both the issues have been addressed
