Skip to content

Harden AWS resource rendering and demo workflow#180

Merged
yimsk merged 16 commits into
mainfrom
develop
May 1, 2026
Merged

Harden AWS resource rendering and demo workflow#180
yimsk merged 16 commits into
mainfrom
develop

Conversation

@yimsk
Copy link
Copy Markdown
Contributor

@yimsk yimsk commented May 1, 2026

Summary

  • Redact sensitive AI raw resource data and correct persisted AI chat session docs.
  • Preserve enrichment failure state for S3/IAM resources and fix EventBridge bus-qualified rule handling.
  • Stabilize pagination/concurrency edge cases and run VHS demos with host-installed VHS plus pinned LocalStack.

Validation

  • task --list
  • Ruby YAML parse for .github/workflows/integration-test.yml
  • nix flake check
  • nix develop --command vhs --version
  • vhs validate 'docs/tapes/*.tape'
  • go build ./cmd/claws
  • go test ./...
  • go vet ./...
  • golangci-lint run --config .golangci.yml ./...
  • gofmt -l cmd internal custom scripts
  • git diff --check
  • task test:vhs

Notes

  • PR is from develop to main.
  • No co-author or generated attribution trailers were added.

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Code Review: PR #180 — Harden AWS resource rendering and demo workflow

Good set of changes overall. The enrichment-status pattern, sensitive data redaction, and EventBridge fix are all well-motivated. A few items worth discussing before merge.


Summary

This PR does four distinct things:

  1. Enrichment failure state — IAM (users/roles/policies) and S3 resources now track why optional data is missing (access denied vs. fetch failed vs. not configured) and surface that in the detail view instead of silently omitting it.
  2. Sensitive data redactionformatResourceDetail now redacts keys like Environment, Variables, secret*, token, etc. before handing raw data to the AI chat session.
  3. EventBridge bus-qualified rule IDs — Rules on custom buses now encode the bus name in their ID (bus/rule), fixing Get/Delete for non-default buses.
  4. Demo/CI workflow — VHS moves from a Docker container to a host-installed binary (via Nix); LocalStack image is pinned via env var.

Issues

🔴 Duplicate type definitions across packages

EnrichmentStatus (and its constants + helpers) is redeclared four times:

  • custom/iam/policies/dao.go
  • custom/iam/roles/dao.go
  • custom/iam/users/dao.go
  • custom/s3/buckets/dao.go

Similarly, isEnrichmentFailure() and enrichmentStatusDisplay() are copy-pasted into each of the four corresponding render.go files. Any future change (e.g. a new status value) must be applied in four places. This should be extracted to a shared internal/dao or internal/enrichment package before the pattern spreads further.

Note: S3's variant adds EnrichmentConfigured/EnrichmentNotConfigured but omits EnrichmentFetched; the IAM variant has the inverse. This divergence will cause confusion if these types ever need to interoperate.

🟡 isSensitiveRawKey — substring matching causes false positives

if normalized == sensitive || strings.Contains(normalized, sensitive) {
    return true
}

"token" is in the sensitive list, which will also match "tokenization", "tokencount", "accesstokens_issued", etc. Same for "secret" matching "secretsmanager_arn", "credential" matching "credentialsexpiry", and so on. Consider exact-match only for short words, or expand to a suffix/prefix strategy.

🟡 Unpinned GitHub Action

- uses: cachix/install-nix-action@v31

A moving tag is a supply-chain risk; best practice is to pin to a commit SHA, e.g.:

- uses: cachix/install-nix-action@v31  # <commit-sha>

🟡 S3 render.go — legacy default branch in Versioning section

default:
    if b.Versioning != "" {
        d.Field("Status", b.Versioning)
    } else {
        d.Field("Status", enrichmentStatusDisplay(EnrichmentUnknown))
    }

Resources fetched via Get() will always have VersioningStatus set, so the non-empty b.Versioning branch inside default is unreachable. If the intent is to keep backward compat for resources from List() (where status is never set), a comment explaining that would help. Otherwise this dead branch is confusing.


Minor observations

  • isSensitiveRawKey performance: The function iterates a []string slice on every key in every map node. For large payloads this is O(keys × sensitiveWords). A map[string]struct{} for exact matches and a short exact list for substring checks would be faster and clearer.

  • hasSensitiveLabelField double-traversal: Every record object is walked twice — once to detect if it has a sensitive label key, and again in the main loop. For deeply nested structures this is benign, but combining the two passes would simplify the logic.

  • Taskfile env block repetition: The six demo:record:* tasks each copy the same seven env vars. A YAML anchor or a shared Taskfile include would reduce drift risk (e.g., if AWS_CONFIG_FILE path changes).


Positive highlights

  • parseRuleID correctly uses LastIndex to handle partner bus ARNs (aws.partner/example.com/account/rule-name). The test table covers this case explicitly — good.
  • The pagination fix in resource_browser_fetch.go is minimal and correct: only iterate profile/region pairs that have a next-page token when continuing pagination.
  • Semaphore in tag_search_view.go plugs a missing concurrency cap — consistent with how other fetches are bounded.
  • The session persistence test (TestSessionPersistenceDoesNotContainRedactedSecrets) is a strong regression guard: it writes a real file and checks the content, not just the in-memory state.
  • SageMaker nil-pointer guard (ch.DataSource.S3DataSource.S3Uri != nil) is a straightforward correctness fix.

Suggested blockers before merge

  1. Extract EnrichmentStatus + helpers to a shared package (or at minimum consolidate the IAM and S3 variants).
  2. Fix the strings.Contains false-positive risk in isSensitiveRawKey.

The rest are suggestions rather than blockers.

@yimsk
Copy link
Copy Markdown
Contributor Author

yimsk commented May 1, 2026

Addressed the Claude review blockers in ea05c51:

  • Extracted shared enrichment status/type helpers into internal/enrichment and updated IAM/S3 callers to use it.
  • Reworked sensitive raw-key detection to avoid broad substring matches such as tokenization, tokencount, accessTokens_issued, secretsmanager_arn, and credentialsexpiry; added regression coverage.
  • Pinned cachix/install-nix-action to the resolved v31 commit SHA.
  • Added an explanatory comment for the S3 versioning legacy/default branch.

Validation run locally:

  • go test ./internal/ai ./internal/enrichment ./custom/s3/buckets ./custom/iam/users ./custom/iam/roles ./custom/iam/policies
  • workflow YAML parse via Ruby
  • go test ./...
  • go vet ./...
  • go build ./cmd/claws
  • golangci-lint run --config .golangci.yml ./...
  • gofmt -l cmd internal custom scripts
  • git diff --check

@claude
Copy link
Copy Markdown

claude Bot commented May 1, 2026

Code Review: PR #180 — Harden AWS resource rendering and demo workflow

Overall this is a solid, well-scoped PR. The enrichment-status pattern is clean, the AI redaction logic is reasonably thorough, and the EventBridge fix handles the partner-event-bus edge case correctly. Tests cover the new behaviour well. A few specific issues are worth addressing before merging.


Summary of Changes

Area What changed
CI / demo workflow Replace Docker VHS with host-installed VHS via Nix; pin LocalStack image via env var
EventBridge rules Composite eventBusName/ruleName IDs in Get, Delete, List
IAM / S3 enrichment New internal/enrichment.Status type; propagate failure reason to renderer
AI raw-data redaction redactSensitiveRaw applied before persisting/sending resource detail
Pagination / concurrency Skip profile–region pairs without a next-token on page N+1; semaphore in tag search
SageMaker nil guard Check S3Uri != nil before dereferencing
Dependency bumps Charm TUI libs from RC → stable; AWS SDK v2 patches

Issues

1. environment / variables in exactSensitiveRawKeys will redact non-sensitive values (false positive)

File: internal/ai/tools.go

var exactSensitiveRawKeys = map[string]bool{
    "environment":          true,   // ← catches {"environment": "production"}
    "variables":            true,   // ← catches {"variables": "some-list"}
    ...
}

Any resource field named environment or variables — regardless of its value — will be replaced with [REDACTED]. For Lambda the intent is correct (the Environment object wraps secret variables), but EC2 instances, ECS tasks, and many other resources surface an environment tag or attribute that holds non-sensitive data (e.g. the string "production").

Consider matching "environmentvariables" (already in the list) and removing the broader "environment" / "variables" entries, or only redacting the nested Variables map rather than its parent container.


2. DBPassword-style all-caps prefixes escape the camelCase splitter (false negative)

File: internal/ai/tools.go, rawKeySegments

The splitter fires on upper(r) && (lower(prev) || digit(prev)), so DbPassword["db", "password"] (caught), but DBPassword["dbpassword"] (not caught, because B→P is upper→upper).

AWS SDK field names like DBPassword, DBMasterPassword, MasterUserPassword and similar are common in RDS resources. Add a test case and, if the cases matter, add an additional split rule for the upper(r) && upper(prev) && lower(next) transition (standard Go convention split for acronyms).


3. Inconsistent failure-status checks in custom/s3/buckets/render.go

The Versioning section uses a clean switch on status:

switch b.VersioningStatus {
case enrichment.Configured: ...
case enrichment.NotConfigured: ...
case enrichment.AccessDenied, enrichment.FetchFailed: ...

But Encryption and PublicAccessBlock use an ad-hoc if:

if b.EncryptionStatus == enrichment.AccessDenied || b.EncryptionStatus == enrichment.FetchFailed || b.EncryptionStatus == enrichment.Unknown {

The Unknown case here means a zero-value struct would display as "Unknown" rather than falling through to the b.EncryptionEnabled check. That's a behaviour difference from a fresh BucketResource{}. Use enrichment.IsFailure(b.EncryptionStatus) (already defined) to be consistent with the IAM renderers and avoid the Unknown catch-all.


4. Double-marshal in redactSensitiveRaw is unnecessary

File: internal/ai/tools.go

func redactSensitiveRaw(raw any) any {
    data, err := json.Marshal(raw)
    // then Unmarshal into any...
}

The round-trip is needed only if raw might be a typed struct. If the resource's Raw() method already returns map[string]any, the round-trip is a no-op overhead. It's not harmful in the AI path (latency dominated by the LLM call), but if raw really can be a typed struct, documenting that assumption or handling it with a type-switch would be cleaner. If it's always map[string]any, skip the round-trip.


5. Pagination fix double-if could be simplified

File: internal/view/resource_browser_fetch.go

if existingTokens != nil {
    if _, ok := existingTokens[key]; !ok {
        continue
    }
}
keys = append(keys, key)

Collapsing to a single condition makes the intent clearer:

if existingTokens != nil {
    if _, ok := existingTokens[key]; !ok {
        continue
    }
}

(This is the same code — I mean expressing it as if existingTokens != nil && !containsKey(existingTokens, key) { continue } or a named helper.) Minor, but the nesting makes the "skip" logic easy to miss.


6. internal/ai/session_test.go — persisted session path assertion is fragile

sessionFile := filepath.Join(tmpDir, ".config", "claws", "chat", "sessions", session.ID+".json")

This hard-codes the config-dir convention. If XDG_CONFIG_HOME or OS-specific logic ever shifts the path, the test will silently write to the wrong place and not catch the real path. Consider deriving the expected path from the same function the session manager uses to write the file.


Minor nits

  • hasSensitiveLabelField iterates the whole map to find a label field, then redactSensitiveValue iterates it again. A single pass that collects both would halve the traversal. Low priority.
  • In custom/iam/policies/dao.go, the else branch after a successful version fetch sets PolicyDocumentStatus = enrichment.Fetched twice (once in the if err == nil && ... != nil branch, and once in the naked else). The second assignment is dead code — remove it to avoid confusion:
    if err == nil && versionOutput.PolicyVersion != nil && versionOutput.PolicyVersion.Document != nil {
        res.PolicyDocument = *versionOutput.PolicyVersion.Document
        res.PolicyDocumentStatus = enrichment.Fetched
    } else if err != nil {
        res.PolicyDocumentStatus = enrichment.FailureStatus(err)
    } else {
        res.PolicyDocumentStatus = enrichment.Fetched  // ← same as the first branch; dead path
    }
    The final else covers err == nil && (version == nil || document == nil) — arguably that should be enrichment.NotConfigured or left as Unknown rather than Fetched.

What's Good

  • parseRuleID uses strings.LastIndex — correctly handles partner event bus ARNs like aws.partner/example.com/account/rule-name. The test covers this case explicitly.
  • enrichment.Status as a typed string is a clean, low-ceremony abstraction. The package-level FailureStatus, IsFailure, and Display helpers avoid duplication across IAM and S3 renderers.
  • The VHS → Nix migration removes the Linux-only constraint and simplifies the workflow considerably. The cachix/install-nix-action is pinned to a commit SHA — good CI hygiene.
  • The TestIsSensitiveRawKeyAvoidsSubstringFalsePositives test specifically validates that segment-based matching doesn't over-match (tokenization, credentialsexpiry). This is exactly the right kind of regression test for a fuzzy-matching routine.
  • SageMaker nil guard is a straightforward safety fix with no behaviour change for valid data.

@yimsk yimsk merged commit a8285ce into main May 1, 2026
12 checks passed
tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request May 7, 2026
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [github:clawscli/claws](https://github.com/clawscli/claws) | patch | `0.15.3` → `0.15.4` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>clawscli/claws (github:clawscli/claws)</summary>

### [`v0.15.4`](https://github.com/clawscli/claws/releases/tag/v0.15.4)

[Compare Source](clawscli/claws@v0.15.3...v0.15.4)

#### Changelog

- [`1ad7a57`](clawscli/claws@1ad7a57) Add gopls to dev shell
- [`ea05c51`](clawscli/claws@ea05c51) Address review feedback
- [`45b0ce4`](clawscli/claws@45b0ce4) Bound tag search fetch concurrency
- [`dcdf036`](clawscli/claws@dcdf036) Clarify pagination token filtering
- [`87d4e15`](clawscli/claws@87d4e15) Correct AI chat session docs path
- [`54e37d2`](clawscli/claws@54e37d2) Fix stuck AWS initializing status during profile change
- [`adfd5ba`](clawscli/claws@adfd5ba) Guard SageMaker training S3 URI rendering
- [`c69e666`](clawscli/claws@c69e666) Harden AWS docs search privacy
- [`6aa41e7`](clawscli/claws@6aa41e7) Harden exec actions and pagination
- [`379e4db`](clawscli/claws@379e4db) Harden safety-sensitive AWS operations
- [`c7628dc`](clawscli/claws@c7628dc) Limit pagination to token-bearing pairs
- [`bb97771`](clawscli/claws@bb97771) Merge branch 'main' into dependabot/go\_modules/gopkg.in/ini.v1-1.67.2
- [`a8285ce`](clawscli/claws@a8285ce) Merge pull request [#&#8203;180](clawscli/claws#180) from clawscli/develop
- [`f772864`](clawscli/claws@f772864) Merge pull request [#&#8203;181](clawscli/claws#181) from clawscli/develop
- [`493b81c`](clawscli/claws@493b81c) Merge pull request [#&#8203;182](clawscli/claws#182) from clawscli/dependabot/go\_modules/aws-sdk-98bf9e9202
- [`97f5f31`](clawscli/claws@97f5f31) Merge pull request [#&#8203;183](clawscli/claws#183) from clawscli/dependabot/go\_modules/gopkg.in/ini.v1-1.67.2
- [`40ee108`](clawscli/claws@40ee108) Merge pull request [#&#8203;184](clawscli/claws#184) from clawscli/develop
- [`48fc777`](clawscli/claws@48fc777) Normalize VHS shell prompts
- [`8df4fa6`](clawscli/claws@8df4fa6) Normalize genimports path tests
- [`9adfd42`](clawscli/claws@9adfd42) Preserve sensitive raw data key names
- [`7a5a6d8`](clawscli/claws@7a5a6d8) Redact sensitive AI raw resource data
- [`b0fa729`](clawscli/claws@b0fa729) Refine AI raw data redaction
- [`3898f5e`](clawscli/claws@3898f5e) Remove Claude code review workflows
- [`8eaad51`](clawscli/claws@8eaad51) Run VHS demos with host binary
- [`ddd062f`](clawscli/claws@ddd062f) Simplify enrichment status rendering
- [`0c83fd1`](clawscli/claws@0c83fd1) Track IAM policy enrichment failures
- [`55d033d`](clawscli/claws@55d033d) Track IAM role enrichment failures
- [`118836a`](clawscli/claws@118836a) Track IAM user enrichment failures
- [`dca965a`](clawscli/claws@dca965a) Track S3 enrichment failures
- [`f6b1043`](clawscli/claws@f6b1043) Update Go dependencies
- [`dd3ae9c`](clawscli/claws@dd3ae9c) Use bus-qualified EventBridge rule IDs
- [`f2854d5`](clawscli/claws@f2854d5) deps: bump gopkg.in/ini.v1 from 1.67.1 to 1.67.2
- [`5775c5c`](clawscli/claws@5775c5c) deps: bump the aws-sdk group with 5 updates

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - At any time (no schedule defined)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever MR is behind base branch, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Mend Renovate](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xNTAuMCIsInVwZGF0ZWRJblZlciI6IjQzLjE1MC4wIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiLCJhdXRvbWF0aW9uOmJvdC1hdXRob3JlZCIsImRlcGVuZGVuY3ktdHlwZTo6cGF0Y2giXX0=-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant