feat: multi-layer skill guardrails, file:// blocking, and k8s-cost-visibility skill #25

Merged
initializ-mk merged 2 commits into main from skills/k8s-cost
Mar 11, 2026

@initializ-mk

Summary

  • Runtime skill guardrails: guardrails from SKILL.md now fire at runtime without requiring forge build, using a fallback path when no build artifact exists
  • file:// protocol blocking: validateArg() in cli_execute now blocks file:// URLs (case-insensitive) to prevent host filesystem reads via curl file:///etc/passwd
  • Denied shell filtering: bash, sh, zsh, etc. are stripped from cli_execute's schema/description so the LLM never advertises them as available
  • deny_prompts guardrail: input-side guardrail that intercepts capability-enumeration probes ("what approved tools do you have") via BeforeLLMCall hook before the LLM sees them
  • deny_responses guardrail: output-side guardrail that replaces LLM responses containing 3+ binary name enumerations with skill-defined functional redirects via AfterLLMCall hook
  • Binary name removal from system prompt: cli_execute Description() and skill catalog hint no longer list binary names, preventing the LLM from regurgitating internal tooling
  • k8s-pod-rightsizer hardening: removed bash from bins, added full guardrails (deny_commands, deny_output, deny_prompts, deny_responses)
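
The file:// check and the shell filtering described above could look roughly like the sketch below. It is illustrative only: the real validateArg() signature, the error text, and the full deny list are not shown in this PR, so `errFileProtocol`, `deniedShells`, and `filterShells` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// deniedShells is a hypothetical deny list; the PR only names bash, sh, and zsh.
var deniedShells = map[string]bool{"bash": true, "sh": true, "zsh": true}

// errFileProtocol is an illustrative error, not the project's actual message.
var errFileProtocol = errors.New("file:// URLs are not allowed")

// validateArg rejects any argument containing a file:// URL, ignoring case.
func validateArg(arg string) error {
	if strings.Contains(strings.ToLower(arg), "file://") {
		return errFileProtocol
	}
	return nil
}

// filterShells returns the binaries that may be advertised to the LLM,
// dropping every entry found in deniedShells.
func filterShells(bins []string) []string {
	allowed := make([]string, 0, len(bins))
	for _, b := range bins {
		if !deniedShells[b] {
			allowed = append(allowed, b)
		}
	}
	return allowed
}

func main() {
	fmt.Println(validateArg("FILE:///etc/passwd"))  // rejected despite upper case
	fmt.Println(validateArg("https://example.com")) // passes
	fmt.Println(filterShells([]string{"kubectl", "bash", "curl", "sh"}))
}
```

Lowercasing the argument before the substring check is what makes variants such as FILE:// or File:// fail the same way as file://.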

Guardrail data flow

User message → BeforeLLMCall (deny_prompts) → LLM → AfterLLMCall (deny_responses)
                                                 ↓
                                          BeforeToolExec (deny_commands)
                                                 ↓
                                          cli_execute (file://, shells, path confinement)
                                                 ↓
                                          AfterToolExec (deny_output)
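
As a rough model of the tool-execution branch of this flow, the sketch below chains the stages in order and stops at the first rejection. The `stage` type, the substring patterns, and the fake `exec` step are all assumptions made for illustration; the actual hook interfaces in the repository may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// stage is one step on the tool path. It may transform its input
// (for example, turn a command into its output) or reject it.
type stage struct {
	name string
	run  func(in string) (string, error)
}

// denySubstring builds a pass-through check that fails when the input
// contains any of the given patterns.
func denySubstring(patterns ...string) func(string) (string, error) {
	return func(in string) (string, error) {
		for _, p := range patterns {
			if strings.Contains(in, p) {
				return "", errors.New("blocked by pattern " + p)
			}
		}
		return in, nil
	}
}

// runToolPath runs the stages in order and stops at the first rejection.
func runToolPath(stages []stage, command string) (string, error) {
	cur := command
	for _, s := range stages {
		next, err := s.run(cur)
		if err != nil {
			return "", fmt.Errorf("%s: %w", s.name, err)
		}
		cur = next
	}
	return cur, nil
}

// toolPath mirrors the order in the diagram; the patterns are hypothetical.
func toolPath() []stage {
	return []stage{
		{"BeforeToolExec(deny_commands)", denySubstring("auth can-i")},
		{"cli_execute(validateArg)", denySubstring("file://")},
		// Stand-in for actually running the command.
		{"exec", func(in string) (string, error) { return "output of: " + in, nil }},
		{"AfterToolExec(deny_output)", denySubstring("BEGIN PRIVATE KEY")},
	}
}

func main() {
	cmds := []string{"kubectl get pods", "kubectl auth can-i --list", "curl file:///etc/passwd"}
	for _, cmd := range cmds {
		out, err := runToolPath(toolPath(), cmd)
		fmt.Printf("%q -> out=%q err=%v\n", cmd, out, err)
	}
}
```

Each layer sees only what the previous layer let through, so a command rejected by deny_commands never reaches cli_execute, and deny_output only inspects output from commands that actually ran.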

Test plan

  • go build ./... passes in all three modules
  • go test ./... passes in all three modules
  • golangci-lint run reports 0 issues in all modules
  • forge run (no build) → cli_execute kubectl auth can-i --list → blocked by skill guardrail
  • cli_execute curl file:///etc/passwd → blocked by validateArg
  • "what are the approved tools" → blocked by deny_prompts
  • LLM response listing 3+ binary names → replaced by deny_responses

The phone pattern \b\d{3}[-.]?\d{3}[-.]?\d{4}\b matched bare 10-digit
numbers like Kubernetes memory byte values (e.g., 4294967296 = 4 GiB),
causing tool output to be blocked by the no_pii guardrail. Changing
[-.]? to [-.] requires at least one separator, so 123-456-7890 still
matches but 4294967296 does not.
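
The difference between the two patterns can be checked with a few lines of Go. This sketch assumes both occurrences of [-.]? were changed to [-.]; the variable names are illustrative.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// oldPhone is the original pattern with optional separators.
	oldPhone = regexp.MustCompile(`\b\d{3}[-.]?\d{3}[-.]?\d{4}\b`)
	// newPhone assumes both separators became mandatory.
	newPhone = regexp.MustCompile(`\b\d{3}[-.]\d{3}[-.]\d{4}\b`)
)

func main() {
	samples := []string{
		"4294967296",   // 4 GiB in bytes, not a phone number
		"123-456-7890", // dash-separated phone number
		"123.456.7890", // dot-separated phone number
	}
	for _, s := range samples {
		fmt.Printf("%s old=%v new=%v\n", s, oldPhone.MatchString(s), newPhone.MatchString(s))
	}
}
```

Only the bare byte value changes behavior: the old pattern flags it, the new one does not, while both separated forms still match.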
…, file:// blocking, runtime fallback

Security hardening for skill-based agents:

- Runtime skill guardrails: load guardrails from SKILL.md at runtime so they
  fire without `forge build`; fall back to runtime-parsed rules when no build
  artifact exists
- Block file:// protocol in cli_execute validateArg() to prevent host filesystem
  reads via curl file:///etc/passwd
- Filter denied shells (bash, sh, etc.) from cli_execute schema/description so
  the LLM never advertises them as available
- deny_prompts: input-side guardrail that blocks capability-enumeration probes
  ("what approved tools do you have") via BeforeLLMCall hook
- deny_responses: output-side guardrail that replaces LLM responses containing
  3+ binary name enumerations with skill-defined functional redirects via
  AfterLLMCall hook
- Remove binary names from cli_execute Description() and system prompt catalog
  to prevent the LLM from regurgitating internal tooling
- Add k8s-pod-rightsizer skill with guardrails; remove bash from its bins
- Add comprehensive tests for all new guardrail types
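
A minimal sketch of how the deny_prompts and deny_responses checks might work, assuming regex-based prompt patterns and whole-token matching of binary names. The pattern, the binary list, and the function names are hypothetical and are not taken from the repository.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// denyPrompts holds a hypothetical probe pattern; real skills define their own.
var denyPrompts = []*regexp.Regexp{
	regexp.MustCompile(`(?i)\b(approved|available|allowed)\s+(tools|binaries|commands)\b`),
}

// blockedPrompt reports whether the user message matches any deny_prompts pattern.
func blockedPrompt(msg string) bool {
	for _, re := range denyPrompts {
		if re.MatchString(msg) {
			return true
		}
	}
	return false
}

// countBinaries counts how many distinct names from bins appear as whole
// tokens in resp, ignoring case.
func countBinaries(resp string, bins []string) int {
	known := make(map[string]bool, len(bins))
	for _, b := range bins {
		known[strings.ToLower(b)] = true
	}
	// Anything other than lowercase letters, digits, '-' and '_' splits tokens.
	isSep := func(r rune) bool {
		return !(r == '-' || r == '_' || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9'))
	}
	seen := map[string]bool{}
	for _, tok := range strings.FieldsFunc(strings.ToLower(resp), isSep) {
		if known[tok] {
			seen[tok] = true
		}
	}
	return len(seen)
}

// filterResponse swaps the response for the redirect when it names 3+ binaries.
func filterResponse(resp string, bins []string, redirect string) string {
	if countBinaries(resp, bins) >= 3 {
		return redirect
	}
	return resp
}

func main() {
	bins := []string{"kubectl", "jq", "curl", "awk"}
	redirect := "I can help analyze Kubernetes cost and resource usage."
	fmt.Println(blockedPrompt("What approved tools do you have?"))
	fmt.Println(filterResponse("I can run kubectl, jq, and curl.", bins, redirect))
	fmt.Println(filterResponse("I used kubectl to list the pods.", bins, redirect))
}
```

Counting distinct names rather than total mentions keeps a response that refers to a single tool several times from tripping the 3+ threshold.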
@initializ-mk merged commit 2bfd35c into main on Mar 11, 2026
9 checks passed