feat: cloudgov platform audit — Platform-tenant conformance auditor #8
Merged
Adds the independent-auditor capability (Phase 2): cloudgov now verifies
that deployed nanohype Platform tenants still match the eks-agent-platform
contract. Read-only — the operator enforces; this catches drift, manual
tampering, and reconcile gaps, and produces evidence the fab merge-gate
can cite.
─── Auditor ───
- internal/platform: lists Platform CRs (agents.stxkxs.io/v1alpha1) via a
dynamic client and, for each Ready/Suspended platform, verifies its
tenant namespace against the operator's contract:
  - namespace exists with pod-security.kubernetes.io/enforce=restricted
    and the eks-agent-platform/{platform,tenant,persona} labels
  - tenant-default ResourceQuota and LimitRange present
  - tenant-egress NetworkPolicy present, egress-typed (ingress
    default-deny preserved), applied namespace-wide
  - tenant-runtime ServiceAccount carries the eks.amazonaws.com/role-arn
    annotation matching Platform.status.iamRoleArn
  - spec.identity declares exactly one of allowedModels /
    allowedModelFamilies
Not-yet-Ready platforms are skipped with an informational note.
- internal/cloud/k8s: NewClients returns a typed clientset + dynamic
client from the same kubeconfig chain, so the auditor reads CRs
alongside core objects. No new dependency — reuses client-go.
- internal/cloud: PlatformFinding + PlatformFindingType.
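The namespace and identity checks listed above can be sketched with the standard library alone. This is a minimal illustration, not the auditor's code: the `Namespace` and `Identity` structs, `checkNamespace`, and `identityValid` are hypothetical names, and treating "declares" as "the list is non-empty" is an assumption.

```go
package main

import "fmt"

// Namespace is a hypothetical, simplified stand-in for the fields of a
// Kubernetes Namespace that the check reads.
type Namespace struct {
	Name   string
	Labels map[string]string
}

// Identity mirrors the two spec.identity lists named in the commit message.
type Identity struct {
	AllowedModels        []string
	AllowedModelFamilies []string
}

const pssEnforceLabel = "pod-security.kubernetes.io/enforce"

// requiredLabels are the ownership labels the contract expects on the namespace.
var requiredLabels = []string{
	"eks-agent-platform/platform",
	"eks-agent-platform/tenant",
	"eks-agent-platform/persona",
}

// checkNamespace returns one human-readable problem per violated rule.
func checkNamespace(ns Namespace) []string {
	var problems []string
	if got := ns.Labels[pssEnforceLabel]; got != "restricted" {
		problems = append(problems, fmt.Sprintf("%s=%q, want %q", pssEnforceLabel, got, "restricted"))
	}
	for _, key := range requiredLabels {
		if ns.Labels[key] == "" {
			problems = append(problems, "missing label "+key)
		}
	}
	return problems
}

// identityValid reports whether exactly one of the two lists is populated
// (assumption: an empty list counts as "not declared").
func identityValid(id Identity) bool {
	return (len(id.AllowedModels) > 0) != (len(id.AllowedModelFamilies) > 0)
}

func main() {
	ns := Namespace{Name: "tenant-a", Labels: map[string]string{
		pssEnforceLabel:               "baseline",
		"eks-agent-platform/platform": "demo",
		"eks-agent-platform/tenant":   "tenant-a",
	}}
	for _, p := range checkNamespace(ns) {
		fmt.Println(p)
	}
	fmt.Println("identity valid:", identityValid(Identity{AllowedModels: []string{"model-x"}}))
}
```

Returning a list of problems rather than stopping at the first one matches the read-only, report-everything role the commit message describes.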
─── Surfaces ───
- cmd/platform.go: `cloudgov platform audit`
(--kubeconfig / --output table|json|sarif / --severity), wired into the
--fail-on severity gate.
- output: WritePlatform (JSON), PlatformFindings (table),
WritePlatformSARIF.
- MCP: a 16th tool, platform_audit; AGENTS.md updated.
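A severity gate like the `--fail-on` wiring mentioned above might look roughly like the following. This is a sketch under stated assumptions: the severity names (`low`, `medium`, `high`, `critical`) and the `shouldFail` helper are hypothetical, since the commit message does not list the accepted values.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// severityRank orders the assumed severity names from least to most severe.
var severityRank = map[string]int{"low": 1, "medium": 2, "high": 3, "critical": 4}

// shouldFail reports whether any finding severity is at or above the
// threshold. Unrecognised finding severities rank as 0 and never trip the gate.
func shouldFail(failOn string, severities []string) (bool, error) {
	threshold, ok := severityRank[strings.ToLower(failOn)]
	if !ok {
		return false, fmt.Errorf("unknown severity %q", failOn)
	}
	for _, s := range severities {
		if severityRank[strings.ToLower(s)] >= threshold {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	fail, err := shouldFail("high", []string{"low", "critical"})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(2)
	}
	if fail {
		// A non-zero exit code is what lets a CI step or merge gate react.
		fmt.Println("gate tripped")
		os.Exit(1)
	}
	fmt.Println("gate passed")
}
```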
─── Tests ───
- internal/platform/audit_test.go: fake typed + dynamic clients cover
conformant, drift (missing NetworkPolicy/ServiceAccount, wrong PSS),
namespace-missing, not-ready-skip, and identity-invalid cases.
Verification: go build ./..., go test ./..., go vet ./..., and
golangci-lint v2.12.2 (uncapped) all pass. The binary advertises the
sarif format for platform audit and the MCP server exposes 16 tools
including platform_audit.
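For readers unfamiliar with the sarif format, a minimal SARIF 2.1.0 log built from findings could be sketched as below. The `Finding` struct, the severity-to-level mapping, and `buildSARIF` are hypothetical; the actual `WritePlatformSARIF` implementation is not shown in this PR text. Only the SARIF property names (`version`, `runs`, `tool.driver`, `rules`, `results`, `ruleId`, `level`, `message.text`) come from the SARIF specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Finding is a hypothetical, simplified finding record.
type Finding struct {
	Type     string
	Severity string
	Platform string
	Message  string
}

type sarifMessage struct {
	Text string `json:"text"`
}
type sarifResult struct {
	RuleID  string       `json:"ruleId"`
	Level   string       `json:"level"`
	Message sarifMessage `json:"message"`
}
type sarifRule struct {
	ID string `json:"id"`
}
type sarifDriver struct {
	Name  string      `json:"name"`
	Rules []sarifRule `json:"rules"`
}
type sarifTool struct {
	Driver sarifDriver `json:"driver"`
}
type sarifRun struct {
	Tool    sarifTool     `json:"tool"`
	Results []sarifResult `json:"results"`
}
type sarifLog struct {
	Version string     `json:"version"`
	Runs    []sarifRun `json:"runs"`
}

// sarifLevel maps an assumed severity name onto a SARIF result level.
func sarifLevel(severity string) string {
	switch severity {
	case "critical", "high":
		return "error"
	case "medium":
		return "warning"
	default:
		return "note"
	}
}

// buildSARIF emits one rule per distinct finding type and one result per finding.
func buildSARIF(findings []Finding) sarifLog {
	run := sarifRun{
		Tool:    sarifTool{Driver: sarifDriver{Name: "cloudgov", Rules: []sarifRule{}}},
		Results: []sarifResult{},
	}
	seen := map[string]bool{}
	for _, f := range findings {
		if !seen[f.Type] {
			seen[f.Type] = true
			run.Tool.Driver.Rules = append(run.Tool.Driver.Rules, sarifRule{ID: f.Type})
		}
		run.Results = append(run.Results, sarifResult{
			RuleID:  f.Type,
			Level:   sarifLevel(f.Severity),
			Message: sarifMessage{Text: f.Platform + ": " + f.Message},
		})
	}
	return sarifLog{Version: "2.1.0", Runs: []sarifRun{run}}
}

func main() {
	log := buildSARIF([]Finding{
		{Type: "KILL_SWITCH_DISABLED", Severity: "high", Platform: "demo", Message: "kill switch is off"},
	})
	out, err := json.MarshalIndent(log, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```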
This is the K8s-side slice of the auditor; AWS-side IRSA/KMS/trust-policy
conformance and budget cross-references (SOC2 kill-switch, Platform <=
Tenant compliance strictness) are the next slice.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
This was referenced May 30, 2026
stxkxs added a commit that referenced this pull request on May 30, 2026
…I group
Completes the Platform-tenant auditor with the budget + compliance
cross-resource checks, and corrects the custom-resource API group that
slices #8/#9 hardcoded wrong.
─── Bug fix (affects the already-merged auditor) ───
The Platform and Tenant CRs live in the platform.nanohype.dev API group
and BudgetPolicy in governance.nanohype.dev — NOT agents.stxkxs.io, which
the auditor's platformGVR hardcoded. Against a real cluster,
`platform audit` listed zero Platforms (the GVR matched nothing).
Corrected platformGVR and added the tenant + budget GVRs. The earlier unit
tests passed only because their fixtures used the same wrong group; the
fixtures now use the real groups, so they actually exercise the shipping
GVR.
─── Budget + compliance cross-references ───
- internal/platform: auditBudgetCompliance runs for every Platform (spec
consistency, independent of phase) and reports:
  - spec.budget.name empty or pointing at a BudgetPolicy that doesn't
    exist (BUDGET_POLICY_MISSING)
  - a SOC2 platform whose referenced BudgetPolicy has
    killSwitchEnabled=false (KILL_SWITCH_DISABLED)
  - a Platform less strict than its owning Tenant — Tenant requires
    soc2/hipaa but the Platform doesn't (COMPLIANCE_WEAKER_THAN_TENANT)
  - spec.tenant pointing at a Tenant CR that doesn't exist
    (TENANT_MISSING)
BudgetPolicy is looked up in the Platform's namespace; the Tenant CR is
cluster-scoped. Reuses the dynamic client already threaded through Audit.
- internal/cloud: the four new finding types; SARIF rules extended.
─── Tests ───
- audit_test.go: fixtures corrected to the real API groups and given a
matching BudgetPolicy + Tenant; new cases cover budget-missing,
kill-switch-disabled, and compliance-weaker-than-tenant (10 tests total).
No new dependency.
Verification: go build ./..., go test ./..., go vet ./..., and
golangci-lint v2.12.2 (uncapped) all pass.
This completes Phase 2: the Platform auditor now spans the cluster, AWS
IRSA, and budget/compliance sides of the eks-agent-platform contract.
Co-authored-by: stxkxsbot <275011021+stxkxsbot@users.noreply.github.com>
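The four cross-reference findings described in this commit can be illustrated with plain structs and maps instead of a dynamic client. Everything here is a hedged sketch: `crossCheck`, the struct shapes, the `Compliance` field, and the `namespace/name` map key are assumptions made for illustration, not the shape of `auditBudgetCompliance` or of the real CRs. Only the four finding codes come from the commit message.

```go
package main

import "fmt"

// Platform, BudgetPolicy and Tenant are hypothetical, flattened views of
// the custom resources; the real spec layout is not shown in this PR.
type Platform struct {
	Name, Namespace, BudgetName, TenantName string
	Compliance                              []string
}
type BudgetPolicy struct{ KillSwitchEnabled bool }
type Tenant struct{ Compliance []string }

func contains(list []string, want string) bool {
	for _, v := range list {
		if v == want {
			return true
		}
	}
	return false
}

// crossCheck returns the finding codes that apply to one Platform.
// budgets is keyed by "namespace/name" (BudgetPolicy is namespaced);
// tenants is keyed by name (the Tenant CR is cluster-scoped).
func crossCheck(p Platform, budgets map[string]BudgetPolicy, tenants map[string]Tenant) []string {
	var findings []string

	budget, ok := budgets[p.Namespace+"/"+p.BudgetName]
	switch {
	case p.BudgetName == "" || !ok:
		findings = append(findings, "BUDGET_POLICY_MISSING")
	case contains(p.Compliance, "soc2") && !budget.KillSwitchEnabled:
		findings = append(findings, "KILL_SWITCH_DISABLED")
	}

	tenant, ok := tenants[p.TenantName]
	if !ok {
		return append(findings, "TENANT_MISSING")
	}
	// The Platform must carry every strict regime its Tenant requires.
	for _, regime := range []string{"soc2", "hipaa"} {
		if contains(tenant.Compliance, regime) && !contains(p.Compliance, regime) {
			findings = append(findings, "COMPLIANCE_WEAKER_THAN_TENANT")
			break
		}
	}
	return findings
}

func main() {
	p := Platform{Name: "demo", Namespace: "tenant-a", BudgetName: "default",
		TenantName: "acme", Compliance: []string{"soc2"}}
	budgets := map[string]BudgetPolicy{"tenant-a/default": {KillSwitchEnabled: false}}
	tenants := map[string]Tenant{"acme": {Compliance: []string{"soc2", "hipaa"}}}
	fmt.Println(crossCheck(p, budgets, tenants))
}
```

Because these checks read only spec fields, they can run for every Platform regardless of phase, which is the property the commit message calls out.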
See the commit message for full detail. First slice of Phase 2 (the independent auditor) from the approved plan.
Summary
- `cloudgov platform audit` — reads every `Platform` CR (agents.stxkxs.io/v1alpha1) and verifies each tenant's live K8s state against the eks-agent-platform contract: namespace + PSS=restricted + ownership labels, `tenant-default` ResourceQuota/LimitRange, `tenant-egress` NetworkPolicy (egress-typed, namespace-wide), and `tenant-runtime` ServiceAccount IRSA annotation vs `status.iamRoleArn`, plus the `spec.identity` model-list invariant. Read-only — the operator enforces; this catches drift.
- `--fail-on` gate, table/JSON/SARIF output, and a new `platform_audit` MCP tool (16th). No new dependency — uses the existing client-go (typed + dynamic clients).
Verification
`go build`, `go test ./...`, `go vet`, golangci-lint v2.12.2 (uncapped) all pass. Binary advertises `sarif` for platform audit; MCP exposes 16 tools incl. `platform_audit`.
Next slice
AWS-side conformance (IRSA role / baseline Bedrock policy / suspension tags / KMS grant / trust policy) and budget cross-refs (SOC2 kill-switch, Platform ≤ Tenant compliance).