-
Notifications
You must be signed in to change notification settings - Fork 49
[WIP] MG-108: evals for plan_mustgather #162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-audit-logs | ||
| difficulty: easy | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing must-gather resources from previous runs | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Find must-gather pod and verify it uses gather_audit_logs command | ||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||
| if [ -z "$POD_JSON" ]; then | ||
| echo "No must-gather pod found" | ||
| exit 1 | ||
| fi | ||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||
| # Verify pod uses gather_audit_logs command | ||
| kubectl get pod "$POD" -n "$NS" -o yaml | grep -q "gather_audit_logs" | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete all must-gather namespaces and clusterrolebindings | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: I need to collect audit logs from my OpenShift cluster for a security investigation. Please set up a must-gather that specifically gathers audit logs and apply it. | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-custom-images | ||
| difficulty: medium | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing must-gather resources from previous runs | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Find must-gather pod and verify it uses custom images | ||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||
| if [ -z "$POD_JSON" ]; then | ||
| echo "No must-gather pod found" | ||
| exit 1 | ||
| fi | ||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||
| # Verify pod uses the logging operator image | ||
| kubectl get pod "$POD" -n "$NS" -o yaml | grep -q "cluster-logging-rhel9-operator" | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete all must-gather namespaces and clusterrolebindings | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: I'm debugging logging issues on my OpenShift 4.15 cluster. Set up and apply a must-gather using both the platform image registry.redhat.io/openshift4/ose-must-gather:v4.15 and the logging operator image registry.redhat.io/openshift-logging/cluster-logging-rhel9-operator:latest. | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,40 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-custom-namespace | ||
| difficulty: easy | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing resources | ||
| kubectl delete ns my-debug-namespace --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep my-debug-namespace-must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Verify resources exist in the custom namespace | ||
| kubectl get ns my-debug-namespace | ||
| kubectl get sa must-gather-collector -n my-debug-namespace | ||
| kubectl get pod -n my-debug-namespace -o name | grep -q must-gather | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete the custom namespace and clusterrolebinding | ||
| kubectl delete ns my-debug-namespace --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep my-debug-namespace-must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: I want to run a must-gather but need it in a specific namespace called "my-debug-namespace" due to our cluster policies. Can you set this up and apply it? | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-default | ||
| difficulty: easy | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing must-gather resources from previous runs | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Find must-gather pod and verify it exists with default image | ||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||
| if [ -z "$POD_JSON" ]; then | ||
| echo "No must-gather pod found" | ||
| exit 1 | ||
| fi | ||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||
| # Verify pod uses default must-gather image | ||
| kubectl get pod "$POD" -n "$NS" -o yaml | grep -q "ose-must-gather" | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete all must-gather namespaces and clusterrolebindings | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: I need to collect diagnostic data from my OpenShift cluster for a support case. Can you help me plan and apply a must-gather collection? | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-host-network | ||
| difficulty: easy | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing must-gather resources from previous runs | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Find must-gather pod and verify it has hostNetwork enabled | ||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||
| if [ -z "$POD_JSON" ]; then | ||
| echo "No must-gather pod found" | ||
| exit 1 | ||
| fi | ||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||
| # Verify pod has hostNetwork: true | ||
| kubectl get pod "$POD" -n "$NS" -o yaml | grep -q "hostNetwork: true" | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete all must-gather namespaces and clusterrolebindings | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: I'm troubleshooting network connectivity issues on my cluster. Set up and apply a must-gather that uses host networking to capture network-level diagnostics, and keep the resources around after collection so I can inspect them. | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 |
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,46 @@ | ||||||||||||||||||||||||||||||
| kind: Task | ||||||||||||||||||||||||||||||
| metadata: | ||||||||||||||||||||||||||||||
| labels: | ||||||||||||||||||||||||||||||
| suite: openshift | ||||||||||||||||||||||||||||||
| name: plan-mustgather-node-selector | ||||||||||||||||||||||||||||||
| difficulty: medium | ||||||||||||||||||||||||||||||
| steps: | ||||||||||||||||||||||||||||||
| setup: | ||||||||||||||||||||||||||||||
| inline: |- | ||||||||||||||||||||||||||||||
| #!/usr/bin/env bash | ||||||||||||||||||||||||||||||
| set -euo pipefail | ||||||||||||||||||||||||||||||
| # Cleanup any existing must-gather resources from previous runs | ||||||||||||||||||||||||||||||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||||||||||||||||||||||||||||||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||||||||||||||||||||||||||||||
| verify: | ||||||||||||||||||||||||||||||
| inline: |- | ||||||||||||||||||||||||||||||
| #!/usr/bin/env bash | ||||||||||||||||||||||||||||||
| set -euo pipefail | ||||||||||||||||||||||||||||||
| # Find must-gather pod and verify it has the correct nodeSelector | ||||||||||||||||||||||||||||||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||||||||||||||||||||||||||||||
| if [ -z "$POD_JSON" ]; then | ||||||||||||||||||||||||||||||
| echo "No must-gather pod found" | ||||||||||||||||||||||||||||||
| exit 1 | ||||||||||||||||||||||||||||||
| fi | ||||||||||||||||||||||||||||||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||||||||||||||||||||||||||||||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||||||||||||||||||||||||||||||
|
Comment on lines
+20
to
+26
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Potential JSON parsing failure when multiple pods match. When multiple pods match the namespace prefix filter, Consider using Proposed fix- POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))')
+ POD_JSON=$(kubectl get pods -A -o json | jq '[.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))] | first')
if [ -z "$POD_JSON" ]; then
echo "No must-gather pod found"
exit 1
fi
- NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1)
- POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1)
+ NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace')
+ POD=$(echo "$POD_JSON" | jq -r '.metadata.name')Note: This same pattern appears in all other task files in this PR. 📝 Committable suggestion
Suggested change
🤖 Prompt for AI Agents |
||||||||||||||||||||||||||||||
| # Verify pod has worker node selector | ||||||||||||||||||||||||||||||
| kubectl get pod "$POD" -n "$NS" -o yaml | grep -q "node-role.kubernetes.io/worker" | ||||||||||||||||||||||||||||||
| cleanup: | ||||||||||||||||||||||||||||||
| inline: |- | ||||||||||||||||||||||||||||||
| #!/usr/bin/env bash | ||||||||||||||||||||||||||||||
| set -euo pipefail | ||||||||||||||||||||||||||||||
| # Delete all must-gather namespaces and clusterrolebindings | ||||||||||||||||||||||||||||||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||||||||||||||||||||||||||||||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||||||||||||||||||||||||||||||
| prompt: | ||||||||||||||||||||||||||||||
| inline: I need to collect diagnostics but only want the must-gather pod to run on worker nodes, not on control plane nodes. The node selector should be node-role.kubernetes.io/worker. Can you set this up and apply it? | ||||||||||||||||||||||||||||||
| assertions: | ||||||||||||||||||||||||||||||
| promptsUsed: | ||||||||||||||||||||||||||||||
| - server: kubernetes | ||||||||||||||||||||||||||||||
| prompt: plan_mustgather | ||||||||||||||||||||||||||||||
| toolsUsed: | ||||||||||||||||||||||||||||||
| - server: kubernetes | ||||||||||||||||||||||||||||||
| tool: resource_create_or_update | ||||||||||||||||||||||||||||||
| minToolCalls: 1 | ||||||||||||||||||||||||||||||
| maxToolCalls: 15 | ||||||||||||||||||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| kind: Task | ||
| metadata: | ||
| labels: | ||
| suite: openshift | ||
| name: plan-mustgather-timeout-since | ||
| difficulty: medium | ||
| steps: | ||
| setup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Cleanup any existing must-gather resources from previous runs | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| verify: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Find must-gather pod and verify it has timeout and since configured | ||
| POD_JSON=$(kubectl get pods -A -o json | jq -r '.items[] | select(.metadata.namespace | startswith("openshift-must-gather"))') | ||
| if [ -z "$POD_JSON" ]; then | ||
| echo "No must-gather pod found" | ||
| exit 1 | ||
| fi | ||
| NS=$(echo "$POD_JSON" | jq -r '.metadata.namespace' | head -1) | ||
| POD=$(echo "$POD_JSON" | jq -r '.metadata.name' | head -1) | ||
| POD_YAML=$(kubectl get pod "$POD" -n "$NS" -o yaml) | ||
| # Verify pod has MUST_GATHER_SINCE env var | ||
| echo "$POD_YAML" | grep -q "MUST_GATHER_SINCE" | ||
| # Verify pod has timeout in command | ||
| echo "$POD_YAML" | grep -q "/usr/bin/timeout" | ||
| cleanup: | ||
| inline: |- | ||
| #!/usr/bin/env bash | ||
| set -euo pipefail | ||
| # Delete all must-gather namespaces and clusterrolebindings | ||
| kubectl get ns -o name 2>/dev/null | grep openshift-must-gather | xargs -r kubectl delete --wait=false 2>/dev/null || true | ||
| kubectl get clusterrolebinding -o name 2>/dev/null | grep must-gather-collector | xargs -r kubectl delete 2>/dev/null || true | ||
| prompt: | ||
| inline: My cluster had an incident about an hour ago. I need a must-gather that only collects logs and data from the last 2 hours to keep the archive small. Also set a 30 minute timeout so it doesn't run forever. Set this up and apply it. | ||
| assertions: | ||
| promptsUsed: | ||
| - server: kubernetes | ||
| prompt: plan_mustgather | ||
| toolsUsed: | ||
| - server: kubernetes | ||
| tool: resource_create_or_update | ||
| minToolCalls: 1 | ||
| maxToolCalls: 15 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verification only checks one of the two required images.
The prompt specifies two images should be configured (
ose-must-gather:v4.15andcluster-logging-rhel9-operator:latest), but the verification only checks for the logging operator image. This could lead to false positives if only one image is configured.Consider verifying both images
Also applies to: 36-37
🤖 Prompt for AI Agents