5 changes: 3 additions & 2 deletions .claude/commands/ci/analyze-failures.md
Original file line number Diff line number Diff line change
@@ -29,10 +29,11 @@ https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-

**Example Jenkins URLs**:
```
https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/image-consistency-check/3436/
https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/zstreams/job/Stage-Pipeline/1413/
```
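The Prow and Jenkins example URLs above differ by hostname, so the job type can be guessed before any log is fetched. The following is a minimal sketch, not the command's actual detection logic: the function name `classify_ci_url` and the hostname hints are assumptions taken from the example URLs and the prompt text in this file.

```python
from urllib.parse import urlparse

# Hostname fragments are assumptions drawn from the example URLs in this file.
PROW_HOST_HINTS = ("prow.ci.openshift.org", "qe-private-deck")
JENKINS_HOST_HINTS = ("jenkins",)


def classify_ci_url(url: str) -> str:
    """Return "prow", "jenkins", or "unknown" for a CI job URL."""
    host = (urlparse(url).hostname or "").lower()
    if any(hint in host for hint in PROW_HOST_HINTS):
        return "prow"
    if any(hint in host for hint in JENKINS_HOST_HINTS):
        return "jenkins"
    # Truncated or unrecognized URLs fall through so the caller can ask the user.
    return "unknown"
```

Returning `"unknown"` rather than raising keeps the fallback path (asking the user which job type it is) in the caller's hands.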

> **Note:** Image consistency check has been migrated from Jenkins to Prow. Prow job URLs follow the Prow URL pattern above.

## Action to Take

**Step 1**: Analyze the provided URL against the detection rules above.
@@ -54,7 +55,7 @@ https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/zstreams/job/St

Please specify which type of CI job this is:
1. **Prow job** - OpenShift CI Prow jobs (from qe-private-deck or prow.ci.openshift.org)
2. **Jenkins job** - Jenkins jobs (image-consistency-check, stage-testing, etc.)
2. **Jenkins job** - Jenkins jobs (stage-testing, etc.)

Or provide a more complete URL if the one given was truncated.
```
18 changes: 10 additions & 8 deletions .claude/commands/ci/analyze-jenkins-failures.md
@@ -1,17 +1,20 @@
---
description: Analyze Jenkins job failures (image-consistency-check, stage-testing) using AI
description: Analyze Jenkins job failures (stage-testing) using AI
---

You are helping the user analyze failures from a Jenkins job run (typically image-consistency-check or stage-testing jobs used in z-stream release testing).
You are helping the user analyze failures from a Jenkins job run (typically stage-testing jobs used in z-stream release testing).

> **Note:** Image-consistency-check has been migrated from Jenkins to Prow. For Prow job failures, use `/ci:analyze-failures` instead.

The user has provided a Jenkins job URL: {{args}}

## Overview

Jenkins jobs used in OAR z-stream release workflow:
- **image-consistency-check**: Verifies payload image consistency
- **stage-testing** (Stage-Pipeline): Runs E2E tests for optional operators shipped with OpenShift

> **Note:** image-consistency-check has been migrated to Prow and is no longer a Jenkins job.

Each job has its own custom console log format. This command fetches the raw console log and analyzes it based on the job type.

## Steps
@@ -23,7 +26,7 @@ Expected URL patterns:

Extract:
- **Base URL**: e.g., `https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com`
- **Job name**: e.g., `image-consistency-check` or `zstreams/Stage-Pipeline`
- **Job name**: e.g., `zstreams/Stage-Pipeline`
- **Build number**: e.g., `3436`
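
The three fields listed above can be pulled out of a Jenkins build URL with the standard library alone. This is a sketch under assumptions: `parse_jenkins_url` and `JenkinsBuildRef` are hypothetical names, and the path is assumed to follow the `/job/<name>[/job/<name>...]/<build>/` layout seen in the example URLs.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class JenkinsBuildRef(NamedTuple):
    base_url: str
    job_name: str
    build_number: int


def parse_jenkins_url(url: str) -> JenkinsBuildRef:
    """Split a Jenkins build URL into base URL, job name, and build number."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    # Expected shape: job, <name>, [job, <name>, ...], <build number>
    if len(segments) < 3 or not segments[-1].isdigit():
        raise ValueError(f"no build number at end of path: {url}")
    job_segments = segments[:-1]
    if len(job_segments) % 2 != 0 or any(s != "job" for s in job_segments[0::2]):
        raise ValueError(f"unexpected Jenkins job path: {url}")
    # Nested folders such as /job/zstreams/job/Stage-Pipeline join with "/".
    job_name = "/".join(job_segments[1::2])
    base_url = f"{parsed.scheme}://{parsed.netloc}"
    return JenkinsBuildRef(base_url, job_name, int(segments[-1]))
```

URLs that end in a sub-page such as `/console` are rejected by this sketch; a caller would need to trim such suffixes first.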

### 2. Fetch Job Parameters via API
@@ -908,7 +911,7 @@ Present findings in a clear, actionable format:
# Jenkins Job Failure Analysis Summary

## Job Details
- **Type**: {image-consistency-check | stage-testing}
- **Type**: stage-testing
- **Build**: #{build_number}
- **Status**: {status}
- **URL**: {jenkins_url}
@@ -995,7 +998,6 @@ No critical actions required. Monitor for:
**Context**:
- These jobs are part of OAR z-stream release workflow
- Triggered by commands like:
- `oar -r 4.19.1 image-consistency-check`
- `oar -r 4.19.1 stage-testing`
- Failures may block release approval

@@ -1008,11 +1010,11 @@ No critical actions required. Monitor for:
## Example Usage

```bash
/ci:analyze-jenkins-failures https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/image-consistency-check/3436/
/ci:analyze-jenkins-failures https://jenkins-csb-openshift-qe-mastern.dno.corp.redhat.com/job/zstreams/job/Stage-Pipeline/1413/
```

The command will:
1. Fetch console log from public endpoint
2. Detect it's an image-consistency-check job
2. Detect it's a stage-testing job
3. Parse the structured output sections
4. Provide analysis and recommendation
49 changes: 25 additions & 24 deletions .claude/commands/release/drive.md
@@ -253,7 +253,7 @@ For complete step-by-step logic, read **`docs/KONFLUX_RELEASE_FLOW.md`**:

**Async Task Monitoring:**
- Re-execute the same MCP tool to check status
- Example: `oar_image_consistency_check(release, build_number=123)` to check progress
- Example: `oar_image_consistency_check(release, job_id="uuid")` to check progress

## Key Decision Points

@@ -384,7 +384,7 @@ state = oar_get_release_status(release="4.20.1")
"status": "In Progress",
"started_at": "2025-01-15T14:00:00Z",
"completed_at": null,
"result": "Jenkins job #123 triggered..."
"result": "Prow job triggered..."
}
],
"issues": [
@@ -459,17 +459,18 @@ elif task["status"] == "Pass":
Continue to next task

elif task["status"] == "In Progress":
# Check if async task (Jenkins jobs)
# Check if async task (Prow/Jenkins jobs)
if task_name in ["image-consistency-check", "stage-testing"]:
# Extract build number from task result
build_number = extract_from_result(task["result"], r"Build number: (\d+)")
# Extract job ID from task result
# image-consistency-check uses Prow job ID, stage-testing uses Jenkins build number
job_id = extract_from_result(task["result"], r"job ID: (\S+)") or extract_from_result(task["result"], r"Build number: (\d+)")

if not build_number:
Log: f"⚠ {task_name} in progress but no build number found, retrying..."
if not job_id:
Log: f"⚠ {task_name} in progress but no job ID found, retrying..."
Execute task_name
else:
# Query Jenkins job status
result = execute_mcp_tool(task_name, build_number=build_number)
# Query job status (Prow or Jenkins depending on task)
result = execute_mcp_tool(task_name, job_id=job_id)

if "status is changed to [Pass]" in result:
Log: f"✓ {task_name} completed successfully"
@@ -478,7 +479,7 @@ elif task["status"] == "In Progress":
Log: f"✗ {task_name} failed"
STOP pipeline
else:
Log: f"⏳ {task_name} still running (job #{build_number})"
Log: f"⏳ {task_name} still running (job {job_id})"
Ask user to check back later
RETURN
else:
@@ -506,36 +507,36 @@ elif task["status"] == "Fail":

### Async Task Monitoring

**For long-running Jenkins tasks:**
**For long-running async tasks:**

```python
# Initial trigger (when task doesn't exist or has no build number)
# Initial trigger (when task doesn't exist or has no job ID)
result = oar_image_consistency_check(release=release)

if "Build number:" in result:
build_number = extract_build_number(result)
Log: f"⏳ Jenkins job #{build_number} triggered"
if "Prow job" in result:
job_id = extract_job_id(result)
Log: f"⏳ Prow job {job_id} triggered"
Log: "Check back in 20-30 minutes with: /release:drive {release}"
RETURN

# Status check on resume (when task has build number in result)
result = oar_image_consistency_check(release=release, build_number=build_number)
# Status check on resume (when task has job ID in result)
result = oar_image_consistency_check(release=release, job_id=job_id)

if "status is changed to [Pass]" in result:
Log: f"✓ Job #{build_number} completed successfully"
Log: f"✓ Job {job_id} completed successfully"
Continue to next task
elif "status is changed to [Fail]" in result:
# Add issue to StateBox
oar_add_issue(
release=release,
issue=f"image-consistency-check job #{build_number} failed: {extract_failure_reason(result)}",
issue=f"image-consistency-check Prow job {job_id} failed: {extract_failure_reason(result)}",
blocker=True,
related_tasks=["image-consistency-check"]
)
Log: "✗ Job failed, blocker added to StateBox"
STOP pipeline
else:
Log: f"⏳ Job #{build_number} still running..."
Log: f"⏳ Job {job_id} still running..."
RETURN
```
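
The pseudocode above calls `extract_from_result` and `extract_job_id` without defining them. One possible shape for those helpers is sketched below, reusing the two regular expressions that appear in the diff. The exact wording of the tool output (`job ID: ...`, `Build number: ...`) is an assumption based on those patterns, not a confirmed format.

```python
import re
from typing import Optional


def extract_from_result(result: Optional[str], pattern: str) -> Optional[str]:
    """Return the first capture group of pattern in result, or None."""
    match = re.search(pattern, result or "")
    return match.group(1) if match else None


def extract_job_id(result: Optional[str]) -> Optional[str]:
    """Prefer a Prow job ID; fall back to a Jenkins build number."""
    return (
        extract_from_result(result, r"job ID: (\S+)")
        or extract_from_result(result, r"Build number: (\d+)")
    )
```

Both identifiers come back as strings, so a Jenkins build number needs an explicit `int(...)` if the caller expects a number.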

@@ -645,8 +646,8 @@ AI: Resuming from PHASE 2...
AI: ✓ Skipping 2 completed tasks (take-ownership, check-cve-tracker-bug)
AI: ⏳ push-to-cdn-staging still running (job #456)
AI: ✓ Build promoted! Phase: PHASE 3 - Test Evaluation
AI: ⏳ image-consistency-check triggered (job #789)
AI: ⏳ stage-testing triggered (job #790)
AI: ⏳ image-consistency-check triggered (Prow job abc-123-def)
AI: ⏳ stage-testing triggered (Jenkins job #790)
AI: Waiting for test results, check back in 1 hour
```

@@ -657,8 +658,8 @@ AI: Loading StateBox state for 4.20.1...
AI: Resuming from PHASE 4...
AI: ✓ Skipping 4 completed tasks
AI: ✓ push-to-cdn-staging completed (job #456)
AI: ✓ image-consistency-check completed (job #789)
AI: ✓ stage-testing completed (job #790)
AI: ✓ image-consistency-check completed (Prow job abc-123-def)
AI: ✓ stage-testing completed (Jenkins job #790)
AI: Analyzing promoted build test results...
AI: ✓ All tests passed, proceeding to PHASE 5
AI: ✓ image-signed-check completed
2 changes: 1 addition & 1 deletion .claude/skills/release-workflow/SKILL.md
@@ -168,7 +168,7 @@ WHEN trigger phase:
Report: "Task triggered, check status in X minutes"

WHEN check phase:
Execute command with build_number
Execute command with job ID (Prow job ID or Jenkins build number)
IF status == "In Progress":
Report: "Task still running"
ELSE IF status == "Pass":
26 changes: 13 additions & 13 deletions AGENTS.md
@@ -403,23 +403,24 @@ oar -r <release-version> [OPTIONS] COMMAND [ARGS]
**Command:**
```bash
oar -r <release> image-consistency-check
oar -r <release> image-consistency-check -n <build-number>
oar -r <release> image-consistency-check -i <job-id>
```

**Purpose:** Verifies that images in the release payload are consistent with advisory contents.
**Purpose:** Verifies that images in the release payload are consistent with images in the shipment.

**Options:**
- `-n, --build-number` - Jenkins build number to check status (for subsequent runs)
- `-i, --job-id` - Prow job ID to check status (for subsequent runs)

**What it does:**
- Triggers a Jenkins job to verify image consistency
- Compares images in release payload with images in advisories
- Returns build number on first run
- Can check job status on subsequent runs with build number
- Triggers a Prow job via Gangway API to verify image consistency
- Compares images in release payload with images in shipment MR
- Returns Prow job ID on first run
- Can check job status on subsequent runs with job ID
- Requires `APITOKEN` environment variable for Prow authentication

**Workflow:**
1. First run: Triggers job, returns build number
2. Subsequent runs: Check status using `-n <build-number>`
1. First run: Triggers Prow job, returns job ID
2. Subsequent runs: Check status using `-i <job-id>`
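
The two-step workflow above maps onto a single command line that gains an `-i` flag on the second run. The sketch below only builds the argument list; `build_consistency_check_cmd` is a hypothetical helper, and running the result still requires the `oar` CLI and the `APITOKEN` environment variable mentioned above.

```python
from typing import List, Optional


def build_consistency_check_cmd(release: str, job_id: Optional[str] = None) -> List[str]:
    """Build the argv for triggering (no job_id) or checking (with job_id)."""
    cmd = ["oar", "-r", release, "image-consistency-check"]
    if job_id:
        # Subsequent runs pass the Prow job ID returned by the first run.
        cmd += ["-i", job_id]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, capture_output=True, text=True)` without shell quoting.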

---

@@ -792,18 +793,17 @@ All core modules follow a consistent pattern:

**Key Functionality:**
- Trigger stage testing pipeline
- Trigger image consistency check jobs
- Monitor job queue and execution
- Validate job parameters match release version
- Get build status with detailed error handling

**Supported Jobs:**
- `stage-pipeline` - Stage environment testing
- `image-consistency-check` - Verify payload images match advisories

**Note:** Image consistency check has been migrated to Prow (see `prow/job/job.py` `run_image_consistency_check`).

**Key Methods:**
- `call_stage_job()` - Trigger stage testing
- `call_image_consistency_job()` - Trigger image consistency validation
- `get_build_status()` - Check job status by build number
- `is_job_enqueue()` - Check if job is queued

@@ -1124,5 +1124,5 @@ When adding support for new OpenShift versions, update:
2. Job registry configurations
3. Test report templates
4. Add new ci-profile for stage-testing pipeline
5. Add new release version to parameter `VERSION` of image-consistency-check job
5. Update Prow job configuration for image-consistency-check
6. Update configstore config to add new test template doc ID and slack group alias for release leads
9 changes: 5 additions & 4 deletions CLAUDE.md
@@ -201,7 +201,7 @@ The MCP (Model Context Protocol) server (`mcp_server/server.py`) exposes OAR com

**Categories of tools:**
1. **Read-only tools** - Safe query operations (check-greenwave-cvp-tests, check-cve-tracker-bug, image-signed-check, is-release-shipped)
2. **Status check tools** - Query job status (image-consistency-check -n, stage-testing -n)
2. **Status check tools** - Query job status (image-consistency-check -i, stage-testing -n)
3. **Write operations** - Modify state (create-test-report, update-bug-list, take-ownership)
4. **Critical operations** - Production impact (push-to-cdn-staging, change-advisory-status)
5. **Controller tools** - Background agents (start-release-detector, jira-notificator)
@@ -463,7 +463,7 @@ oar -r 4.19.1 update-bug-list

# 4. Verify payload images
oar -r 4.19.1 image-consistency-check
oar -r 4.19.1 image-consistency-check -n <build-number> # Check status
oar -r 4.19.1 image-consistency-check -i <job-id> # Check status

# 5. Validate CVP tests
oar -r 4.19.1 check-greenwave-cvp-tests
@@ -510,8 +510,9 @@ When adding new version support, update:
1. Jira query filters (`oar/notificator/jira_notificator.py`)
2. Job registry configurations
3. Test report templates
4. Jenkins job parameters (stage-testing, image-consistency-check)
5. ConfigStore config (test template doc ID, Slack group alias)
4. Jenkins job parameters (stage-testing)
5. Prow job configuration (image-consistency-check)
6. ConfigStore config (test template doc ID, Slack group alias)

## Authentication Notes

6 changes: 4 additions & 2 deletions Dockerfile
@@ -41,8 +41,10 @@ RUN case ${TARGETARCH} in \
# Install OAR CLI
WORKDIR /usr/src/release-tests
COPY . .
RUN uv pip install --python ${PY_BIN} --system . && \
RUN uv pip install --python ${PY_BIN} --system . ./prow && \
oar --help && \
oarctl --help
oarctl --help && \
job --help && \
jobctl --help

CMD [ "/bin/bash" ]
22 changes: 11 additions & 11 deletions docs/KONFLUX_RELEASE_FLOW.md
@@ -649,11 +649,11 @@ OR

**Purpose:** Verify image consistency across architectures

**MCP Tool:** `oar_image_consistency_check(release, build_number=None)`
**MCP Tool:** `oar_image_consistency_check(release, job_id=None)`

**Input:**
- `release`: Z-stream version
- `build_number`: Optional Jenkins build number (for status check)
- `job_id`: Optional Prow job ID (for status check)

**Prerequisites:**
- Build promotion detected (phase == "Accepted")
@@ -667,10 +667,10 @@ Execute: oar_image_consistency_check(release)

# Possible outcomes:

# Success - Jenkins job triggered:
# Success - Prow job triggered:
stdout contains: "task [Image consistency check] status is changed to [In Progress]"
AND
Capture Jenkins build number from stdout pattern
Capture Prow job ID from stdout pattern

# OR

@@ -696,10 +696,10 @@ IF stage-release pipeline error detected:
RETURN (do not mark as failed - this is a prerequisite wait state)
```

**Phase 2 - Check Status (when build_number available):**
**Phase 2 - Check Status (when job_id available):**
```python
When user invokes /release:drive:
Execute: oar_image_consistency_check(release, build_number={captured_build_number})
Execute: oar_image_consistency_check(release, job_id={captured_job_id})
Check stdout for status update
```
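
Checking stdout for a status update comes down to finding the `status is changed to [...]` marker quoted in Phase 1. A minimal sketch follows, assuming that marker format holds for every status; `parse_task_status` is a hypothetical name.

```python
import re
from typing import Optional

# Matches e.g. "task [Image consistency check] status is changed to [In Progress]"
_STATUS_RE = re.compile(r"status is changed to \[([^\]]+)\]")


def parse_task_status(stdout: Optional[str]) -> Optional[str]:
    """Return the most recent status announced in stdout, or None."""
    matches = _STATUS_RE.findall(stdout or "")
    # If several updates were printed, the last one is the current status.
    return matches[-1] if matches else None
```

A `None` result means no status line was printed, which the caller can treat as "still running" or as an error, depending on context.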

@@ -1177,13 +1177,13 @@ Report to user: "Build promoted (phase: Accepted)! Triggering async tasks now...
oar_image_consistency_check(release="4.20.1")
oar_stage_testing(release="4.20.1")

# Capture Jenkins build numbers
consistency_build = parse_build_number(stdout)
stage_build = parse_build_number(stdout)
# Capture job IDs
consistency_job_id = parse_job_id(stdout) # Prow job ID
stage_build = parse_build_number(stdout) # Jenkins build number

Report to user: """
2 async tasks triggered:
- image-consistency-check (build #{consistency_build})
- image-consistency-check (Prow job ID: {consistency_job_id})
- stage-testing (build #{stage_build})

These tasks are now running in parallel with test result analysis.
@@ -1198,7 +1198,7 @@ RETURN
```python
# First check async task status
oar_push_to_cdn_staging(release="4.20.1") # Check status
oar_image_consistency_check(release="4.20.1", build_number=consistency_build)
oar_image_consistency_check(release="4.20.1", job_id=consistency_job_id)
oar_stage_testing(release="4.20.1", build_number=stage_build)

async_tasks_status = {