Skip to content

product-lifecycle: Add plc_lookup.py CLI, migrate to v2 API#13

Open
harche wants to merge 1 commit into
openshift:mainfrom
harche:skills/cluster-update-plc-script
Open

product-lifecycle: Add plc_lookup.py CLI, migrate to v2 API#13
harche wants to merge 1 commit into
openshift:mainfrom
harche:skills/cluster-update-plc-script

Conversation

@harche

@harche harche commented Apr 21, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Replaces inline curl/python one-liners in product-lifecycle SKILL.md with a standalone Python CLI (plc_lookup.py) wrapping the Red Hat Product Life Cycle API v2
  • Addresses review feedback from PR OTA-1963: skills: Add cluster-update skills (update-advisor, product-lifecycle) #6: v1→v2 migration, all 5 support status types, drop PLCC acronym, proper error handling
  • Adds pagination (--limit, --offset) for broad queries
  • Includes 46 tests (unit with mocked API + integration against live API)
  • Updates update-advisor SKILL.md cross-references

Builds on top of #6 (that PR stays open).

CLI usage

python3 cluster-update/product-lifecycle/scripts/plc_lookup.py products "logging for Red Hat OpenShift"
python3 cluster-update/product-lifecycle/scripts/plc_lookup.py products "OpenShift" --ocp 4.21 --limit 5
python3 cluster-update/product-lifecycle/scripts/plc_lookup.py olm-check --ocp 4.21 --operators '[{"package":"cluster-logging"}]'

Test plan

  • All 46 tests pass locally (python3 -m unittest — 19 unit + 17 integration against live API + 10 pagination)
  • Verify plc_lookup.py -h shows clean help output
  • Verify script works from repo root path
  • Verify SKILL.md paths resolve correctly

🤖 Generated with Claude Code

@openshift-ci openshift-ci Bot requested review from DavidHurta and mrunalp April 21, 2026 22:26
@openshift-ci

openshift-ci Bot commented Apr 21, 2026

Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: harche

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 21, 2026
@harche

harche commented Apr 22, 2026

Copy link
Copy Markdown
Contributor Author

/hold

will look into this after #6

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 22, 2026
@harche harche force-pushed the skills/cluster-update-plc-script branch from ce82a3e to b0a3f09 Compare May 29, 2026 15:24
@harche

harche commented May 29, 2026

Copy link
Copy Markdown
Contributor Author

Testing Results

Eval Framework (Agent-in-the-loop)

Ran 12 skill evals via the agentic-skills eval framework using Claude on Vertex AI. Test queries use the exact prompt format CVO proposals send to the agent, with real OLM operator data and readiness JSON.

product-lifecycle (7 evals)

Test Case Query Expected Result
plc_proposal_olm_batch_check 5-operator batch olm-check from proposal context 5 checked, 3 unavailable
plc_cluster_logging_supported cluster-logging v6.5 lifecycle + OCP 4.21 compat supported, compatible
plc_web_terminal_compat_check web-terminal supported version for OCP 4.21 no supported version for 4.21
plc_compliance_operator_status compliance-operator v1.9 lifecycle status found, supported
plc_ocp_platform_status OCP 4.21 platform lifecycle supported
plc_ocp_old_version_extended OCP 4.14 lifecycle phase extended
plc_batch_known_operators_only 2-operator batch check (cluster-logging + web-terminal) both found, neither EOL

update-advisor (5 evals)

Each test sends a complete readiness JSON (same format CVO produces) and verifies the agent's upgrade decision:

Test Case Scenario Expected Decision Result
advisor_healthy_cluster_recommend All 9 checks pass, no issues recommend
advisor_degraded_operator_caution 1 degraded operator (authentication) caution
advisor_api_deprecation_block Removed API with 1250 active requests block
advisor_etcd_unhealthy_block 1 of 3 etcd members not ready block
advisor_errored_checks_escalate 7 of 9 readiness checks errored (API unreachable) escalate

All 12 agent evals + 46 unit/integration tests passed.

End-to-End with CVO Readiness Data

Tested the full proposal → readiness → skill pipeline using a custom CVO build from openshift/cluster-version-operator#1395 deployed to a live OCP 4.21.5 cluster on GCP (6 nodes).

Installed OLM operators for testing:

  • cluster-logging v6.5.1 (channel: stable-6.5)
  • compliance-operator v1.9.0 (channel: stable)
  • openshift-pipelines-operator-rh v1.22.0 (channel: latest)
  • web-terminal v1.16.0 (channel: fast)
  • devworkspace-operator v0.41.0 (channel: fast)

Flow verified:

  1. CVO creates proposals with readiness JSON containing olm_operator_lifecycle data for all 5 operators
  2. The update-advisor skill analyzes readiness data and classifies findings per its decision policy
  3. The update-advisor calls product-lifecycle to cross-check operators against the PLC API
  4. plc_lookup.py olm-check correctly returns lifecycle data for cluster-logging and web-terminal, reports "unavailable" for compliance-operator, openshift-pipelines, and devworkspace-operator
  5. plc_lookup.py products finds compliance-operator by product name even when olm-check can't match by package name

Cluster readiness summary (from proposal ota-4-21-5-to-4-21-16):

  • 9/9 readiness checks passed in 2.56s
  • 34 ClusterOperators healthy, 6 nodes ready, 3/3 etcd members
  • 207 CRDs (0 version issues), 21 PDBs (0 blocking)
  • OVNKubernetes, TLS Intermediate, no deprecated APIs
  • 5 OLM operators detected with versions, channels, and install status

Ground truth verification

All expected values in evals were verified against the live PLC v2 API:

$ plc_lookup.py products "Red Hat OpenShift Container Platform" --ocp 4.21  →  4.21: supported
$ plc_lookup.py products "Red Hat OpenShift Container Platform" --ocp 4.14  →  4.14: extended
$ plc_lookup.py products "logging for Red Hat OpenShift" --ocp 4.21         →  6.5: supported, ocp_compatible=true
$ plc_lookup.py products "compliance operator"                              →  v1.9: supported
$ plc_lookup.py olm-check --ocp 4.21 --operators '[5 operators]'           →  2 found, 3 unavailable

🤖 Generated with Claude Code

@harche

harche commented May 29, 2026

Copy link
Copy Markdown
Contributor Author

/cc @wking

@openshift-ci openshift-ci Bot requested a review from wking May 29, 2026 15:33
@harche harche force-pushed the skills/cluster-update-plc-script branch from 7e43631 to eace3b2 Compare May 29, 2026 15:41
@harche

harche commented May 29, 2026

Copy link
Copy Markdown
Contributor Author

/hold improving evals with better data.

@harche harche force-pushed the skills/cluster-update-plc-script branch 4 times, most recently from f01f289 to 81bf078 Compare May 29, 2026 16:38
@harche harche marked this pull request as draft May 29, 2026 17:46
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2026
@harche

harche commented May 29, 2026

Copy link
Copy Markdown
Contributor Author

Converting to draft, I need to make a few more changes before it is ready for the review.

@harche harche force-pushed the skills/cluster-update-plc-script branch from 81bf078 to 0544f37 Compare May 29, 2026 17:53
@harche harche marked this pull request as ready for review May 29, 2026 18:38
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2026
@harche

harche commented May 29, 2026

Copy link
Copy Markdown
Contributor Author

Converting to draft, I need to make a few more changes before it is ready for the review.

back to open.

@openshift-ci openshift-ci Bot requested review from Cali0707 and PratikMahajan May 29, 2026 18:38
@harche harche force-pushed the skills/cluster-update-plc-script branch from 0544f37 to ef3bf25 Compare June 1, 2026 20:38
@harche

harche commented Jun 2, 2026

Copy link
Copy Markdown
Contributor Author

Cross-PR testing

Tested end-to-end with:

All three deployed together on OCP 4.21.5 cluster. The product-lifecycle skill (plc_lookup.py) successfully runs inside the agent sandbox, querying the PLCC API and populating operator lifecycle data in the ota_olm_operator_status component.

The skill path fix (cluster-update/...product-lifecycle/...) is required for the sandbox mount paths to resolve correctly.

@harche

harche commented Jun 2, 2026

Copy link
Copy Markdown
Contributor Author

/override ci/prow/eval

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Copy link
Copy Markdown

@harche: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

expected:
decision: "caution"
blockers_found: 0
degraded_operator_detected: true

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how specific do we want to make these eval checks? E.g. you currently have a boolean here. But you could raise the bar a bit for the AI engine being evaluated by requiring an array of degraded operator names. Or further by requiring an array of degraded operator names and propogation of the reason and message (OAuthFlaky and oauth pods intermittently failing for the authentication information in the fixture prompt, for example). Or you could raise the bar even higher by requiring AI to return some suggested next-step that goes beyond the information we already knew when generating the input query. Is the goal here smoke-testing, or high-fidelity spec -> status plumbing, or AI-adds-value-beyond-what-the-classical-bots-already-knew, or what?


def _run_cli(self, *args):
result = subprocess.run(
[sys.executable, SCRIPT_PATH, *args],

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you're already importing plc_lookup to feed tests like self.assertEqual(plc_lookup.normalize_status("Full Support"), "supported"). We could reroll plc_lookup to replace the current main() and args = parser.parse_args() with:

def main(args=None):
    # ...
    args = parser.parse_args(args=args, output=sys.stdout)
    # ...

so this test harness can call:

output = io.StringIO()
plc_lookup.main(args=['products', 'logging for Red Hat OpenShift'], output=io)

or some such, without needing to bother with SCRIPT_PATH magic or subprocess execution.

Doesn't need to happen in this pull though, or maybe ever. I'm just floating as a potential simplification that might make future maintenance easier, and we can always come back to this in the future if SCRIPT_PATH gives us trouble.


def test_unknown_status(self):
self.assertEqual(plc_lookup.normalize_status("Something New"), "unknown")
self.assertEqual(plc_lookup.normalize_status(""), "unknown")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems like this risks us conflating two unknown statuses? Maybe normalize_status should opaquely pass through any unrecognized status strings?

def test_case_insensitive(self):
version = SAMPLE_PRODUCT["versions"][0]
result = plc_lookup.extract_phase_date(version, "general availability")
self.assertIsNotNone(result)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, I'd have expected the case-insensitive lookup to return the same result as the General availability test a few cases earlier. Are we asserting that phase casing is important for some reason?


class TestFormatProductVersion(unittest.TestCase):
def test_without_target(self):
result = plc_lookup.format_product_version(SAMPLE_PRODUCT, SAMPLE_PRODUCT["versions"][0])

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's a bit strange to me to have two arguments where the later is a subset of the data in the former. It seems like we'd reduce the space for inconsistencies if we had one argument for the whole product structure, and the second argument with just the information needed to look up the version we wanted formatted (and not the entire version structure we wanted formatted). Like format_product_version(product=SAMPLE_PRODUCT, version_index=0) or format_product_version(product=SAMPLE_PRODUCT, version_name=SAMPLE_PRODUCT['versions'][0].name) or some such. No need to block merging over this, just pointing it out in case it makes sense to you, or in case someone else wants to reroll to use that approach in the future.

items = list(range(10))
page, meta = plc_lookup.paginate(items, limit=3, offset=3)
self.assertEqual(page, [3, 4, 5])
self.assertEqual(meta["next_offset"], 6)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: do we want to assert total and returned here too, like we did for test_limit and earlier? Seems easy to just continue that pattern on? Or if we want to make it even harder to forget to check some aspect of the response, we can pivot all of these to self.assertEqual(meta, $FIXME_HAND_CRAFT_ENTIRE_META_DICT) so the tests fail if meta includes a new key we forgot to add test-coverage for.

self.assertEqual(ret, 0)
self.assertEqual(output["total"], 2)
self.assertEqual(output["returned"], 2)
self.assertEqual(output["results"][0]["version"], "6.5")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably worth an self.assertEqual(output['results'][0]['product'], 'logging for Red Hat OpenShift') or some such here too, to confirm that the result we're looking at is the expected SAMPLE_PRODUCT entry without having to rely on the 6.5 version. Maybe drop the 6.5 check and use:

for result in output['results']:
    self.assertEqual(result['product'], 'logging for Red Hat OpenShift')`

since what we're trying to excercise in this test_found case is the name-based lookup?

self.assertEqual(ret, 0)
self.assertEqual(output["operators_checked"], 1)
self.assertEqual(output["lifecycle_unavailable"], [])
self.assertGreater(len(output["results"]), 0)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why assertGreater here instead of being more clear on the specifics of the results we expect? We should be able to assertEqual the entire output string, right?

self.assertEqual(r.returncode, 0, r.stderr)
output = json.loads(r.stdout)
self.assertGreater(output["total"], 0)
self.assertEqual(output["results"][0]["package"], "cluster-logging")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if we're allowing >0 results, shouldn't we be iterating over results to confirm that at least one of the results is cluster-logging? That way this test keeps passing if the lifecycle data picks up some other product that matches logging but which isn't the OLM-installed operator we're looking for and that other product happens to come first.

self.assertEqual(r.returncode, 0, r.stderr)
output = json.loads(r.stdout)
names = {entry["product"] for entry in output["results"]}
self.assertTrue(any("OpenShift" in n for n in names))

@wking wking Jun 24, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=Openshift+Container+Platform' | jq -c '.data[] | {name, former_names}'
{"name":"Red Hat OpenShift Container Platform","former_names":["OpenShift Container Platform"," OpenShift Container Platform 3"," OpenShift Container Platform 4"]}

so at the moment, only the one hit. Can we assert that at least one entry has name or a former_names entry that exactly matches Red Hat OpenShift Container Platform instead of using the weaker OpenShift search? Something like:

names = []
for result in output['results']:
    names.append(result.get('product'))
    names.extend(results.get('former_names', [])
self.assertIn('Red Hat OpenShift Container Platform', names)

or are we not preserving former_names through to the results structure?

self.assertEqual(r.returncode, 0, r.stderr)
output = json.loads(r.stdout)
compatible = [e for e in output["results"] if e.get("ocp_compatible")]
self.assertEqual(len(compatible), 0, "No logging version should be compatible with OCP 3.11")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: slightly easier to debug this failing if we include some hints about the surprisingly-compatible versions, like:

self.assertEqual(compatible, [], 'No logging version should be compatible with OCP 3.11')

def test_subcommand_help(self):
for cmd in ["products", "olm-check"]:
r = self._run_cli(cmd, "-h")
self.assertEqual(r.returncode, 0, f"{cmd} -h failed")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if we want to assert something useful about the content of the subcommand help, we could use something like:

self.assertIn('Max results to return (0 = all, default: all)', r.stdout)

@@ -0,0 +1,190 @@
#!/usr/bin/env python3

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not aware of a commitment from OLS that whatever image they use to back their sandbox Pod will always have python3. But 🤷 , that's what end-to-end testing is for, wherever that is happening.



def normalize_status(raw_type):
return STATUS_MAP.get(raw_type, "unknown")

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not clear on the value of a 1:1 mapping. See here on my concerns around the unknown default vs. passing through the raw type. And checking the production API now, there doesn't seems to be much diversity like case-inconsistency to collapse:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -r '.data[].versions[].type' | sort | uniq -c | sort -n
     12 
     43 End of Maintenance
    116 Extended Support
    173 Maintenance Support
    263 Full Support
    891 End of life

Can we use the raw types, instead of inventing our own strings? Or are we concerned that the upstream consistency will break down over time, or that the upstream API will introduce other new strings that we want to collapse into a smaller set of our strings? Or...?

"ga_date": extract_phase_date(version, "General availability"),
"full_support_end": extract_phase_date(version, "Full support"),
"maintenance_end": extract_phase_date(version, "Maintenance support"),
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't we need the other phases too, e.g. Extended update support and Extended update support Term 2?:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=fence+agents+remediation' | jq -c '.data[].versions[0].phases[]'
{"name":"General availability","start_date":"N/A","end_date":"2026-06-16T00:00:00.000Z","start_date_format":"string","end_date_format":"date","additional_text":""}
{"name":"Full support","start_date":"2026-06-16T00:00:00.000Z","end_date":"2026-12-31T00:00:00.000Z","start_date_format":"date","end_date_format":"date","additional_text":""}
{"name":"Maintenance support","start_date":"2027-01-01T00:00:00.000Z","end_date":"2027-12-31T00:00:00.000Z","start_date_format":"date","end_date_format":"date","additional_text":""}
{"name":"Extended update support","start_date":"2028-01-01T00:00:00.000Z","end_date":"2028-06-30T00:00:00.000Z","start_date_format":"date","end_date_format":"date","additional_text":""}
{"name":"Extended update support Term 2","start_date":"2028-07-01T00:00:00.000Z","end_date":"2029-06-30T00:00:00.000Z","start_date_format":"date","end_date_format":"date","additional_text":""}


def paginate(results, limit, offset):
total = len(results)
start = min(offset, total)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm not all that clear on https://access.redhat.com/product-life-cycles/api/v2/products order commitments (do they commit to stable ordering? Do they commit to any particular order like "alphabetical by current product name"? What happens if a product gets renamed? Does the caller here really care about limiting results for pagination? Do we have a situation where we'd recommend the consuming AI agent actually paginate the results? Maybe we can just drop the pagination feature as something with unclear value, return all the results every time, and recommend in the skill Markdown that the agent try to be specific with their queries? Or maybe we can drop --offset, keep --limit, and explain in the skill that if the response shows --limit mattered, that the AI should look at the returned product names and try to use that context to try again with a more specific name for the products it was hoping to retrieve?

# Cluster Update Advisor

## Purpose
## 1. Purpose

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is restoring the numbered section headings I'd dropped in b3b6cd2 (#21). Do you not buy my motivation for that change?

product-lifecycle skills for deeper analysis.
Produce structured risk report

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is restoring the triple-backticked block I'd converted to Markdown sections in b3b6cd2 (#21). Do you not buy my motivation for that change?

operators = json.loads(args.operators)
target = args.ocp

batch = api_search("OpenShift")

@wking wking Jun 24, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, this seems like a risky way to try and find the subset of products that includes package. For example:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -c '.data[] | select(.package) | {name, former_names, package}' | grep -v OpenShift
{"name":"AMQ broker","former_names":["Red Hat AMQ","Red Hat AMQ Broker"],"package":"amq-broker-rhel8"}
{"name":"AMQ Broker Operator (RHEL8)","former_names":[],"package":"amq-broker-rhel8"}
{"name":"AMQ Broker Operator (RHEL9)","former_names":[],"package":"amq-broker-rhel9"}
{"name":"AMQ interconnect","former_names":["Red Hat AMQ Interconnect"],"package":"amq7-interconnect-operator"}
...
$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -c '.data[] | select(.package) | {name, former_names, package}' | grep -v OpenShift | wc -l
122

vs. only 9 with the name=OpenShift product filter:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=OpenShift' | jq -c '.data[] | select(.package) | {name, former_names, package}'{"name":"builds for Red Hat OpenShift","former_names":["Builds for Red Hat OpenShift"],"package":"openshift-builds-operator"}
{"name":"custom metrics autoscaler","former_names":["Custom Metric Autoscaler operator for Red Hat OpenShift"],"package":"openshift-custom-metrics-autoscaler-operator"}
{"name":"kernel module management","former_names":["Kernel Module Management operator for Red Hat OpenShift"],"package":"kernel-module-management"}
{"name":"logging for Red Hat OpenShift","former_names":["Red Hat OpenShift Logging"],"package":"cluster-logging"}
{"name":"OpenShift APIs for Data Protection","former_names":["OpenShift API for Data Protection"],"package":"redhat-oadp-operator"}
{"name":"Red Hat OpenShift AI Self-Managed","former_names":["Red Hat OpenShift Data Science self-managed"],"package":"rhods-operator"}
{"name":"Red Hat OpenShift Data Foundation","former_names":["OpenShift Container Storage 4","Red Hat OpenShift Data Foundation (formerly OpenShift Container Storage 4)"],"package":"odf-operator"}
{"name":"Red Hat OpenShift intelligent assistant","former_names":["Red Hat OpenShift Lightspeed"],"package":"lightspeed-operator"}
{"name":"Red Hat OpenShift support for Windows Containers","former_names":["Red Hat OpenShift support for Windows Containers","Red Hat OpenShift for Windows Containers"],"package":"windows-machine-config-operator"}
{"name":"web terminal operator","former_names":["Red Hat OpenShift Web Terminal Operator","Web Terminal Operator"],"package":"web-terminal"}

for p in batch:
pkg = p.get("package")
if pkg:
by_package[pkg] = p

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we aren't often moving packages from one product to another, but I'm not sure it's impossible. For compat check purposes, we probably want to preserve every product that mentions the package we're interested in. You can use collections.defaultdict for this, something like:

by_package = collections.defaultdict(list)
for p in batch:
        pkg = p.get("package")
        if pkg:
            by_package[pkg].append(p)

after which by_package.get(package_name) would give you a list of all the products claiming that package, or None.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and looks like we do need this already, e.g.:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -r '.data[].package | select(.)' | sort | uniq -c | sort -n | tail
      1 trustee-operator
      1 vertical-pod-autoscaler
      1 volsync-product
      1 watcher-operator
      1 web-terminal
      1 windows-machine-config-operator
      2 amq7-interconnect-operator
      2 amq-broker-rhel8
      2 amq-streams
      2 skupper-operator
$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -c '.data[] | select(.package == "skupper-operator") | {name, former_names}'
{"name":"Red Hat Service Interconnect","former_names":[]}
{"name":"Red Hat Service Interconnect Operator","former_names":[]}

Happily for at least Red Hat Service Interconnect*, there doesn't seem to be any difference in phase information:

$ diff -u1 <(curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=Red+Hat+Service+Interconnect' | jq -c '.data[] | .versions[] | .name as $v | .phases[] | {version: $v, phase: .}') <(curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=Red+Hat+Service+Interconnect+Operator' | jq -c '.data[] | .versions[] | .name as $v | .phases[] | {version: $v, phase: .}')

I haven't checked the others.

product = by_package.get(pkg)

if not product:
extra = api_search(pkg)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like it's trying to backstop for the earlier OpenShift query missing the product(s) that ships this package. But it's not a complete guard, e.g. it fails for the skupper-operator package:

$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=skupper-operator'
{"data":[]}
$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products?name=OpenShift' | jq -c '[.data[] | select(.package == "skupper-operator")]'
[]
$ curl -s 'https://access.redhat.com/product-life-cycles/api/v2/products' | jq -c '[.data[] | select(.package == "skupper-operator").name]'
["Red Hat Service Interconnect","Red Hat Service Interconnect Operator"]

---
name: product-lifecycle
description: Query Red Hat Product Life Cycle data for support phases, end-of-life dates, and OpenShift version compatibility. Use when evaluating whether installed operators or layered products are supported on a given OCP version, approaching end of life, or need upgrading before a cluster upgrade. Also use when the user asks about product support status, EOL dates, or lifecycle phases for any Red Hat product.
allowed-tools: Bash(python3:*)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our previous discussion ended up deciding allowed-tools was too under-specified to be worth including. Doesn't look like the spec has evolved since then.

# Red Hat Product Life Cycle

Query the Red Hat Product Life Cycle API to check support status, EOL dates, and OpenShift compatibility for Red Hat products and layered operators.
Query the Red Hat Product Life Cycle API (v2) to check support status, EOL

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this skill doesn't have to care that the tool queries the v2 API, so we can probably drop the change to this line. One less place that will need bumping if we eventually pivot the tool to a v3 API or whatever.

#### `products` — Query products by name

### Check support status for a specific product version
Maps directly to `GET /v2/products?name=<name>`.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal tool detail, right? I expect dropping this line would save the AI agent a few tokens parsing the skill, and not impact its ability to use the tool at all. And I expect dropping this line will save us the effort of having to remember to bump this line if the tool behavior changes in the future.

}
]
}
python3 product-lifecycle/scripts/plc_lookup.py -h

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we drop python3 here, set the executable bit on plc_lookup.py, and rely on the #!/usr/bin/env python3 shebang to get the script invoked by its intended interpreter? Or is there something about volume mounting the skills image into the sandbox container that messes with executable bits?

- Current and target version metadata
- Channel and update path information
- **Cluster readiness JSON** — cluster health checks with context relevant to preparing for the update
- **Cluster readiness JSON** — pre-collected by CVO with results from 9 parallel checks

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is restoring the commitment to explicit CVO behavior (calling for 9 exact checks) that we'd dropped after this earlier thread. Is my argument there making less sense now than it did then? 😅

"total_checks": 9,
"checks_ok": 9,
"checks_errored": 0,
"elapsed_seconds": 0.65

@wking wking Jun 24, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is restoring the room for inconsistent claims that we'd dropped after this earlier thread. Is my argument there making less sense now than it did then? 😅

check-specific data with a `summary` section for quick parsing.

## Evaluation
### What the checks cover

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is restoring the section committing to CVO behavior that seems hard to maintain, and which we'd dropped after this earlier thread. Is my argument there making less sense now than it did then? 😅

Any check with `_status` `error` represents a gap in visibility.
Note incomplete areas — they reduce confidence.
- **`prometheus`** — if `etcd_health` shows degraded conditions, query
`etcd_disk_backend_commit_duration_seconds` for trends

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'd dropped this section after this earlier thread, and also have the Investigate with other skills section a bit further down that you're currently dropping. Why pivot back to the old approach?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants