Skip to content

Fix Flaky e2e tests - Update the Cleanup order#31347

Open
YamunadeviShanmugam wants to merge 1 commit into
openshift:mainfrom
YamunadeviShanmugam:improve_test_steps_client_go
Open

Fix Flaky e2e tests - Update the Cleanup order#31347
YamunadeviShanmugam wants to merge 1 commit into
openshift:mainfrom
YamunadeviShanmugam:improve_test_steps_client_go

Conversation

@YamunadeviShanmugam

@YamunadeviShanmugam YamunadeviShanmugam commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Fix resourcesToDelete accumulation and premature OAuth token deletion in CLI

This PR fixes two related bugs in test/extended/util/client.go that affect test reliability when multiple tests share a CLI instance.

Bug 1 — resourcesToDelete never cleared after TeardownProject

TeardownProject iterates over c.resourcesToDelete to delete cluster resources at the end of each test, but the slice was never reset. Because a single *CLI instance is shared across all It blocks in a Describe, the slice accumulated every resource registered across the entire suite lifetime. Each subsequent test's TeardownProject would redundantly attempt to delete resources from all prior tests, generating 404 noise in logs and performing unnecessary API calls to the cluster.

Fix:

Reset c.resourcesToDelete to nil at the end of TeardownProject. The slice is always rebuilt from scratch via append when the next test registers resources.

Bug 2 — OAuthAccessToken deleted before framework DeferCleanup runs

GetClientConfigForUser created an OAuthAccessToken and registered it in resourcesToDelete. This meant it was deleted during TeardownProject, which is registered as a g.AfterEach. The Kubernetes framework's namespace cleanup, however, runs via ginkgo.DeferCleanup (registered inside f.BeforeEach), and the documented execution order is:

It block
→ all AfterEach nodes (including TeardownProject) ← token deleted here
→ all DeferCleanups (LIFO) ← framework cleanup here
→ f.AfterEach (namespace deletion)
This meant the bearer token embedded in the returned *rest.Config became invalid before any DeferCleanup callbacks had a chance to run. Any deferred cleanup that tried to make authenticated API calls as that user would receive 401 Unauthorized.

Fix:

Remove the token from resourcesToDelete and instead set ExpiresIn: 86400 (24 hours). The associated User object is still explicitly deleted in TeardownProject, which is sufficient to revoke effective access. The ExpiresIn TTL ensures the token is eventually cleaned up by the OpenShift OAuth token controller without needing explicit deletion.

Latent OIDC test bug fixed as a consequence

test/extended/authentication/oidc.go calls GetClientConfigForUser in a g.BeforeAll and stores the result in oauthUserConfig. This config is then used across two separate ordered It blocks:

"should not accept tokens provided by the OAuth server" (expects Unauthorized)
"should accept tokens provided by the OpenShift OAuth server" (expects 200 OK)
With the old code, TeardownProject deleted the OAuthAccessToken after the first It block. The first test passed for the wrong reason (deleted token → Unauthorized, which matched the expectation, but for the wrong cause). The second test silently used an invalid bearer token. As the current fix keeps the token alive across all ordered It blocks as intended, so both tests validate the correct behavior.

Summary by CodeRabbit

  • Bug Fixes
    • Improved test cleanup so resources aren’t accidentally carried over between runs.
    • Updated temporary login token handling to expire automatically, reducing leftover authentication data and repeat cleanup attempts.

  Improve the cleanup order for fixing flakiness
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 994ad85f-ab63-481a-a6bf-9ed39e78c7dc

📥 Commits

Reviewing files that changed from the base of the PR and between 6df9cfe and 74d2be9.

📒 Files selected for processing (1)
  • test/extended/util/client.go

Walkthrough

In test/extended/util/client.go, TeardownProject now resets c.resourcesToDelete to nil after deletion. GetClientConfigForUser now creates OAuthAccessToken with an ExpiresIn TTL and no longer registers the token in resourcesToDelete.

Changes

CLI Resource Lifecycle Fixes

Layer / File(s) Summary
TeardownProject reset and OAuthAccessToken TTL
test/extended/util/client.go
TeardownProject clears c.resourcesToDelete to nil after deletions. GetClientConfigForUser sets ExpiresIn on the created OAuthAccessToken and removes the explicit resourcesToDelete registration for that token.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title matches the main change: adjusting cleanup order/behavior to fix flaky e2e tests.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed The PR only changes cleanup logic in util/client.go; the affected OIDC tests use static string-literal titles, with no dynamic Ginkgo name construction.
Test Structure And Quality ✅ Passed The touched test helper and OIDC callers keep clear setup/teardown, single-purpose It blocks, and all cluster waits use explicit timeouts.
Microshift Test Compatibility ✅ Passed PR only changes cleanup helper logic in test/extended/util/client.go; no new Ginkgo tests or new unsupported MicroShift API usages were added.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The PR only edits test/extended/util/client.go helpers; no new Ginkgo test cases or SNO-sensitive assumptions were added.
Topology-Aware Scheduling Compatibility ✅ Passed PASS: The PR only changes test utility cleanup in test/extended/util/client.go; no manifests, controllers, or scheduling constraints were added or modified.
Ote Binary Stdout Contract ✅ Passed PASS: client.go only updates teardown/token cleanup; scans of process-level setup found no stdout writes, only stderr logging.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed Only a shared test helper changed; no new Ginkgo specs were added and the touched code introduces no IPv4-specific or external-connectivity assumptions.
No-Weak-Crypto ✅ Passed PASS: The PR only changes teardown and OAuth token TTL logic; no weak ciphers, custom crypto, or secret/token comparisons were introduced.
Container-Privileges ✅ Passed PR only changes Go test helper code; no container/K8s manifests were modified, and no privileged/host* or allowPrivilegeEscalation settings appear in the touched file.
No-Sensitive-Data-In-Logs ✅ Passed No new logging of secrets/tokens/PII was added; the PR removes token cleanup and existing command logs already redact bearer tokens.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@openshift-ci openshift-ci Bot requested review from deads2k and sjenning June 29, 2026 08:24
@gangwgr

gangwgr commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: gangwgr, YamunadeviShanmugam
Once this PR has been reviewed and has the lgtm label, please assign bertinatto for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 29, 2026
@openshift-ci openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 29, 2026
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@YamunadeviShanmugam

Copy link
Copy Markdown
Contributor Author

/retest

@openshift-ci

openshift-ci Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

@YamunadeviShanmugam: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-microshift 74d2be9 link true /test e2e-aws-ovn-microshift

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm Indicates that a PR is ready to be merged. ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants