
fix(macrobenchmarks): retry GKE cluster deletion on failure #943

Merged
zhixiangli merged 1 commit into fsspec:main from zhixiangli:fix-cleanup-cluster
Jun 30, 2026

Conversation

@zhixiangli

Collaborator

Transient failures (e.g. FAILED_PRECONDITION) can occur during GKE cluster deletion. Retry the deletion up to 20 times with a 30-second delay to ensure resources are cleaned up.
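
The retry behavior described above can be sketched as a small shell helper. This is a minimal illustration under stated assumptions, not the code merged in this PR: `retry_cmd` is a hypothetical function name, and the `gcloud` invocation in the trailing comment uses placeholder variables (`CLUSTER_NAME`, `ZONE`) that do not come from the PR.

```shell
# Sketch only: retry_cmd is a hypothetical helper, not the merged script.
# Usage: retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS attempts have been made.
retry_cmd() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "Attempt ${attempt}/${max_attempts} failed." >&2
    # Sleep only between attempts, not after the last one.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Assumed call site (placeholder variables, shown as a comment so the
# sketch runs without gcloud installed):
#   retry_cmd 20 30 gcloud container clusters delete "${CLUSTER_NAME}" \
#     --zone "${ZONE}" --quiet
```

With 20 attempts and a 30-second delay, a command that never succeeds keeps this helper busy for roughly 10 minutes, since the delays alone add up to 19 × 30 s before any command run time is counted.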


TAG=agy
CONV=efac437f-9cb0-4057-a8f9-02a4b9bf8b97

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor


Code Review

This pull request updates the GKE cluster cleanup script to retry deleting the cluster up to 20 times with a 30-second delay between attempts to handle transient failures. The reviewer pointed out that if the cluster does not exist, this retry loop will cause an unnecessary 10-minute delay, and suggested capturing the command output to break early if a "not found" error is encountered.
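
The early-exit suggestion in the review could look roughly like the following. It is a hedged sketch, not the reviewer's or the author's actual code: `retry_unless_not_found` is a hypothetical name, and the substrings matched in the `case` statement are assumptions about how a "not found" error might be worded, so they should be checked against the real command output before use.

```shell
# Sketch only: retry_unless_not_found is a hypothetical helper.
# Usage: retry_unless_not_found MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Retries COMMAND, but stops early if its output says the target is missing.
retry_unless_not_found() {
  local max_attempts="$1" delay="$2" attempt=1 output
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    # Capture stdout and stderr so the error text can be inspected.
    if output="$("$@" 2>&1)"; then
      printf '%s\n' "$output"
      return 0
    fi
    printf '%s\n' "$output" >&2
    # Assumed error wording; adjust the patterns to the real message.
    case "$output" in
      *"not found"* | *"Not found"* | *NOT_FOUND*)
        echo "Target not found; nothing to delete." >&2
        return 0
        ;;
    esac
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

Treating "not found" as success keeps the cleanup step idempotent: a second run against an already deleted cluster returns after a single attempt instead of sleeping through every retry.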

Comment thread cloudbuild/macrobenchmarks/scripts/cleanup_cluster.sh
@zhixiangli zhixiangli changed the title fix(cleanup): retry GKE cluster deletion on failure fix(macrobenchmarks): retry GKE cluster deletion on failure Jun 30, 2026
@codecov

codecov Bot commented Jun 30, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.68%. Comparing base (8f47e73) to head (5e69df2).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #943   +/-   ##
=======================================
  Coverage   89.68%   89.68%           
=======================================
  Files          16       16           
  Lines        3579     3579           
=======================================
  Hits         3210     3210           
  Misses        369      369           


@zhixiangli zhixiangli merged commit 1b08827 into fsspec:main Jun 30, 2026
10 checks passed
@zhixiangli zhixiangli deleted the fix-cleanup-cluster branch June 30, 2026 06:04

2 participants