@richarddushime

Description

Fixes #542
Adds an automated workflow that keeps the repository clean by removing old and stale branches.

Staging Aggregate Branches: Keeps only the latest 10 staging-aggregate-* branches.
Merged Branches: Deletes any branch that has been merged into master (excluding protected branches such as main and staging).
Stale Branches: Deletes any branch with no activity for 6 months.
Safety Check: Branches with OPEN PRs are automatically skipped, even if stale.

Type of Change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • Tested Locally
  • Manual review / previewed on staging.forrt.org content/webpage changes
  • Not Tested yet

Checklist for Developers:

  • I have attempted to stay aligned to related code in this repository rather than reinventing the wheel.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.

Additional Notes

github-actions bot added the cicd label (Relevant to GitHub workflows) on Dec 22, 2025
@github-actions

👍 All image files/references (if any) are in webp format, in line with our policy.

Copilot AI left a comment

Pull request overview

This PR adds an automated workflow to clean up stale and old branches in the repository to maintain repository hygiene. The workflow runs weekly on Sundays and can also be triggered manually with a dry-run option.

Changes:

  • Adds a new GitHub Actions workflow file that performs three types of branch cleanup: old staging-aggregate branches (keeping only 10 most recent), merged branches (already merged into master), and stale branches (no activity for 6 months with no open PRs)

Comment on lines +35 to +36
# Get branches matching pattern, sort by committer date (descending), skip first 10
BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
Copilot AI Jan 21, 2026

The cleanup-staging job keeps 10 staging-aggregate branches, but the staging-aggregate.yaml workflow already cleans up these branches and keeps only the 5 most recent (see lines 136-155 in staging-aggregate.yaml). This creates an inconsistency where:

  1. The staging-aggregate workflow creates a new branch and keeps 5 most recent
  2. This cleanup workflow runs weekly and keeps 10 most recent

This could lead to confusion about how many branches should be retained. Consider aligning the number with the existing staging-aggregate workflow (either both keep 5, or both keep 10) to maintain consistency across the repository.

Suggested change
- # Get branches matching pattern, sort by committer date (descending), skip first 10
- BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
+ # Get branches matching pattern, sort by committer date (descending), skip first 5
+ BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +6 | sed 's/origin\///')

Copilot uses AI. Check for mistakes.
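
One way to avoid the 5-versus-10 mismatch described above is to drive the cutoff from a single setting instead of a hard-coded `tail` offset. The following is a minimal sketch, not the PR's code: `KEEP_COUNT` and `branches_beyond_keep` are hypothetical names, and the branch names in the example are made up rather than taken from `git` output.

```shell
# KEEP_COUNT is a hypothetical shared setting. The PR hard-codes 10, while
# staging-aggregate.yaml is reported to keep 5; 5 is used here as the default.
KEEP_COUNT="${KEEP_COUNT:-5}"

# Reads branch names from stdin (newest first) and prints every name after
# the first KEEP_COUNT entries, i.e. the deletion candidates.
branches_beyond_keep() {
  tail -n "+$((KEEP_COUNT + 1))"
}

# Example with made-up names: seven branches, newest first.
# With KEEP_COUNT=5 this prints staging-aggregate-2 and staging-aggregate-1.
printf 'staging-aggregate-%s\n' 7 6 5 4 3 2 1 | branches_beyond_keep
```

In the workflow, the sorted branch list would be piped into the function in place of the literal `tail -n +11`, so both workflows could read the same value.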
Comment on lines +76 to +77
# List remote branches merged into origin/master, exclude protected ones
MERGED_BRANCHES=$(git branch -r --merged origin/master | grep -vE "HEAD|$PROTECTED_BRANCHES" | sed 's/origin\///')
Copilot AI Jan 21, 2026

The grep pattern uses '-vE "HEAD|$PROTECTED_BRANCHES"' which will also exclude branches containing these strings anywhere in their name (e.g., a branch named 'feature-mastery' would be excluded because it contains 'master'). This could lead to branches being incorrectly protected from deletion. Consider using word boundary anchors or a more precise pattern to match only complete branch names, such as using awk to filter exact matches.

Suggested change
- # List remote branches merged into origin/master, exclude protected ones
- MERGED_BRANCHES=$(git branch -r --merged origin/master | grep -vE "HEAD|$PROTECTED_BRANCHES" | sed 's/origin\///')
+ # List remote branches merged into origin/master, exclude HEAD and fully matching protected branches
+ MERGED_BRANCHES=$(git branch -r --merged origin/master | grep -vE "HEAD|origin/($PROTECTED_BRANCHES)$" | sed 's/origin\///')

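The substring problem described above can also be avoided by normalizing each line first and then matching whole lines only. This is a rough sketch under stated assumptions: the value of `PROTECTED_BRANCHES` shown here is a placeholder (the workflow's real list is not quoted in this thread), and `filter_protected` is a hypothetical helper name.

```shell
# Placeholder list; the workflow's actual PROTECTED_BRANCHES value is not shown here.
PROTECTED_BRANCHES='main|master|staging'

# Reads `git branch -r` style lines from stdin and prints the branch names
# that are neither the HEAD pointer nor an exact match for a protected name.
filter_protected() {
  sed -e 's/^[[:space:]]*//' -e 's#^origin/##' \
    | grep -vE '^HEAD( -> |$)' \
    | grep -vxE "($PROTECTED_BRANCHES)" || true   # -x: match the whole line only
}

# Example with made-up input; prints feature-mastery and fix-typo.
printf '%s\n' \
  '  origin/HEAD -> origin/master' \
  '  origin/master' \
  '  origin/feature-mastery' \
  '  origin/staging' \
  '  origin/fix-typo' | filter_protected
```

Because `grep -x` requires the pattern to match the entire line, `feature-mastery` is kept even though it contains `master`. The trailing `|| true` keeps an empty result from being treated as a failure.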
echo "Listing staging-aggregate branches..."
# Get branches matching pattern, sort by committer date (descending), skip first 10
BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
Copilot AI Jan 21, 2026

The branch listing pattern 'origin/staging-aggregate-*' uses a glob pattern with git branch -r --list, but this might not work as expected in all scenarios. The staging-aggregate.yaml workflow uses git ls-remote with 'refs/heads/staging-aggregate-*' for more reliable remote branch listing. Consider using the same approach for consistency and reliability.

Suggested change
- BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
+ BRANCHES_TO_DELETE=$(git for-each-ref --format='%(refname:short)' --sort=-committerdate 'refs/remotes/origin/staging-aggregate-*' | tail -n +11 | sed 's#origin/##')

echo "Checking for stale branches..."
# Iterate over all remote branches
git branch -r | grep -vE "HEAD|$PROTECTED_BRANCHES" | while read -r branch; do
Copilot AI Jan 21, 2026

The cleanup-stale job does not exclude staging-aggregate-* branches from stale branch detection. Since the cleanup-staging job already handles these branches, stale staging-aggregate branches might be processed by both jobs, causing redundant operations. Consider adding 'staging-aggregate-*' to the grep exclusion pattern or documenting why staging-aggregate branches should also be subject to stale cleanup.

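If the stale-branch job is meant to leave staging-aggregate branches to the dedicated cleanup, the exclusion could be expressed as a small predicate rather than a longer `grep` pattern. This is only an illustrative sketch: `should_skip_branch` is a hypothetical helper, and the protected names listed in it are placeholders, not the workflow's actual list.

```shell
# Returns 0 (true) when the stale-branch job should skip the given branch.
should_skip_branch() {
  case "$1" in
    main|master|staging) return 0 ;;   # protected names (placeholder list)
    staging-aggregate-*) return 0 ;;   # left to the staging-aggregate cleanup
    *) return 1 ;;
  esac
}

# Example with made-up branch names.
for b in master staging-aggregate-2026-01-21 feature-mastery; do
  if should_skip_branch "$b"; then
    echo "skip: $b"
  else
    echo "check: $b"
  fi
done
```

The `case` patterns match the whole branch name, so `feature-mastery` is still checked for staleness while anything starting with `staging-aggregate-` is skipped.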
Comment on lines +18 to +55
cleanup-staging:
  name: Cleanup Staging Aggregate Branches
  runs-on: ubuntu-latest
  steps:
    - name: Checkout repository
      uses: actions/checkout@v4
      with:
        fetch-depth: 0

    - name: Delete old staging branches
      env:
        DRY_RUN: ${{ inputs.dry_run }}
      run: |
        echo "Fetching all branches..."
        git fetch --all --prune
        echo "Listing staging-aggregate branches..."
        # Get branches matching pattern, sort by committer date (descending), skip first 10
        BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
        if [ -z "$BRANCHES_TO_DELETE" ]; then
          echo "No old staging branches to delete."
          exit 0
        fi
        echo "Found branches to delete:"
        echo "$BRANCHES_TO_DELETE"
        for branch in $BRANCHES_TO_DELETE; do
          branch=$(echo "$branch" | xargs) # trim whitespace
          if [ "$DRY_RUN" = "true" ]; then
            echo "[DRY RUN] Would delete: $branch"
          else
            echo "Deleting: $branch"
            git push origin --delete "$branch" || echo "Failed to delete $branch"
          fi
        done
Copilot AI Jan 21, 2026

The staging-aggregate-* branches are already cleaned up by the staging-aggregate.yaml workflow itself (lines 136-155 of that file), which keeps only the 5 most recent branches. Adding a separate cleanup job here may cause redundant deletions and create a race condition where both workflows try to clean up the same branches. Consider whether this cleanup-staging job is necessary or if it should be removed to avoid duplication.

Suggested change
(Removes the entire cleanup-staging job quoted above, lines +18 to +55.)

Comment on lines +21 to +54
steps:
  - name: Checkout repository
    uses: actions/checkout@v4
    with:
      fetch-depth: 0

  - name: Delete old staging branches
    env:
      DRY_RUN: ${{ inputs.dry_run }}
    run: |
      echo "Fetching all branches..."
      git fetch --all --prune
      echo "Listing staging-aggregate branches..."
      # Get branches matching pattern, sort by committer date (descending), skip first 10
      BRANCHES_TO_DELETE=$(git branch -r --list 'origin/staging-aggregate-*' --sort=-committerdate | tail -n +11 | sed 's/origin\///')
      if [ -z "$BRANCHES_TO_DELETE" ]; then
        echo "No old staging branches to delete."
        exit 0
      fi
      echo "Found branches to delete:"
      echo "$BRANCHES_TO_DELETE"
      for branch in $BRANCHES_TO_DELETE; do
        branch=$(echo "$branch" | xargs) # trim whitespace
        if [ "$DRY_RUN" = "true" ]; then
          echo "[DRY RUN] Would delete: $branch"
        else
          echo "Deleting: $branch"
          git push origin --delete "$branch" || echo "Failed to delete $branch"
        fi
      done
Copilot AI Jan 21, 2026

The workflow performs git operations (git push) but does not configure Git user identity, which is a common pattern in other workflows in this repository (see staging-aggregate.yaml and data-processing.yml). While git push --delete may work without identity configuration, it's best practice to configure Git identity before performing any git operations. Consider adding a 'Configure Git' step before the deletion operations in each job.

Comment on lines 124 to 149
git branch -r | grep -vE "HEAD|$PROTECTED_BRANCHES" | while read -r branch; do
  branch=$(echo "$branch" | xargs) # trim whitespace
  clean_branch_name=${branch#origin/}
  # Get last commit date for the branch
  last_commit_date=$(git log -1 --format=%ct "$branch")
  # check if branch has open PR
  open_pr_count=$(gh pr list -H "$clean_branch_name" --state open --json number | jq '. | length')
  if [ "$open_pr_count" -gt 0 ]; then
    echo "Skipping $clean_branch_name (Has open PR)"
    continue
  fi
  if [ "$last_commit_date" -lt "$THRESHOLD_DATE" ]; then
    echo "Branch '$clean_branch_name' is stale (Last commit: $(date -d "@$last_commit_date"))"
    if [ "$DRY_RUN" = "true" ]; then
      echo "[DRY RUN] Would delete: $clean_branch_name"
    else
      echo "Deleting: $clean_branch_name"
      git push origin --delete "$clean_branch_name" || echo "Failed to delete $clean_branch_name"
    fi
  fi
done
Copilot AI Jan 21, 2026

The loop that processes stale branches runs inside a pipeline (git branch -r | ... | while read), so every command in the loop body shares the pipeline's stdin. If git push reads from stdin, it can consume branch names intended for read, and the loop may silently skip branches. Consider collecting the branches to delete into a variable first, then deleting them in a separate loop, similar to how the cleanup-staging and cleanup-merged jobs handle this.

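The collect-then-delete approach suggested above might look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `select_stale` and `delete_branches` are hypothetical helpers, the input is made-up "<epoch-seconds> <branch>" lines rather than real `git` output, and the open-PR check from the original loop is omitted for brevity.

```shell
# Prints the names of branches whose last commit is older than the threshold.
# Input on stdin: one "<epoch-seconds> <branch>" pair per line.
select_stale() {
  local threshold="$1" ts name
  while read -r ts name; do
    if [ "$ts" -lt "$threshold" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Deletes each branch passed as an argument, or only reports it in dry-run mode.
delete_branches() {
  local branch
  for branch in "$@"; do
    if [ "${DRY_RUN:-true}" = "true" ]; then
      echo "[DRY RUN] Would delete: $branch"
    else
      # </dev/null keeps git from reading the caller's stdin.
      git push origin --delete "$branch" </dev/null || echo "Failed to delete $branch"
    fi
  done
}

# Example with made-up data: collect first, then act in a separate loop.
DRY_RUN=true
STALE=$(printf '%s\n' '900 old-feature' '1500 fresh-feature' '100 ancient-fix' | select_stale 1000)
# Word splitting is intended here: one branch name per word.
delete_branches $STALE
```

Since the deletion loop iterates over arguments instead of reading from a pipe, nothing inside it can consume the branch list. In the workflow, the input lines could plausibly come from `git for-each-ref` with a format such as `%(committerdate:unix) %(refname:lstrip=3)`, though that wiring is an assumption and is not part of this PR.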
LukasWallrich and others added 2 commits January 21, 2026 16:33
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@github-actions

✅ Spell Check Passed

No spelling issues found in this PR! 🎉

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Labels

cicd Relevant to GitHub workflows

Development

Successfully merging this pull request may close these issues.

Clean up branches

3 participants