Fix parallelSync getting stuck when all workers fail #399

Merged
n8mgr merged 5 commits into master from chris/fix-stuck-sync
Feb 18, 2026

Conversation

@ChrisSchinnerl ChrisSchinnerl commented Feb 18, 2026

If all workers in parallelSync fail for any reason, parallelSync never exits.

Result of the failing reproduction test: https://github.com/SiaFoundation/coreutils/actions/runs/22136567973/job/63989707212?pr=399

@ChrisSchinnerl ChrisSchinnerl self-assigned this Feb 18, 2026
@github-project-automation github-project-automation bot moved this to In Progress in Sia Feb 18, 2026
@ChrisSchinnerl ChrisSchinnerl force-pushed the chris/fix-stuck-sync branch 2 times, most recently from 71f30c9 to 8567592 Compare February 18, 2026 10:45
@ChrisSchinnerl ChrisSchinnerl marked this pull request as ready for review February 18, 2026 10:54
Copilot AI review requested due to automatic review settings February 18, 2026 10:54

Copilot AI left a comment

Pull request overview

Fixes a deadlock scenario in the syncer’s parallelSync orchestration where syncing could stall indefinitely if every worker peer fails during block fetching, and adds a regression test plus a changeset entry.

Changes:

  • Track active parallelSync workers and abort with an error if no workers remain while requests are still incomplete.
  • Add TestParallelSyncStall to reproduce the “all workers fail” scenario using a peer that serves headers but always fails block RPCs.
  • Add a knope changeset documenting the patch.
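The first change above can be sketched roughly as follows. This is a minimal illustration of the general pattern only, not the code in syncer/parallel_sync.go: `runTasks`, `fetchFunc`, `result`, `errNoWorkers`, `alwaysFail`, and `alwaysOK` are hypothetical names invented for the example, and a "task" stands in for a block request. The coordinator counts active workers and returns an error once none remain while tasks are still outstanding, instead of waiting on a result that can never arrive.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoWorkers is a hypothetical sentinel error used only in this sketch.
var errNoWorkers = errors.New("all workers failed before sync completed")

// fetchFunc stands in for a peer's block RPC (an assumption for illustration).
type fetchFunc func(task int) error

// result is what a worker reports back to the coordinator.
type result struct {
	task int
	err  error
}

// Helper peers for the example: one always fails, one always succeeds.
func alwaysFail(int) error { return errors.New("block RPC failed") }
func alwaysOK(int) error   { return nil }

// runTasks starts one worker per peer. A worker retires after its first
// failed fetch, and the coordinator requeues that task for the others.
func runTasks(numTasks int, peers []fetchFunc) error {
	// Buffered to numTasks so that requeueing a task never blocks.
	pending := make(chan int, numTasks)
	for i := 0; i < numTasks; i++ {
		pending <- i
	}
	results := make(chan result)
	done := make(chan struct{})
	defer close(done) // releases any workers still waiting when we return

	for _, fetch := range peers {
		go func(fetch fetchFunc) {
			for {
				select {
				case <-done:
					return
				case task := <-pending:
					err := fetch(task)
					select {
					case results <- result{task: task, err: err}:
					case <-done:
						return
					}
					if err != nil {
						return // a failed worker retires
					}
				}
			}
		}(fetch)
	}

	active, remaining := len(peers), numTasks
	for remaining > 0 {
		// Without this check, the receive below would block forever
		// once every worker has retired.
		if active == 0 {
			return errNoWorkers
		}
		res := <-results
		if res.err != nil {
			active--
			pending <- res.task // hand the task to another worker
			continue
		}
		remaining--
	}
	return nil
}

func main() {
	fmt.Println(runTasks(4, []fetchFunc{alwaysFail, alwaysFail}))
	fmt.Println(runTasks(4, []fetchFunc{alwaysFail, alwaysOK}))
}
```

In this sketch the coordinator alone owns the `active` and `remaining` counters, so no mutex is needed; whether the actual fix tracks workers the same way is not stated in this conversation.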

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| syncer/parallel_sync.go | Adds active-worker tracking and an early exit to prevent indefinite blocking when no peers can make progress. |
| syncer/syncer_test.go | Adds a regression test that simulates a peer stalling during block fetch and verifies the sync loop continues. |
| .changeset/fix_parallelsync_stalling_when_all_workers_fail.md | Documents the bugfix as a patch changeset. |

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 020acaec35

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.


@n8mgr n8mgr force-pushed the chris/fix-stuck-sync branch from 85edbf4 to 13dabd4 Compare February 18, 2026 21:07
@n8mgr n8mgr merged commit 16ee015 into master Feb 18, 2026
13 checks passed
@n8mgr n8mgr deleted the chris/fix-stuck-sync branch February 18, 2026 21:09
@github-project-automation github-project-automation bot moved this from In Progress to Done in Sia Feb 18, 2026