Skip to content

feat(backend, config): S3 garbage collection without race condition#6678

Open
anna-parker wants to merge 62 commits into
mainfrom
gc-anya
Open

feat(backend, config): S3 garbage collection without race condition#6678
anna-parker wants to merge 62 commits into
mainfrom
gc-anya

Conversation

@anna-parker

@anna-parker anna-parker commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

builds on #6543 with modifications to prevent a race condition and additional code improvements, tests were modified to use existing helper functions used elsewhere in the codebase

resolves #3935
resolves #6543 (comment)

Config

Requires #6705 to be merged before GC can be enabled on instances with multiple backend replicas, the default config parameters will be applied without user changes:

s3:
  enabled: true
  gcInitialDelayMinutes: 15 # initial delay before first garbage collection task runs
  gcEnabled: false # have garbage collection log instead of actually delete files
  gcFrequencyMinutes: 1440 # frequency of garbage collection polling in minutes
  orphanRetentionPeriodMinutes: 7

The Exact Race Window

The current code has two separate steps with no lock between them:

  GC:         getOrphanedFileIds() ──────────────────── deleteFile(F) + deleteFileEntry(F)
                    ↑                                          ↑
  User:             │  validateFileIdsExist(F) → submit() ────┤
                    │  (passes, F still in DB)                │
                    └──────────────────────────────── F deleted, but submission committed

validateFileIdsExist checks the DB, then the submission writes to sequence_entries later. Between those two events, GC
can delete the file. The window is roughly: time to delete all the S3 objects + DB rows in the batch.

Additional improvements

  • make maxOrphanAge minutes and rename as orphanRetentionPeriod
  • make the polling frequency of the gc task configurable for integration testing

Alternatives considered

  • Have one transaction for finding the files to delete and deleting them, this transaction would need to apply table locks/isolation to prevent the file validation function (called when submitting) to read a file as existing while we are deleting it (getting into the same issue as above). We additionally tested that just reading all the jsonb in ppx takes 3min, this transaction time will only grow once we add actual files. Locking tables while doing this transaction is not feasible.

Unresolved Issues

As discussed with @corneliusroemer async there is still a very small window between when the files are validated and are uploaded to the db during a submission. When only one backend is running this can be ignored, however on PPX we have two backends if both start this job simultaneously we can have a submission occur during deletion and have one backend attempt to delete a file within this interval. Leading to the issue we are attempting to resolve still occuring.

We should probably prevent multiple garbage collection tasks from running simultaneously to prevent this from occuring. The alternative is to only delete files a certain time period after they have been marked as ready for deletion - this would prevent this race condition.

Screenshot

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by appropriate, automated tests.
  • Any manual testing that has been done is documented (i.e. what exactly was tested?)

🚀 Preview: Add preview label to enable

@maverbiest

maverbiest commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Tanks a lot for this! Definitely better to guard against this :)

Do I understand correctly that in the current setup, the scan of all se and sepd is done twice per GC task? I'm wondering if it's not possible to modify the query to return file_id, upload_requested_at, and marked_for_deletion_at at the same time, and then figure out in kotlin what files need to be deleted and which to be marked for deletion. That way you'd only have to do the heavy SQL work once. Or is there something I'm missing that would make that impractical?

@anna-parker

Copy link
Copy Markdown
Contributor Author

@maverbiest I think I covered this option in alternatives considered in the PR description but let me know if you think I missed sth or it is unclear :-)

@maverbiest

Copy link
Copy Markdown
Contributor

@anna-parker hmm if I understand the description of the alternative correctly that is about having one query that both finds and deletes. That's not what I was thinking but I probably didn't explain very well; what I meant is to have one query to find unreferenced files, but instead of it just returning file_id, also have it return upload_requested_at, and marked_for_deletion_at. Something like:

  val unreferenced = filesDatabaseService.getUnreferencedFiles()
  val toDelete = unreferenced.filter { it.markedForDeletionAt != null }.map { it.id }
  val toMark   = unreferenced.filter { it.markedForDeletionAt == null && it.uploadRequestedAt < threshold }.map { it.id }

Then run another query to delete what files in toDelete, and then another query to mark the files in toMark. The first expensive query is still just a SELECT so doesn't lock the DB, the other two queries go by indexed columns so would be fast & not lock the DB for long.

@anna-parker

Copy link
Copy Markdown
Contributor Author

ah yes that would work! its a bit more custom code I guess but doable!

@anna-parker anna-parker changed the title feat(backend, config)!: add code to prevent race condition from ever putting the db into a weird state feat(backend, config)!: S3 garbage collection without race condition Jun 18, 2026
@anna-parker anna-parker changed the title feat(backend, config)!: S3 garbage collection without race condition feat(backend, config): S3 garbage collection without race condition Jun 18, 2026
@maverbiest

Copy link
Copy Markdown
Contributor

I was wondering if the plan moving forward is to get this PR ready to merge, or to incorporate these improvements into #6543?

I'm good either way, but if we're focusing on this PR I'll probably close my original one to avoid confusion & double work :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could rename this file since we now also add a column in addition to an index

}
}

fun getMarkedForDeletionFileIds(fileIds: Set<FileId>): Set<FileId> = fileIds.chunkedForDatabase({ chunk ->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not blocking but could we add a short docstring here explaining that the function filters a set of input FileIds (similar to getUncheckedFileIds below)? Upon first reading the function name I expected it to get the FileIds of all files marked for deletion in the DB

val markedOrphans = orphans.filterValues { it != null }.keys
val newOrphans = orphans.filterValues { it == null }.keys

// Phase 2: delete files that were marked in a previous run and are still unreferenced

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Phase 2: delete files that were marked in a previous run and are still unreferenced
// Phase 2: delete files that were marked in a previous run

Maybe remove this part? There is no way for a file to be 'rescued' once it's been marked for deletion, right? Or is there?

Comment on lines +72 to +77
if (newOrphans.isNotEmpty()) {
filesDatabaseService.markFilesForDeletion(newOrphans)
log.info { "S3 garbage collection task marked ${newOrphans.size} orphan(s) for deletion" }
} else {
log.info { "S3 garbage collection task identified no new orphan files" }
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I'm wondering here is whether marking files for deletion should be gated by enabled as well?

If I understand the current approach correctly, even when we disable the GC files will get marked for deletion which means they will cause submissions to be rejected. Not blocking in this one way or the other but might be worth flagging in a comment at least

)
if (deleteFailures > 0) {
log.warn {
"S3 garbage collection task unsuccessfully attempted to delete $deleteFailures orphan file(s)"

@maverbiest maverbiest Jun 23, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could keep track of s3DeleteFailures and dbDeleteFailures separately?

gcInitialDelayMinutes: 15 # initial delay before first garbage collection task runs
gcEnabled: false # have garbage collection log instead of actually delete files
gcFrequencyMinutes: 1440 # frequency of garbage collection polling in minutes
orphanRetentionPeriodMinutes: 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
orphanRetentionPeriodMinutes: 7
orphanRetentionPeriodMinutes: 10080

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd prefer this to also carry the gc prefix in line with the other gc-related config . We could do gcOrphanRetentionPeriodMinutes (or maybe gcGracePeriodMinutes)?

private val dateProvider: DateProvider,
private val auditLogger: AuditLogger,
@Value("\${${BackendSpringProperty.S3_ORPHAN_RETENTION_PERIOD_MINUTES}}") private val orphanRetentionPeriod: Int,
@Value("\${${BackendSpringProperty.S3_GC_ENABLED}:false}") private val enabled: Boolean = false,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find it confusing that we set the default value twice in one line. It's not obvious to me what would happen if they ever get out of sync. Then there's also a third default value in application.json (which actually defaults to true for enabled instead of false).

Could we keep all defaults in application.json so they all live in one place? Also for the initial delay and the frequency?


private var groupId = 0

private fun createProcessedSubmissionReferencingFile(file: UUID) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a lot of overlap with the second test in GetOrphanedFileIdsTest.kt, could this be extracted and reused? Or would that be too much of a hassle since they're slightly different?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend related to the loculus backend component deployment Code changes targetting the deployment infrastructure

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Build a garbage collector for file sharing

4 participants