fix(importer): reconstruct multi-volume RAR sets with reposted/rollover volumes #683
Merged
Old-style multi-volume RAR sets roll the extension letter after .r99 (.r99 -> .s00 -> ... -> .z99). AltMount only recognized .rar, .r00-.r99 and .partNN.rar, so SetKey() returned no group key for the rollover volumes and each became its own singleton group. rardecode was then handed a filesystem missing the continuation volumes, could not follow .r99 -> .s00, and truncated the extracted media file's segment map to a single volume, causing streaming to fail with "No segments to download" (available_bytes ~= one volume vs the full expected file size).
Teach every RAR-volume recognizer about the r->z + two-digit rollover, mirroring rardecode's nextOldVolName:
- archive/rar/utils.go: add RollVolPattern; SetKey groups .sNN-.zNN with the set; rarVolumeNumber returns a contiguous ordinal across the letter boundary (.r99=100, .s00=101) so gap detection does not misfire.
- archive/rar/processor.go & archive/sevenzip/processor.go: isRarArchiveFile / parseRarFilename handle rollover volumes.
- parser/fileinfo/detector.go, importer/utils/file_extensions.go, api/file_handlers.go: extend the duplicated rarPattern regexes.
Regression tests cover IsRarFile, SetKey, rarVolumeNumber, groupHasVolumeGap rollover cases, and a 114-volume set collapsing to a single group.
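The contiguous ordinal described above can be sketched as follows. This is a minimal illustration, not AltMount's rarVolumeNumber: the name volumeOrdinal and the pattern oldStyleVol are hypothetical, and mapping .rar to 0 is an assumption inferred from .r99=100.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// oldStyleVol matches ".rNN" through ".zNN" at the end of a file name.
// Hypothetical pattern for illustration; not AltMount's RollVolPattern.
var oldStyleVol = regexp.MustCompile(`\.([r-z])(\d{2})$`)

// volumeOrdinal returns a contiguous ordinal for an old-style volume name,
// so that .r99 (100) is immediately followed by .s00 (101).
func volumeOrdinal(name string) (int, bool) {
	lower := strings.ToLower(name)
	if strings.HasSuffix(lower, ".rar") {
		return 0, true // assumption: the first volume sorts before .r00
	}
	m := oldStyleVol.FindStringSubmatch(lower)
	if m == nil {
		return 0, false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return 0, false
	}
	letter := int(m[1][0] - 'r') // r=0, s=1, ..., z=8
	return letter*100 + n + 1, true
}

func main() {
	for _, name := range []string{"movie.rar", "movie.r00", "movie.r99", "movie.s00", "movie.s12"} {
		ord, ok := volumeOrdinal(name)
		fmt.Println(name, ord, ok)
	}
}
```

Because every letter contributes a block of 100 ordinals, a gap check that compares neighbouring ordinals sees .r99 and .s00 as adjacent rather than as a missing range.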
Some Usenet posts repost a multi-volume RAR set's first volume under a different base name (e.g. movie.repost.part01.rar) while the continuation volumes keep the original base (movie.r00..r99 / .s00..s12, no plain movie.rar). GroupArchivesByBaseName split these by base name, isolating the only volume with a real archive header from its continuations, so rardecode mapped just that one volume (e.g. a single ~100k volume), truncating the extracted media file and breaking streaming with "No segments to download".
When exactly one group is a lone first volume and every other group is pure continuations sharing one roll/numeric scheme AND a base that the first volume's base merely extends by a ".<suffix>" (the repost signal), fold them into one set: rename the first volume to <base>.rar and order the continuations by volume number so rardecode follows the whole archive natively. Guards keep genuinely separate archives (unrelated bases, multiple first volumes, multi-volume starters) untouched.
Depends on the .sNN rollover recognition in this branch, which is what groups .s00..s12 with .r00..r99 and orders them contiguously.
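The repost signal (the first volume's base extends the continuation base by a ".<suffix>") can be sketched as a simple string check. The function name extendsBySuffix is hypothetical, and treating the suffix as a single dot-free component is an assumption; the actual guard in reconcileRepostedFirstVolume may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// extendsBySuffix reports whether firstBase is contBase plus exactly one
// ".<suffix>" component, e.g. "movie.repost" extends "movie".
// Assumption: the suffix is non-empty and contains no further dots.
func extendsBySuffix(firstBase, contBase string) bool {
	prefix := contBase + "."
	if contBase == "" || !strings.HasPrefix(firstBase, prefix) {
		return false
	}
	suffix := strings.TrimPrefix(firstBase, prefix)
	return suffix != "" && !strings.Contains(suffix, ".")
}

func main() {
	fmt.Println(extendsBySuffix("movie.repost", "movie")) // reposted first volume
	fmt.Println(extendsBySuffix("other.repost", "movie")) // unrelated base
	fmt.Println(extendsBySuffix("movie", "movie"))        // identical base, no suffix
}
```

Requiring the continuation base as a strict prefix is what keeps unrelated archives apart: a base that shares no prefix with the continuations never qualifies for folding.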
Problem
Streaming an extracted .mkv failed with "No segments to download".
The release is one physical multi-volume RAR set with two quirks that broke import:
- The first volume was reposted under a different base name, …x264-monkee.repost.part01.rar, while every continuation keeps the original base: …x264-monkee.r00 … .r99, then the old-style rollover …x264-monkee.s00 … .s12. There is no plain …x264-monkee.rar.
- The continuation volumes roll over past .r99 into .s00 … .z99, which AltMount didn't recognize at all.
GroupArchivesByBaseName split this single set by base name, isolating part01.rar (base …monkee.repost) from its continuations (base …monkee). Only the part01.rar group had a usable archive header, so rardecode mapped just that one volume: exactly 102400000 bytes (one -v100000k volume), truncating the ~11.5 GB .mkv.
Confirmed with a deterministic reproduction at the grouping seam (15 groups pre-fix; the part01.rar group analyzed alone).
Fix
Two composing changes:
- .sNN … .zNN rollover recognition (utils.go RollVolPattern / rarVolumeNumber, detector.go, file_extensions.go, file_handlers.go, sevenzip): the continuation letter rolls r→s→…→z after .r99. rarVolumeNumber now returns a contiguous ordinal across the boundary (.r99=100, .s00=101) so the continuations group and order correctly.
- Reposted first-volume reconciliation (utils.go reconcileRepostedFirstVolume, wired into GroupArchivesByBaseName): when exactly one group is a lone first volume and the others are pure continuations sharing one scheme and a base the first volume's base merely extends by a .<suffix> (the repost signal), fold them into one set, rename the first volume to <base>.rar, and order continuations by volume number. rardecode then follows the whole archive natively via nextOldVolName.
Conservative guards keep genuinely separate archives untouched (unrelated bases, multiple first volumes, multi-volume starters). This is verified by the existing TestGroupArchivesByBaseName (a continuation set + an unrelated single-part archive stays 2 groups).
Tests
- TestGroupArchivesReconcilesRepostedFirstVolume: the real Zero.Effect shape (114 volumes) collapses to one group with the first volume renamed to <base>.rar.
- TestReconcileLeavesSeparateArchivesAlone + existing TestGroupArchivesByBaseName: unrelated archives stay separate.
- IsRarFile / SetKey / rarVolumeNumber / groupHasVolumeGap rollover cases.
- go build ./..., go test -race ./internal/importer/..., and ./internal/api/ all pass.
Validation needed
This fixes import-time grouping/ordering. The byte-level RAR reconstruction across the renamed sequence relies on rardecode following <base>.rar → .r00 → … (which it does for normal old-style sets); it can't be exercised offline without the real volumes. Please re-import this NZB (existing metadata is partial and won't self-heal) and confirm streaming works end-to-end.
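The naming sequence the renamed set has to satisfy (<base>.rar → .r00 → … → .r99 → .s00) can be sketched as below. This is an illustrative reimplementation written for this description, not rardecode's nextOldVolName; the function name nextOldStyleName is hypothetical, and stopping at .z99 is an assumption based on the rollover range stated above.

```go
package main

import (
	"fmt"
	"strings"
)

func isDigit(b byte) bool { return b >= '0' && b <= '9' }

// nextOldStyleName returns the file name expected to follow name in an
// old-style multi-volume set. It only handles three-character extensions.
func nextOldStyleName(name string) (string, bool) {
	i := strings.LastIndex(name, ".")
	if i < 0 || len(name)-i-1 != 3 {
		return "", false
	}
	ext := []byte(name[i+1:])
	if !isDigit(ext[1]) || !isDigit(ext[2]) {
		// First volume such as ".rar": keep the letter, start at "00".
		return name[:i+2] + "00", true
	}
	switch {
	case ext[2] != '9':
		ext[2]++
	case ext[1] != '9':
		ext[1]++
		ext[2] = '0'
	case ext[0] == 'z' || ext[0] == 'Z':
		return "", false // assumption: nothing follows .z99
	default:
		ext[0]++ // letter rollover, e.g. r -> s
		ext[1], ext[2] = '0', '0'
	}
	return name[:i+1] + string(ext), true
}

func main() {
	name := "movie.rar"
	for k := 0; k < 101; k++ {
		next, ok := nextOldStyleName(name)
		if !ok {
			break
		}
		name = next
	}
	fmt.Println(name) // the 101st continuation step from movie.rar
}
```

Walking the sequence this way shows why the reposted first volume must be renamed: a reader that derives each next name from the previous one cannot reach <base>.r00 when it starts from a differently named first volume.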