fix(importer): reconstruct multi-volume RAR sets with reposted/rollover volumes#683

Merged: javi11 merged 2 commits into main from session/suspicious-noether-3ea5e3 on Jun 13, 2026

javi11 (Owner) commented Jun 13, 2026

Problem

Streaming an extracted .mkv failed with:

[createUsenetReader] No segments to download start=0 end=11575329534
  available_bytes=102400000 expected_file_size=11575329535

The release is one physical multi-volume RAR set with two quirks that broke import:

  1. Reposted first volume under a different base name. The only volume with a real RAR header is …x264-monkee.repost.part01.rar, while every continuation keeps the original base: …x264-monkee.r00 through .r99, then the old-style rollover …x264-monkee.s00 through .s12. There is no plain …x264-monkee.rar.
  2. Old-style letter rollover past .r99 into .s00 through .z99, which AltMount didn't recognize at all.

GroupArchivesByBaseName split this single set by base name, isolating part01.rar (base …monkee.repost) from its continuations (base …monkee). Only the part01.rar group had a usable archive header, so rardecode mapped just that one volume: exactly 102400000 bytes (one -v100000k volume), which truncated the ~11.5 GB .mkv.

Confirmed with a deterministic reproduction at the grouping seam (15 groups pre-fix; the part01.rar group analyzed alone).
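
As a quick sanity check on the sizes in the error log, here is a minimal sketch. It assumes `-v100000k` means 100000 units of 1024 bytes and ignores per-volume RAR header overhead; `kibToBytes` and `minVolumes` are hypothetical helpers, not AltMount code.

```go
package main

import "fmt"

// kibToBytes converts a size in units of 1024 bytes to bytes.
// Assumption: the "k" suffix of -v100000k denotes 1024-byte units.
func kibToBytes(kib int64) int64 {
	return kib * 1024
}

// minVolumes returns the smallest number of volumes of volumeBytes each
// that can hold fileBytes (ceiling division), ignoring header overhead.
func minVolumes(fileBytes, volumeBytes int64) int64 {
	return (fileBytes + volumeBytes - 1) / volumeBytes
}

func main() {
	vol := kibToBytes(100000)
	fmt.Println(vol)                          // 102400000, the available_bytes value
	fmt.Println(minVolumes(11575329535, vol)) // 114
}
```

The lower bound of 114 volumes is consistent with the set shape described above: part01.rar, then .r00 through .r99 (100 volumes), then .s00 through .s12 (13 volumes).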

Fix

Two composing changes:

  • .sNN through .zNN rollover recognition (utils.go RollVolPattern/rarVolumeNumber, detector.go, file_extensions.go, file_handlers.go, sevenzip): the continuation letter rolls r→s→…→z after .r99. rarVolumeNumber now returns a contiguous ordinal across the boundary (.r99=100, .s00=101) so the continuations group and order correctly.
  • Reposted-first-volume reconciliation (utils.go reconcileRepostedFirstVolume, wired into GroupArchivesByBaseName): when exactly one group is a lone first volume and the others are pure continuations sharing one scheme and a base the first volume's base merely extends by a .<suffix> (the repost signal), fold them into one set, rename the first volume to <base>.rar, and order continuations by volume number. rardecode then follows the whole archive natively via nextOldVolName.
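
The contiguous ordinal can be sketched roughly as below. This is an illustration only, not the actual rarVolumeNumber: `volumeOrdinal` is a hypothetical name, and mapping `.rar` to 0 is an assumption chosen so that `.r99` lands on 100 and `.s00` on 101 as stated above.

```go
package main

import (
	"fmt"
	"strings"
)

// volumeOrdinal is a hypothetical helper, not AltMount's rarVolumeNumber.
// It maps an old-style RAR extension to a contiguous ordinal:
// ".rar" -> 0 (assumed), ".r00" -> 1, ".r99" -> 100, ".s00" -> 101.
// The second result is false when ext is not an old-style volume extension.
func volumeOrdinal(ext string) (int, bool) {
	ext = strings.ToLower(ext)
	if ext == ".rar" {
		return 0, true
	}
	if len(ext) != 4 || ext[0] != '.' {
		return 0, false
	}
	letter, d1, d2 := ext[1], ext[2], ext[3]
	if letter < 'r' || letter > 'z' || d1 < '0' || d1 > '9' || d2 < '0' || d2 > '9' {
		return 0, false
	}
	n := int(d1-'0')*10 + int(d2-'0')
	// Each letter after 'r' contributes a full block of 100 volumes.
	return int(letter-'r')*100 + n + 1, true
}

func main() {
	for _, ext := range []string{".rar", ".r00", ".r99", ".s00", ".s12"} {
		n, ok := volumeOrdinal(ext)
		fmt.Println(ext, n, ok)
	}
}
```

With ordinals like these, a gap check only needs to look for a missing integer, with no special case at the letter boundary.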

Conservative guards keep genuinely separate archives untouched (unrelated bases, multiple first volumes, multi-volume starters) — verified by the existing TestGroupArchivesByBaseName (a continuation set + an unrelated single-part archive stays 2 groups).
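
The "base merely extends by a .<suffix>" guard could look something like the following sketch. `extendsBySuffix` is a hypothetical name, and treating the suffix as a single dot-free component is an assumption; the real reconcileRepostedFirstVolume may accept a different shape.

```go
package main

import (
	"fmt"
	"strings"
)

// extendsBySuffix reports whether firstBase is contBase followed by exactly
// one non-empty ".<suffix>" component. Hypothetical helper for illustration.
func extendsBySuffix(firstBase, contBase string) bool {
	if contBase == "" || !strings.HasPrefix(firstBase, contBase) {
		return false
	}
	rest := firstBase[len(contBase):]
	if len(rest) < 2 || rest[0] != '.' {
		return false
	}
	// Assumption: the repost suffix is one component with no further dots.
	return !strings.Contains(rest[1:], ".")
}

func main() {
	fmt.Println(extendsBySuffix("movie.repost", "movie"))  // true
	fmt.Println(extendsBySuffix("other.repost", "movie"))  // false
	fmt.Println(extendsBySuffix("moviex.repost", "movie")) // false
}
```

Requiring the dot boundary is what keeps an unrelated base that merely shares a prefix from being folded in.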

Tests

  • TestGroupArchivesReconcilesRepostedFirstVolume — the real Zero.Effect shape (114 volumes) collapses to one group with the first volume renamed to <base>.rar.
  • TestReconcileLeavesSeparateArchivesAlone + existing TestGroupArchivesByBaseName — unrelated archives stay separate.
  • IsRarFile/SetKey/rarVolumeNumber/groupHasVolumeGap rollover cases.

go build ./..., go test -race ./internal/importer/..., and ./internal/api/ all pass.

Validation needed

This fixes import-time grouping/ordering. The byte-level RAR reconstruction across the renamed sequence relies on rardecode following <base>.rar → .r00 → … (which it does for normal old-style sets); it can't be exercised offline without the real volumes. Please re-import this NZB (existing metadata is partial and won't self-heal) and confirm streaming works end-to-end.

javi11 added 2 commits June 13, 2026 18:46
Old-style multi-volume RAR sets roll the extension letter after .r99
(.r99 -> .s00 -> ... -> .z99). AltMount only recognized .rar, .r00-.r99
and .partNN.rar, so SetKey() returned no group key for the rollover
volumes and each became its own singleton group. rardecode was then
handed a filesystem missing the continuation volumes, could not follow
.r99 -> .s00, and truncated the extracted media file's segment map to a
single volume — causing streaming to fail with "No segments to download"
(available_bytes ~= one volume vs the full expected file size).
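
For illustration, the naming succession described above (.rar, then .r00 through .r99, then .s00 and so on) can be sketched as follows. This is not rardecode's nextOldVolName; `nextOldStyleName` is a hypothetical function, and stopping after .z99 is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// nextOldStyleName returns the old-style volume name that follows name,
// or false if name is not an old-style volume or has no successor here.
func nextOldStyleName(name string) (string, bool) {
	i := strings.LastIndex(name, ".")
	if i < 0 || len(name)-i != 4 {
		return "", false
	}
	base, ext := name[:i], strings.ToLower(name[i+1:])
	if ext == "rar" {
		return base + ".r00", true
	}
	letter, d1, d2 := ext[0], ext[1], ext[2]
	if letter < 'r' || letter > 'z' || d1 < '0' || d1 > '9' || d2 < '0' || d2 > '9' {
		return "", false
	}
	n := int(d1-'0')*10 + int(d2-'0')
	if n < 99 {
		return fmt.Sprintf("%s.%c%02d", base, letter, n+1), true
	}
	if letter == 'z' {
		return "", false // assumption: no successor after .z99
	}
	// Roll the letter and reset the two-digit counter.
	return fmt.Sprintf("%s.%c00", base, letter+1), true
}

func main() {
	fmt.Println(nextOldStyleName("movie.rar"))
	fmt.Println(nextOldStyleName("movie.r99"))
}
```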

Teach every RAR-volume recognizer about the r->z + two-digit rollover,
mirroring rardecode's nextOldVolName:

- archive/rar/utils.go: add RollVolPattern; SetKey groups .sNN-.zNN with
  the set; rarVolumeNumber returns a contiguous ordinal across the letter
  boundary (.r99=100, .s00=101) so gap detection does not misfire.
- archive/rar/processor.go & archive/sevenzip/processor.go:
  isRarArchiveFile / parseRarFilename handle rollover volumes.
- parser/fileinfo/detector.go, importer/utils/file_extensions.go,
  api/file_handlers.go: extend the duplicated rarPattern regexes.
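
A minimal sketch of what an extended recognizer regex might look like. The pattern below is an assumption for illustration, not the actual RollVolPattern or rarPattern; `oldStyleVol` and `looksLikeRarVolume` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// oldStyleVol matches names ending in .rar or in a letter r..z followed by
// exactly two digits (.r00-.r99, .s00-.z99), case-insensitively.
// Names like x.part01.rar also match through the .rar alternative.
var oldStyleVol = regexp.MustCompile(`(?i)\.(rar|[r-z][0-9]{2})$`)

// looksLikeRarVolume reports whether name has a RAR volume extension
// under the assumed pattern above.
func looksLikeRarVolume(name string) bool {
	return oldStyleVol.MatchString(name)
}

func main() {
	for _, name := range []string{"a.rar", "a.part01.rar", "a.r99", "a.s00", "a.q00", "a.mkv"} {
		fmt.Println(name, looksLikeRarVolume(name))
	}
}
```

A pattern this broad also accepts unrelated files that happen to end in something like `.t01`, so in practice it would likely be combined with set-level checks such as grouping by base name.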

Regression tests cover IsRarFile, SetKey, rarVolumeNumber,
groupHasVolumeGap rollover cases, and a 114-volume set collapsing to a
single group.

Some Usenet posts repost a multi-volume RAR set's first volume under a
different base name (e.g. movie.repost.part01.rar) while the continuation
volumes keep the original base (movie.r00..r99 / .s00..s12, no plain
movie.rar). GroupArchivesByBaseName split these by base name, isolating
the only volume with a real archive header from its continuations — so
rardecode mapped just that one volume (e.g. a single ~100 MB volume),
truncating the extracted media file and breaking streaming with
"No segments to download".

When exactly one group is a lone first volume and every other group is
pure continuations sharing one roll/numeric scheme AND a base that the
first volume's base merely extends by a ".<suffix>" (the repost signal),
fold them into one set: rename the first volume to <base>.rar and order
the continuations by volume number so rardecode follows the whole archive
natively. Guards keep genuinely separate archives (unrelated bases,
multiple first volumes, multi-volume starters) untouched.
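
The fold step might be sketched like this, assuming the guards have already passed and all continuations share one base. `foldRepostedSet` is a hypothetical helper; it relies on the observation that, for a shared base, .rNN through .zNN names sort lexicographically in volume order, whereas the real code orders by volume number.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// foldRepostedSet returns the set in reading order: the first volume under
// its new name <base>.rar, then the continuations sorted by name.
// Assumption: every continuation is <base>.<letter><NN> with the same base.
func foldRepostedSet(base string, continuations []string) []string {
	ordered := append([]string(nil), continuations...) // do not mutate the input
	sort.Slice(ordered, func(i, j int) bool {
		return strings.ToLower(ordered[i]) < strings.ToLower(ordered[j])
	})
	return append([]string{base + ".rar"}, ordered...)
}

func main() {
	set := foldRepostedSet("movie", []string{"movie.s00", "movie.r01", "movie.r99", "movie.r00"})
	fmt.Println(set) // [movie.rar movie.r00 movie.r01 movie.r99 movie.s00]
}
```

The first volume is placed explicitly rather than sorted with the rest, because `.rar` would otherwise sort after `.r99`.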

Depends on the .sNN rollover recognition in this branch, which is what
groups .s00..s12 with .r00..r99 and orders them contiguously.
javi11 changed the title from "fix(importer): recognize old-style RAR rollover volumes (.s00…/.z99)" to "fix(importer): reconstruct multi-volume RAR sets with reposted/rollover volumes" on Jun 13, 2026
javi11 merged commit 6a8a740 into main on Jun 13, 2026
2 checks passed
javi11 deleted the session/suspicious-noether-3ea5e3 branch on June 13, 2026 22:27