fix(importer): handle non-fixed-width multi-volume RAR volume naming #686

Merged
javi11 merged 1 commit into main from session/epic-haibt-f582e5 on Jun 14, 2026

javi11 (Owner) commented Jun 14, 2026

Problem

Multi-volume RAR sets numbered without consistent zero-padding (…part01.rar through …part09.rar, then …part010.rar through …part0NN.rar, i.e. literally part0 + the unpadded number) failed to import and play. Example: a 259-volume, 27.2 GB stored REMUX imported with only 946 MB of segments (9 volumes) and failed at playback with:

[createUsenetReader] No segments to download
  start=0 end=27223903104 available_bytes=946944000 expected_file_size=27223903105

Root cause

The rardecode fork follows a multi-volume set by computing the next volume name at the first volume's digit width (nextNewVolName). After …part09.rar (2 digits) it asks for …part10.rar, but the real file is …part010.rar. This width mismatch broke the set in two places:

  1. Volume following: UsenetFileSystem.Open missed …part10.rar → rardecode stopped at volume 9.
  2. Part → segment mapping: convertAggregatedFilesToRarContent mapped rardecode's computed part.Path (…part10.rar) back to the NZB file via exact-name lookup, so even when all volumes were followed, only part01..part09 matched and the rest were dropped → import failed with "no files were successfully processed".
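
The width mismatch can be reproduced in isolation. The sketch below is not rardecode's nextNewVolName; nextFixedWidth and the movie.* file names are hypothetical stand-ins that only mimic the behaviour described above (increment the number, keep the current digit width).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// partRe splits a "…partNN.rar" name into prefix, digit run, and suffix.
var partRe = regexp.MustCompile(`^(.*part)(\d+)(\.rar)$`)

// nextFixedWidth is a hypothetical stand-in for width-preserving volume
// naming: it increments the volume number and zero-pads the result to the
// digit width of the name it was given.
func nextFixedWidth(name string) (string, bool) {
	m := partRe.FindStringSubmatch(name)
	if m == nil {
		return "", false
	}
	n, err := strconv.Atoi(m[2])
	if err != nil {
		return "", false
	}
	return fmt.Sprintf("%s%0*d%s", m[1], len(m[2]), n+1, m[3]), true
}

func main() {
	// Names as they appear in the inconsistently padded set.
	present := map[string]bool{
		"movie.part09.rar":  true,
		"movie.part010.rar": true,
	}
	next, _ := nextFixedWidth("movie.part09.rar")
	fmt.Println(next, present[next]) // movie.part10.rar false
}
```

An exact-name lookup for the computed name misses, which is the point at which both volume following and the part → segment mapping stopped.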

Fix

  • New leaf package internal/importer/rarname: SetKey / VolumeNumber / Scheme + regexes as the single source of truth. archive re-exports them and rar aliases archive.*. (A leaf package is required because archive → archive/iso → filesystem, so the helpers can't live in archive or rar without an import cycle.)
  • UsenetFileSystem & DecryptingFileSystem: on an exact-name miss, Open/Stat resolve a volume by (set, scheme, volume-number), so a request for …part10.rar resolves to …part010.rar regardless of padding width.
  • convertAggregatedFilesToRarContent + nested mapNestedFile*: resolve part.Path → source the same way via a generic partLocator[T], so every followed volume's segments are attached.
  • Coverage guard in AnalyzeRarContentFromNzb: fail an import when followed volume bytes are < 80% of supplied volume bytes, surfacing truncated/incomplete sets at import time instead of at playback.
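
The width-independent lookup in the second and third bullets can be sketched roughly as follows. This is a minimal illustration, not the rarname or partLocator code from this PR: volumeKey, parseVolume, volumeIndex, and the movie.* names are all hypothetical, and only the partNN.rar scheme is handled.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// partRe matches "<set>.part<digits>.rar" with any digit width.
var partRe = regexp.MustCompile(`(?i)^(.*)\.part(\d+)\.rar$`)

// volumeKey identifies a volume independently of zero-padding.
// All names in this sketch are hypothetical, not the PR's API.
type volumeKey struct {
	set    string
	scheme string
	number int
}

// parseVolume extracts (set, scheme, volume number) from a file name.
func parseVolume(name string) (volumeKey, bool) {
	m := partRe.FindStringSubmatch(name)
	if m == nil {
		return volumeKey{}, false
	}
	n, err := strconv.Atoi(m[2]) // "010" and "10" both parse to 10
	if err != nil {
		return volumeKey{}, false
	}
	return volumeKey{set: strings.ToLower(m[1]), scheme: "part", number: n}, true
}

// volumeIndex resolves names exactly first, then by volumeKey.
type volumeIndex struct {
	exact map[string]bool
	byKey map[volumeKey]string
}

func newVolumeIndex(names []string) *volumeIndex {
	idx := &volumeIndex{exact: map[string]bool{}, byKey: map[volumeKey]string{}}
	for _, name := range names {
		idx.exact[name] = true
		if k, ok := parseVolume(name); ok {
			idx.byKey[k] = name
		}
	}
	return idx
}

// resolve returns the real file name for a requested volume name.
func (idx *volumeIndex) resolve(requested string) (string, bool) {
	if idx.exact[requested] {
		return requested, true
	}
	k, ok := parseVolume(requested)
	if !ok {
		return "", false
	}
	name, ok := idx.byKey[k]
	return name, ok
}

func main() {
	idx := newVolumeIndex([]string{"movie.part09.rar", "movie.part010.rar"})
	name, ok := idx.resolve("movie.part10.rar")
	fmt.Println(name, ok) // movie.part010.rar true
}
```

Including the set name in the key keeps two different sets in the same NZB from resolving to each other's volumes, which is the cross-set collision case the unit tests mention.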

Tests

  • Unit tests: rarname (width-independent parsing), the volume_index/partLocator resolvers (incl. cross-set collision), the truncation guard, and the segment mapping (TestConvertAggregatedFilesWidthMismatch).
  • End-to-end import battery test TestImportBattery_RarWidthMismatchVolumeNaming, backed by a committed 14-volume fixture (part01..part09, part010..part014) generated by testdata/gen_fixtures.sh. Runs the full ProcessNzbFile pipeline over the in-memory fakepool and asserts full segment coverage.
  • Verified RED without the fallback (reproduces "no files were successfully processed") and GREEN with it. go test -race ./internal/importer/... passes; build + gofmt clean.

Reviewer notes

  • The committed binary fixture (~212 KB) is generated by rar + a rename step in gen_fixtures.sh; CI needs no rar (fixtures are committed, loadManifest t.Skips if absent).
  • Encryption was investigated and ruled out: the archive is -m0 stored, unencrypted; has_password=true (from the NZB <meta password>) is harmless and the Password.txt in such releases is a filename de-obfuscation password, not a RAR password.

Multi-volume RAR sets numbered without consistent zero-padding
(part01..part09 then part010..part0NN) failed to import. rardecode follows
a set by computing the next volume name at the first volume's digit width
(part10.rar) while the real file is part010.rar, so volume following AND
the part->segment mapping both stopped after volume 9. The file imported
with only ~3.5% of its segments and failed at playback with
"No segments to download".

- Add internal/importer/rarname leaf package (SetKey / VolumeNumber /
  Scheme + regexes) as the single source of truth; archive re-exports them
  and rar aliases archive.* (avoids an import cycle: archive -> iso ->
  filesystem).
- UsenetFileSystem and DecryptingFileSystem now resolve a requested volume
  by (set, scheme, volume-number) when the exact filename misses, so width
  mismatches still follow the whole set.
- convertAggregatedFilesToRarContent and the nested map functions resolve
  rardecode part.Path -> NZB file the same way via a generic partLocator,
  so every followed volume's segments are attached.
- Add a coverage guard in AnalyzeRarContentFromNzb that fails an import
  when followed volume bytes are < 80% of supplied volume bytes (catches
  truncated sets at import instead of at playback).

Tests: unit tests for rarname, the volume_index/partLocator resolvers and
the segment mapping, plus an end-to-end import battery test
(TestImportBattery_RarWidthMismatchVolumeNaming) backed by a committed
14-volume fixture (part01..part09, part010..part014). Verified RED without
the fallback (reproduces "no files were successfully processed") and GREEN
with it.
@javi11 javi11 merged commit eb19a22 into main Jun 14, 2026
2 checks passed
@javi11 javi11 deleted the session/epic-haibt-f582e5 branch June 14, 2026 15:34