
extract, list: auto-detect and decompress gzip archives #168

Open
kevinburke wants to merge 1 commit into uutils:main from kevinburke:kb-gzip-support


@kevinburke

Previously, extract and list passed the raw file bytes directly to the tar parser without decompression. When given a .tar.gz file, the compressed gzip stream was interpreted as tar headers, producing errors like "numeric field did not have utf-8 text" on the checksum field.
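
To illustrate the failure mode, here is a minimal sketch of how a tar reader might parse the checksum field, which sits at offset 148 of the 512-byte header and holds octal ASCII digits. The helper name `parse_checksum` and its error strings are hypothetical and only mirror the message quoted above; this is not the parser's actual code. When the first block is really compressed gzip data, those eight bytes are effectively arbitrary, so the text conversion usually fails.

```rust
use std::str;

/// Offset and length of the checksum field in a 512-byte tar header.
const CKSUM_OFFSET: usize = 148;
const CKSUM_LEN: usize = 8;

/// Hypothetical helper (not from the patch): parse the checksum field
/// as octal ASCII text padded with NULs and/or spaces.
fn parse_checksum(header: &[u8; 512]) -> Result<u32, String> {
    let field = &header[CKSUM_OFFSET..CKSUM_OFFSET + CKSUM_LEN];
    // Compressed bytes landing here are rarely valid UTF-8, let alone octal.
    let text = str::from_utf8(field)
        .map_err(|_| "numeric field did not have utf-8 text".to_string())?;
    let digits = text.trim_matches(|c| c == '\0' || c == ' ');
    u32::from_str_radix(digits, 8)
        .map_err(|e| format!("numeric field was not octal: {e}"))
}
```

A header whose checksum field contains `0012345\0` parses cleanly, while a block of non-text bytes is rejected before the archive contents are ever reached.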

Detect gzip compression by reading the two-byte magic number (0x1f 0x8b) at the start of the file, and wrap the reader in a GzDecoder when present. Plain .tar files continue to work as before.
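
As a rough sketch of the detection step described above (not the patch's actual code), the following uses only the standard library. The helper name `sniff_gzip` is an assumption made for illustration. It peeks at up to two bytes, compares them with the gzip magic number, and returns a reader that replays the peeked bytes so the caller still sees the full stream. In the real change the gzip case would then be wrapped in a decoder such as `GzDecoder`; that step is omitted here to keep the example dependency-free.

```rust
use std::io::{self, Chain, Cursor, Read};

/// The two-byte gzip magic number.
const GZIP_MAGIC: [u8; 2] = [0x1f, 0x8b];

/// Hypothetical helper (name is an assumption): report whether the stream
/// starts with the gzip magic number, and hand back a reader that yields
/// the peeked bytes followed by the rest of the input.
fn sniff_gzip<R: Read>(mut reader: R) -> io::Result<(bool, Chain<Cursor<Vec<u8>>, R>)> {
    let mut head = Vec::with_capacity(2);
    // `take(2)` with `read_to_end` tolerates short reads and inputs
    // shorter than two bytes.
    reader.by_ref().take(2).read_to_end(&mut head)?;
    let is_gzip = head == GZIP_MAGIC;
    // Replay the consumed bytes ahead of the remaining stream.
    Ok((is_gzip, Cursor::new(head).chain(reader)))
}
```

Replaying the peeked bytes instead of seeking keeps the helper usable with non-seekable inputs, which may matter once reading from stdin is supported.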

Confirmed this patch allows extraction of Go source code from https://go.dev/dl/ (previously we would get an error).

@codecov

codecov Bot commented Apr 10, 2026

Codecov Report

❌ Patch coverage is 96.31148% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.54%. Comparing base (2c7a5e6) to head (63389ab).
⚠️ Report is 29 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/uu/tar/src/compression.rs | 83.92% | 9 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #168       +/-   ##
===========================================
+ Coverage   60.83%   96.54%   +35.70%     
===========================================
  Files           7       10        +3     
  Lines         789     1157      +368     
  Branches       24       26        +2     
===========================================
+ Hits          480     1117      +637     
+ Misses        309       39      -270     
- Partials        0        1        +1     

| Flag | Coverage Δ |
|---|---|
| macos_latest | 96.54% <96.31%> (+35.70%) ⬆️ |
| ubuntu_latest | 96.54% <96.31%> (+35.70%) ⬆️ |
| windows_latest | 0.00% <0.00%> (ø) |


@kaladron
Collaborator

Hi Kevin! I'm holding this for just a bit while I land support for reading from stdin, as that will change some function interfaces. I'll review this right after.

@kaladron kaladron self-requested a review April 10, 2026 06:38
@kaladron kaladron self-assigned this Apr 10, 2026
@kaladron
Collaborator

Please tag #158 in the commit description.

@kevinburke
Author

I see quite a lot of patches just came in; anything I can help review?

@kaladron
Collaborator

> I see quite a lot of patches just came in; anything I can help review?

I filed issues for each piece of missing functionality so that we can coordinate. If more people are showing up, I want them to be able to claim an issue rather than run into conflicts. But I'll tag you in two others (that are blocking these at the moment).

@kevinburke kevinburke force-pushed the kb-gzip-support branch 3 times, most recently from b98b8d5 to 08cf2ca Compare April 13, 2026 16:15
Teach tar to auto-detect gzip-compressed archives for list and extract
operations while keeping archive creation explicit via -z/--gzip.

The implementation now routes archive I/O through a shared compression
helper so the read path can sniff gzip input and the write path can wrap
output in gzip only when requested.

Add integration tests covering gzip create, list, extract, explicit -z
on extract/list, round-tripping a gzip archive, and invalid gzip input
failure behavior.
