Version 3 ldm opt#4
Conversation
|
Little trick : |
|
Compression ratio : Could you also add a comparison for a use case where I believe it can be achieved by using the same tarball of linux sources, If you are afraid of long compression delays when using |
|
@Cyan4973 Good suggestion - I've updated the original post with a benchmark comparing the case you mentioned, and indeed the new method seems to still maintain the compression ratio gains. I will be including much more comprehensive benchmarks in the actual version of the PR against the real Zstd repo, including some rough estimates on speed. |
|
Quick comparison on my laptop, compressing
Level What worries me though is the discovery that enabling the However, this effect is unrelated to this PR. It was present before! This probably warrants an investigation, though in a different follow up PR. cc @terrelln : the feature ( edit : this is confirmed : the reduction in nb of active threads come from the large expansion of job size when triggering the |
35e9845 to
ebf8794
Compare
| U32 ldmStartPosInBlock = 0; | ||
| U32 ldmEndPosInBlock = 0; | ||
| U32 ldmOffset = 0; | ||
|
|
There was a problem hiding this comment.
@terrelln So this comment below goes into more explicit detail about the surprising behavior I mentioned during our 1:1 and the code below is the my supposed "mitigation", basically pushing back adding the match by adding more bytes to the litLength of the ldm.
There was a problem hiding this comment.
@senhuang42 I see the problem now. This is not the correct solution. This will likely break on large files.
I believe this is caused by btultra2 which does 2 passes over the first block. You can double check by seeing if the problem goes away at level 18, which uses btultra, not btultra2. ` You'll need to reset the LDM seqStore after the first pass, to re-consume the sequences.
|
The multi-threading issue definitely seems to be just a configuration sort of issue - multithreading was enabled when I tried compressing a 5GB file with |
| bytesToSkip -= seq->litLength; | ||
| seq->litLength = 0; | ||
| if (bytesToSkip < seq->matchLength) { | ||
| seq->matchLength -= (U32)bytesToSkip; |
There was a problem hiding this comment.
What happens when matchLength < 3 after this line? That matchLength would be unrepresentable.
There was a problem hiding this comment.
I actually allow matchLength < 3 since: we will always reject any matchLength < MINMATCH later on when we try to add LDMs, and I wanted to keep the functions as simple as possible to avoid any issues with miscounting the number of bytes we actually skipped. It would definitely make sense to note this specifically though.
| if (remainingBytes <= currSeq.litLength) { | ||
| currSeq.offset = 0; | ||
| } else if (remainingBytes < currSeq.litLength + currSeq.matchLength) { | ||
| currSeq.matchLength = remainingBytes - currSeq.litLength; |
There was a problem hiding this comment.
Same question here about matchLength < 3.
| U32 ldmStartPosInBlock = 0; | ||
| U32 ldmEndPosInBlock = 0; | ||
| U32 ldmOffset = 0; | ||
|
|
There was a problem hiding this comment.
@senhuang42 I see the problem now. This is not the correct solution. This will likely break on large files.
I believe this is caused by btultra2 which does 2 passes over the first block. You can double check by seeing if the problem goes away at level 18, which uses btultra, not btultra2. ` You'll need to reset the LDM seqStore after the first pass, to re-consume the sequences.
|
@senhuang42 LDM + MT at level 19 will create 256 MB jobs by default. The code controlling it is here You can control it on the CLI with We should use the cycleLog instead of the chainLog. That will reduce it to a 128 MB job size. Then we could consider using cycleLog+3 instead of +4. We could also take the |
|
@terrelln the note about |
d6ff12c to
de77b83
Compare
ce09360 to
167276d
Compare
167276d to
05fd1be
Compare
a7c9887 to
09eedf6
Compare
Draft PR
Preliminary results regarding compressed size:
New algorithm approach: