
Add custom chunk size support and fix total_chunk_hash calculation#1

Closed
fabudakt wants to merge 2 commits into anthonychaussin:master from fabudakt:Upload-File-With-Specific-Chunk-Size

Conversation

@fabudakt
Contributor

@fabudakt fabudakt commented Oct 9, 2025

  • Add optional chunkSize parameter to KDriveClient constructor
  • Fix total_chunk_hash calculation to hash concatenated chunk hash hex strings
  • Use IncrementalHash for efficient hash computation during file chunking

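The total_chunk_hash scheme described above is easy to sketch outside C#. Here is a minimal Python illustration (the function name and default chunk size are illustrative, not part of the PR) of hashing the concatenation of each chunk's lowercase hex SHA-256 digest, mirroring the incremental AppendData-per-chunk approach rather than building one big joined string:

```python
import hashlib

def total_chunk_hash(data: bytes, chunk_size: int = 1024 * 1024) -> str:
    """SHA-256 over the concatenated lowercase hex digests of each chunk's SHA-256."""
    total = hashlib.sha256()
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        chunk_hex = hashlib.sha256(chunk).hexdigest()  # lowercase hex string
        total.update(chunk_hex.encode("utf-8"))        # feed the hex text, not the raw digest bytes
    return total.hexdigest()
```

As the discussion below shows, this value was ultimately not what the kDrive API expected, but the computation itself is what the documentation describes.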
@anthonychaussin anthonychaussin self-assigned this Oct 9, 2025
@anthonychaussin
Owner

Hello !

Thank you for your contribution; I read through it with great pleasure.
I'm looking at what I can keep, because unfortunately the hash calculation is not correct :/

I know that in the documentation, they say that it corresponds to the hash of all the chunks, but I realized that this is not true. I had the same code as you at the beginning.
I assumed that I had misinterpreted something, but I think it's really a mistake on their part.

Here is the result when I try the code you suggest with my API token:

kDriveClient.Models.Exceptions.KDriveApiException : 'upload_failed_error: upload hash mismatch. Expected: sha256:8efea68f2699b7f267b545c1dd1760089caf76ac41350167a2a83f20b581285e , received: sha256:7d857bb8e53d9bb7f6e8334b6635e1c56c8cf4dcae1b60cfa82a2af40447cfad'

If you have any ideas, I'm all ears!
I spent a long time trying to figure out what was wrong with this part before giving up.

I'll start by updating it with the rest of your code, which is flawless.

Thank you for your contribution.

Comment on lines +84 to +89
this.Chunks.Add(new KDriveChunk(content, chunkNumber++, chunkHash));

// Add chunk hash hex string (lowercase) to compute total_chunk_hash
// According to kDrive API, total_chunk_hash is the hash of concatenated chunk hash hex strings
var chunkHashHex = Convert.ToHexString(chunkHash).ToLowerInvariant();
fileSha256.AppendData(Encoding.UTF8.GetBytes(chunkHashHex));
Owner


Hello,

Unfortunately, it doesn't work.
I don't know if it's a misinterpretation of their documentation or if it's wrong, but I did the same thing at first and it doesn't work.

I had this code:

        public void SplitIntoChunks(int chunkSize)
        {
            var buffer = new byte[chunkSize];
            int chunkNumber = 0;
            int bytesRead;

            this.Content.Position = 0;

            while ((bytesRead = this.Content.Read(buffer, 0, chunkSize)) > 0)
            {
                byte[] content = [.. buffer.Take(bytesRead)];
                this.Chunks.Add(new KDriveChunk(content, chunkNumber++, SHA256.HashData(content)));
            }


            // total_chunk_hash: hash of the concatenated chunk hash hex strings (per the kDrive API docs)
            this.TotalChunkHash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(String.Join(String.Empty, this.Chunks.ConvertAll(c => c.ChunkHash.ToLower())))));

            this.Content.Position = 0;
        }

And it gives me the same hash verification error as in my other comment:

kDriveClient.Models.Exceptions.KDriveApiException : 'upload_failed_error: upload hash mismatch. Expected: sha256:8efea68f2699b7f267b545c1dd1760089caf76ac41350167a2a83f20b581285e , received: sha256:7d857bb8e53d9bb7f6e8334b6635e1c56c8cf4dcae1b60cfa82a2af40447cfad'

I haven't found another solution yet, nor why this behavior occurs :/

Comment on lines +70 to +80
if (!customChunkSize.HasValue)
{
    DynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (calculated from speed test)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
else
{
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (custom)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
Owner


Suggested change

Before:

if (!customChunkSize.HasValue)
{
    DynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (calculated from speed test)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
else
{
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (custom)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}

After:

private async Task InitializeUploadStrategyAsync(int? customChunkSize = null, CancellationToken ct = default)
{
    if (customChunkSize != null)
    {
        DynamicChunkSizeBytes = (int)customChunkSize;
        DirectUploadThresholdBytes = (int)customChunkSize;
        this.Logger?.LogInformation("Custom chunk size is provided. Speed test is no longer needed");
        this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (custom)",
            DirectUploadThresholdBytes, DynamicChunkSizeBytes);
        return;
    }

    this.Logger?.LogInformation("Starting upload strategy initialization...");
    var buffer = new byte[1024 * 1024];
    RandomNumberGenerator.Fill(buffer);
    this.Logger?.LogInformation("Generated test Data of size {Size} bytes.", buffer.Length);
    var testFile = new Models.KDriveFile
    {
        Name = "speedtest.dat",
        DirectoryPath = "/Private",
        Content = new ByteArrayContent(buffer)
        {
            Headers =
            {
                ContentType = new MediaTypeHeaderValue("application/octet-stream"),
                ContentLength = buffer.Length
            }
        }.ReadAsStream(ct)
    };
    testFile.SplitIntoChunks(buffer.Length);
    this.Logger?.LogInformation("Test file created with {ChunkCount} chunks of size {ChunkSize} bytes.", testFile.Chunks.Count, buffer.Length);
    this.Logger?.LogInformation("Starting upload session for speed test...");
    var (SessionToken, UploadUrl) = await StartUploadSessionAsync(testFile, ct);
    this.Logger?.LogInformation("Upload session started with token: {SessionToken} and URL: {UploadUrl}", SessionToken, UploadUrl);
    this.Logger?.LogInformation("Uploading first chunk of size {ChunkSize} bytes...", testFile.Chunks.First().ChunkSize);
    var chunkRequest = KDriveRequestFactory.CreateChunkUploadRequest(UploadUrl, SessionToken, this.DriveId, testFile.Chunks.First());
    var sw = System.Diagnostics.Stopwatch.StartNew();
    var response = await SendAsync(chunkRequest, ct);
    sw.Stop();
    this.Logger?.LogInformation("Chunk upload completed in {ElapsedMilliseconds} ms.", sw.ElapsedMilliseconds);
    try
    {
        response.EnsureSuccessStatusCode();
    }
    catch (HttpRequestException ex)
    {
        this.Logger?.LogError(ex, "Failed to upload chunk: {Message}", ex.Message);
        throw;
    }
    this.Logger?.LogInformation("Chunk uploaded successfully. Response: {Response}", await response.Content.ReadAsStringAsync(ct));
    this.Logger?.LogInformation("Finalizing upload session...");
    await CancelUploadSessionRequest(SessionToken, ct);
    this.Logger?.LogInformation("Upload session finalized successfully.");
    var speedBytesPerSec = buffer.Length / (sw.ElapsedMilliseconds / 1000.0);
    DirectUploadThresholdBytes = (long)speedBytesPerSec;
    DynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (calculated from speed test)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}

Hi,

I think that since the speed test is mainly used to determine the size of the chunks,
if we provide a custom size, we can bypass the speed test.
What do you think?

Contributor Author


Yeah, sure, that's the better solution.

The hash mismatch error was caused by sending an incorrect total_chunk_hash value in the finish upload session request.

Root Cause
The kDrive API does NOT require (and actually rejects) the total_chunk_hash parameter in the finish session request. The API validates file integrity using the individual chunk hashes that are sent with each chunk upload.
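In other words, integrity checking stays per-chunk. A minimal language-agnostic sketch of that working approach in Python (names are illustrative, not the library's API): each chunk carries its own SHA-256 digest when uploaded, and no aggregate hash is sent when the session finishes:

```python
import hashlib

def chunk_hashes(data: bytes, chunk_size: int) -> list[tuple[int, bytes, str]]:
    """Split data into chunks, pairing each with its own SHA-256 hex digest.

    Each (number, content, digest) triple would accompany that chunk's upload;
    nothing is aggregated into a total_chunk_hash at session finish.
    """
    chunks = []
    for number, offset in enumerate(range(0, len(data), chunk_size)):
        content = data[offset:offset + chunk_size]
        chunks.append((number, content, hashlib.sha256(content).hexdigest()))
    return chunks
```

The server can then verify each chunk independently as it arrives, which also localizes any corruption to a single chunk instead of failing the whole session at the end.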
@fabudakt
Contributor Author


Hey there!
I see, strange. I could upload different files before.

I pushed a fix which takes another approach. I could upload a 5MB, 200MB and a 1.6 GB file like this.
Can you check whether it works for you?

Thanks

@anthonychaussin
Owner

Hi,

I've found my mistake!
Thanks for your help. I will soon push a new version with your changes and some other improvements.

I just need to run a few tests on > 1GB files

anthonychaussin added a commit that referenced this pull request Oct 14, 2025
@anthonychaussin anthonychaussin mentioned this pull request Oct 14, 2025
anthonychaussin added a commit that referenced this pull request Oct 15, 2025
V1.0.3

- Improve chunk size estimation
- Improve memory management
- Complete kDrive File Object
- Fix upload response wrapper
- Fix hash computation
- Bypass speed test if custom chunk size is provided
- Fix user-agent version

Changes from #1 (Add custom chunk size support and fix total_chunk_hash calculation):

- Add optional chunkSize parameter to KDriveClient constructor
- Fix total_chunk_hash calculation to hash concatenated chunk hash hex strings
- Use IncrementalHash for efficient hash computation during file chunking
@anthonychaussin
Owner

The changes are on master ;)
Feel free to open an issue or a new PR if you have other contributions, and thanks for your help!

@fabudakt fabudakt deleted the Upload-File-With-Specific-Chunk-Size branch October 15, 2025 10:35