Add custom chunk size support and fix total_chunk_hash calculation #1
fabudakt wants to merge 2 commits into anthonychaussin:master
Conversation

fabudakt commented Oct 9, 2025
- Add optional chunkSize parameter to KDriveClient constructor
- Fix total_chunk_hash calculation to hash concatenated chunk hash hex strings
- Use IncrementalHash for efficient hash computation during file chunking
Hello! Thank you for your contribution, I read it with great pleasure. I know that the documentation says it corresponds to the hash of all the chunks, but I realized that this is not true. I had the same code as you at the beginning. Here is the result when I try the code you suggest with my API token:

```
kDriveClient.Models.Exceptions.KDriveApiException : 'upload_failed_error: upload hash mismatch. Expected: sha256:8efea68f2699b7f267b545c1dd1760089caf76ac41350167a2a83f20b581285e , received: sha256:7d857bb8e53d9bb7f6e8334b6635e1c56c8cf4dcae1b60cfa82a2af40447cfad'
```

If you have any ideas, I'm all ears! I'll start by merging the rest of your code, which is flawless. Thank you for your contribution.
DriveClient/Models/KDriveFile.cs (Outdated)
```csharp
this.Chunks.Add(new KDriveChunk(content, chunkNumber++, chunkHash));

// Add chunk hash hex string (lowercase) to compute total_chunk_hash
// According to kDrive API, total_chunk_hash is the hash of concatenated chunk hash hex strings
var chunkHashHex = Convert.ToHexString(chunkHash).ToLowerInvariant();
fileSha256.AppendData(Encoding.UTF8.GetBytes(chunkHashHex));
```
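For reference, the concatenated-hex computation this diff performs can be reproduced standalone. This is a minimal sketch (the class name and sample chunk contents are illustrative, not the library's code); note that the rest of the thread finds the API ultimately rejects this value:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

class TotalChunkHashDemo
{
    static void Main()
    {
        // Two sample "chunks" standing in for pieces of a file.
        byte[][] chunks = { Encoding.UTF8.GetBytes("hello"), Encoding.UTF8.GetBytes("world") };

        // Hash the concatenation of each chunk hash's lowercase hex string,
        // as the diff does with IncrementalHash.AppendData.
        using var total = IncrementalHash.CreateHash(HashAlgorithmName.SHA256);
        foreach (var chunk in chunks)
        {
            var chunkHashHex = Convert.ToHexString(SHA256.HashData(chunk)).ToLowerInvariant();
            total.AppendData(Encoding.UTF8.GetBytes(chunkHashHex));
        }

        // Prints a 64-character lowercase hex digest of the concatenated hex strings.
        Console.WriteLine(Convert.ToHexString(total.GetHashAndReset()).ToLowerInvariant());
    }
}
```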
Hello,
Unfortunately, it doesn't work.
I don't know if it's a misinterpretation of their documentation or if it's wrong, but I did the same thing at first and it doesn't work.
I had this code:
```csharp
public void SplitIntoChunks(int chunkSize)
{
    var buffer = new byte[chunkSize];
    int chunkNumber = 0;
    int bytesRead;
    this.Content.Position = 0;
    while ((bytesRead = this.Content.Read(buffer, 0, chunkSize)) > 0)
    {
        byte[] content = [.. buffer.Take(bytesRead)];
        this.Chunks.Add(new KDriveChunk(content, chunkNumber++, SHA256.HashData(content)));
    }

    // Add chunk hash hex string (lowercase) to compute total_chunk_hash
    // According to kDrive API, total_chunk_hash is the hash of concatenated chunk hash hex strings
    // Compute and store the total chunk hash (hash of all chunk hash hex strings concatenated)
    this.TotalChunkHash = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(String.Join(String.Empty, this.Chunks.ConvertAll(c => c.ChunkHash.ToLower())))));
    this.Content.Position = 0;
}
```

And it gives me the same hash verification error as in my other comment:

```
kDriveClient.Models.Exceptions.KDriveApiException : 'upload_failed_error: upload hash mismatch. Expected: sha256:8efea68f2699b7f267b545c1dd1760089caf76ac41350167a2a83f20b581285e , received: sha256:7d857bb8e53d9bb7f6e8334b6635e1c56c8cf4dcae1b60cfa82a2af40447cfad'
```

I haven't found another solution yet, nor why this behavior occurs :/
```csharp
if (!customChunkSize.HasValue)
{
    DynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (calculated from speed test)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
else
{
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (custom)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
```
Suggested change:

```csharp
private async Task InitializeUploadStrategyAsync(int? customChunkSize = null, CancellationToken ct = default)
{
    if (customChunkSize != null)
    {
        DynamicChunkSizeBytes = (int)customChunkSize;
        DirectUploadThresholdBytes = (int)customChunkSize;
        this.Logger?.LogInformation("Custom chunk size is provided. Speed test is no longer needed");
        this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (custom)",
            DirectUploadThresholdBytes, DynamicChunkSizeBytes);
        return;
    }

    this.Logger?.LogInformation("Starting upload strategy initialization...");
    var buffer = new byte[1024 * 1024];
    RandomNumberGenerator.Fill(buffer);
    this.Logger?.LogInformation("Generated test Data of size {Size} bytes.", buffer.Length);
    var testFile = new Models.KDriveFile
    {
        Name = "speedtest.dat",
        DirectoryPath = "/Private",
        Content = new ByteArrayContent(buffer)
        {
            Headers =
            {
                ContentType = new MediaTypeHeaderValue("application/octet-stream"),
                ContentLength = buffer.Length
            }
        }.ReadAsStream(ct)
    };
    testFile.SplitIntoChunks(buffer.Length);
    this.Logger?.LogInformation("Test file created with {ChunkCount} chunks of size {ChunkSize} bytes.", testFile.Chunks.Count, buffer.Length);
    this.Logger?.LogInformation("Starting upload session for speed test...");
    var (SessionToken, UploadUrl) = await StartUploadSessionAsync(testFile, ct);
    this.Logger?.LogInformation("Upload session started with token: {SessionToken} and URL: {UploadUrl}", SessionToken, UploadUrl);
    this.Logger?.LogInformation("Uploading first chunk of size {ChunkSize} bytes...", testFile.Chunks.First().ChunkSize);
    var chunkRequest = KDriveRequestFactory.CreateChunkUploadRequest(UploadUrl, SessionToken, this.DriveId, testFile.Chunks.First());
    var sw = System.Diagnostics.Stopwatch.StartNew();
    var response = await SendAsync(chunkRequest, ct);
    sw.Stop();
    this.Logger?.LogInformation("Chunk upload completed in {ElapsedMilliseconds} ms.", sw.ElapsedMilliseconds);
    try
    {
        response.EnsureSuccessStatusCode();
    }
    catch (HttpRequestException ex)
    {
        this.Logger?.LogError(ex, "Failed to upload chunk: {Message}", ex.Message);
        throw;
    }
    this.Logger?.LogInformation("Chunk uploaded successfully. Response: {Response}", await response.Content.ReadAsStringAsync(ct));
    this.Logger?.LogInformation("Finalizing upload session...");
    await CancelUploadSessionRequest(SessionToken, ct);
    this.Logger?.LogInformation("Upload session finalized successfully.");
    var speedBytesPerSec = buffer.Length / (sw.ElapsedMilliseconds / 1000.0);
    DirectUploadThresholdBytes = (long)speedBytesPerSec;
    DynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);
    this.Logger?.LogInformation("Upload strategy initialized: DirectUploadThresholdBytes = {DirectUploadThresholdBytes}, DynamicChunkSizeBytes = {DynamicChunkSizeBytes} (calculated from speed test)",
        DirectUploadThresholdBytes, DynamicChunkSizeBytes);
}
```
Hi,
I think that since the speed test is mainly used to determine the size of the chunks,
if we provide a custom size, we can bypass the speed test.
What do you think?
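For intuition about the speed-test path: DynamicChunkSizeBytes ends up as 90% of the measured bytes-per-second, i.e. a chunk sized to upload in just under a second. A standalone sketch with hypothetical numbers (a 1 MiB test payload uploaded in 1000 ms):

```csharp
using System;

class ChunkSizeDemo
{
    static void Main()
    {
        // Hypothetical measurement: the 1 MiB test chunk took 1000 ms to upload.
        long testBytes = 1024 * 1024;
        double elapsedMs = 1000.0;

        // Same formula as the snippet above: throughput, then 90% of it.
        double speedBytesPerSec = testBytes / (elapsedMs / 1000.0); // 1,048,576 B/s
        int dynamicChunkSizeBytes = (int)(speedBytesPerSec * 0.9);

        Console.WriteLine(dynamicChunkSizeBytes); // prints 943718
    }
}
```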
Yeah, sure, that's by far the better solution.
The hash mismatch error was caused by sending an incorrect total_chunk_hash value in the finish upload session request.

Root cause: the kDrive API does NOT require (and actually rejects) the total_chunk_hash parameter in the finish session request. The API validates file integrity using the individual chunk hashes that are sent with each chunk upload.
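Given that root cause, the chunking only needs the per-chunk hashes and can drop the total hash entirely. A minimal standalone sketch of that splitting loop (the demo class and sample sizes are illustrative, not the library's code):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

class ChunkingDemo
{
    static void Main()
    {
        // A 10-byte payload with a 4-byte chunk size -> chunks of 4, 4 and 2 bytes.
        var content = new MemoryStream(new byte[10]);
        const int chunkSize = 4;
        var buffer = new byte[chunkSize];
        int chunkNumber = 0;
        int bytesRead;

        content.Position = 0;
        while ((bytesRead = content.Read(buffer, 0, chunkSize)) > 0)
        {
            byte[] chunk = buffer.Take(bytesRead).ToArray();
            // Only the per-chunk SHA-256 is kept; no total_chunk_hash is computed,
            // since the API validates integrity from the hashes sent with each chunk.
            byte[] chunkHash = SHA256.HashData(chunk);
            Console.WriteLine($"chunk {chunkNumber++}: {chunk.Length} bytes, sha256 {Convert.ToHexString(chunkHash).ToLowerInvariant()}");
        }
        Console.WriteLine($"total chunks: {chunkNumber}"); // prints "total chunks: 3"
    }
}
```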
Hey there! I pushed a fix which takes another approach. I could upload a 5 MB, a 200 MB and a 1.6 GB file like this. Thanks
Hi, I've found my mistake! I just need to run a few tests on > 1 GB files
V1.0.3

- Improve chunk size estimation
- Improve memory management
- Complete kDrive File Object
- Fix upload response wrapper
- Fix hash computation
- Bypass speed test if custom chunk size is provided
- Fix user-agent version

Changes from "Add custom chunk size support and fix total_chunk_hash calculation" #1:

- Add optional chunkSize parameter to KDriveClient constructor
- Fix total_chunk_hash calculation to hash concatenated chunk hash hex strings
- Use IncrementalHash for efficient hash computation during file chunking
The changes are on master ;)