feat: add chunk by query methods (#714) #1281
arashackdev wants to merge 3 commits into goravel:master
Conversation
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## master #1281 +/- ##
=======================================
Coverage 68.56% 68.56%
=======================================
Files 264 264
Lines 15566 15566
=======================================
Hits 10673 10673
Misses 4430 4430
Partials 463 463
853bd60 to 9232f6d
fatal: invalid reference: feature/chunk-by-queries
Error: Invalid status code: 128
at ChildProcess.<anonymous> (/home/runner/work/_actions/stefanzweifel/git-auto-commit-action/v7/index.js:17:19)
at ChildProcess.emit (node:events:508:28)
at maybeClose (node:internal/child_process:1101:16)
at ChildProcess._handle.onexit (node:internal/child_process:305:5) {
code: 128
}
Error: Invalid status code: 128
at ChildProcess.<anonymous> (/home/runner/work/_actions/stefanzweifel/git-auto-commit-action/v7/index.js:17:19)
at ChildProcess.emit (node:events:508:28)
at maybeClose (node:internal/child_process:1101:16)
at ChildProcess._handle.onexit (node:internal/child_process:305:5)
Edit: Fixed by regenerating mock file.
Review Ready.
Thanks @arashackdev, I'll check it.
Pull request overview
This PR adds chunking functionality to the ORM Query interface, inspired by Laravel's chunking methods. These methods enable efficient processing of large datasets in batches, reducing memory consumption.
- Introduces four new chunking methods: `Chunk`, `ChunkByID`, `ChunkByIDDesc`, and `OrderedChunkByID`
- Adds validation error for invalid chunk sizes (`OrmQueryChunkZeroOrLess`)
- Includes mock implementations for testing
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| contracts/database/orm/orm.go | Adds interface definitions for the four new chunking methods with documentation |
| database/gorm/query.go | Implements the chunking logic with offset-based (Chunk) and ID-based (ChunkByID, ChunkByIDDesc, OrderedChunkByID) pagination strategies |
| errors/list.go | Adds OrmQueryChunkZeroOrLess error for chunk count validation |
| mocks/database/orm/Query.go | Auto-generated mock implementations for the new chunking methods |
contracts/database/orm/orm.go
Outdated
	BeginTransaction() (Query, error)
	// Chunk processes a given number of records in batches.
	Chunk(count int, callback func([]any) error) error
	// Chunk the results of a query by comparing numeric IDs.
The documentation comment is incorrect. It should describe the ChunkByID method, not be a duplicate of the Chunk comment.
The comment should indicate that ChunkByID chunks results by comparing numeric IDs in ascending order, which avoids issues with offset-based pagination when records are added/deleted during processing.
Suggested change:
- // Chunk the results of a query by comparing numeric IDs.
+ // ChunkByID processes records in batches by comparing numeric IDs in ascending order.
+ // This avoids issues with offset-based pagination when records are added or deleted during processing.
func (r *Query) Chunk(count int, callback func([]any) error) error {
	if count <= 0 {
		return errors.OrmQueryChunkZeroOrLess
	}

	var remaining *int
	if r.conditions.limit != nil {
		limit := *r.conditions.limit
		remaining = &limit
	}

	query := r.addGlobalScopes().buildConditions()

	destType := reflect.TypeOf(r.conditions.model)
	sliceType := reflect.SliceOf(reflect.PointerTo(destType))
	offset := 0

	for remaining == nil || *remaining > 0 {
		chunkLimit := count
		if remaining != nil && *remaining < count {
			chunkLimit = *remaining
		}

		results := reflect.New(sliceType).Interface()

		chunkQuery := query.Offset(offset).Limit(chunkLimit).(*Query)
		err := chunkQuery.Find(results)
		if err != nil {
			return err
		}

		resultsValue := reflect.ValueOf(results).Elem()
		length := resultsValue.Len()
		if length == 0 {
			return nil
		}

		if remaining != nil {
			*remaining = max(*remaining-length, 0)
		}

		values := make([]any, length)
		for i := 0; i < length; i++ {
			values[i] = resultsValue.Index(i).Interface()
		}

		if err = callback(values); err != nil {
			return err
		}

		if length < chunkLimit {
			return nil
		}

		offset += chunkLimit
	}

	return nil
}
The new chunking methods (Chunk, ChunkByID, ChunkByIDDesc, and OrderedChunkByID) lack test coverage. Given that the repository includes comprehensive test coverage for other query methods (e.g., in database/gorm/query_test.go), these new methods should also have corresponding tests to verify:
- Correct chunking behavior with various count values
- Proper handling of offset and limit constraints
- Error handling for invalid count values (count <= 0)
- Callback error propagation
- Correct ordering behavior for ChunkByID variants
- Edge cases like empty results or results smaller than chunk size
+1, and please add test cases in tests/query_test.go as well.
		return errors.OrmQueryChunkZeroOrLess
	}

	column := "id"
The OrderedChunkByID method hardcodes the primary key column name as "id" (line 265), which assumes all models use this column name. This could cause issues when:
- Models use a different primary key column name (e.g., `user_id`, `uuid`)
- The database column name differs from the struct field name due to GORM's `column:` tag
While database.GetID can retrieve the primary key value using reflection and GORM tags, there's no corresponding mechanism to get the actual database column name for use in the WHERE clauses (lines 305, 307) and ORDER BY clauses (lines 298, 300).
Consider either:
- Adding a parameter to allow users to specify a custom column name (similar to Laravel's implementation)
- Creating a utility function to extract the primary key column name from GORM model metadata
- Documenting this limitation clearly in the method's documentation
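As a rough illustration of the second option, a helper could read the column name from the model's struct tags. This is a simplified sketch using only the standard library: `primaryKeyColumn`, `account`, and `plain` are hypothetical names, the tag parsing covers only the `column:` and `primaryKey` parts, and a primary key without an explicit `column:` tag falls back to `"id"` rather than applying a naming strategy. GORM's own schema metadata would presumably be the more reliable source in real code.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// primaryKeyColumn is a hypothetical helper that returns the column named in a
// `gorm:"column:...;primaryKey"` tag, or "id" when no such tag is found.
func primaryKeyColumn(model any) string {
	t := reflect.TypeOf(model)
	for t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return "id"
	}
	for i := 0; i < t.NumField(); i++ {
		column, primary := "", false
		for _, part := range strings.Split(t.Field(i).Tag.Get("gorm"), ";") {
			key, value, _ := strings.Cut(strings.TrimSpace(part), ":")
			switch strings.ToLower(key) {
			case "primarykey", "primary_key":
				primary = true
			case "column":
				column = value
			}
		}
		if primary && column != "" {
			return column
		}
	}
	// Simplification: no naming strategy is applied here.
	return "id"
}

// account is a hypothetical model with a custom primary key column.
type account struct {
	UserID uint `gorm:"column:user_id;primaryKey"`
	Name   string
}

// plain is a hypothetical model that relies on the default "id" column.
type plain struct {
	ID   uint
	Name string
}

func main() {
	fmt.Println(primaryKeyColumn(&account{})) // user_id
	fmt.Println(primaryKeyColumn(plain{}))    // id
}
```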
hwbrzzl left a comment
Thanks, great PR 👍 Left a few questions.
contracts/database/orm/orm.go
Outdated
	// OrderByRaw specifies the order should be raw.
	OrderByRaw(raw string) Query
	// OrderedChunkByID processes a given number of records in batches, ordered by ID.
	OrderedChunkByID(count int, callback func([]any) error, descending bool) error
It's unnecessary to expose this function.
The functionality is indeed achieved by the other two added methods. I exposed it since it existed in the Laravel interface, but I'm fine with keeping it private. (changed)
Could you add the same functions in db.go?
	ChunkByID(count int, callback func([]any) error) error
	// ChunkByIDDesc processes a given number of records in batches, ordered by ID in descending order.
	ChunkByIDDesc(count int, callback func([]any) error) error
The default column is id.
Suggested change:
- ChunkByID(count int, callback func([]any) error) error
- // ChunkByIDDesc processes a given number of records in batches, ordered by ID in descending order.
- ChunkByIDDesc(count int, callback func([]any) error) error
+ ChunkByID(count int, callback func([]any) error, column ...string) error
+ // ChunkByIDDesc processes a given number of records in batches, ordered by ID in descending order.
+ ChunkByIDDesc(count int, callback func([]any) error, column ...string) error
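One way the optional `column ...string` parameter could resolve to its default is sketched below. `chunkColumn` is a hypothetical helper name, and treating an empty string the same as an omitted argument is an assumption, not something the review specifies.

```go
package main

import "fmt"

// chunkColumn is a hypothetical helper: it returns the first non-empty column
// passed by the caller, or "id" when the argument is omitted.
func chunkColumn(column ...string) string {
	if len(column) > 0 && column[0] != "" {
		return column[0]
	}
	return "id"
}

func main() {
	fmt.Println(chunkColumn())          // id
	fmt.Println(chunkColumn("user_id")) // user_id
}
```

A variadic parameter keeps existing two-argument calls compiling while letting callers name a different key column.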
		}

		lastRecord := values[length-1]
		lastID = database.GetID(lastRecord)
…handling
- Enhanced ChunkByID method documentation to clarify its purpose.
- Removed the OrderedChunkByID method and replaced its usage with an internal orderedChunkByID method.
- Added validation for nil model conditions in Chunk and orderedChunkByID methods.
- Updated mock definitions to reflect the removal of OrderedChunkByID.
📑 Description
Addresses goravel/goravel#714 by adding chunk by methods.
This PR adds four new methods to the ORM:
- Chunk
- ChunkByID
- ChunkByIDDesc: Exists in Laravel and is a low-hanging fruit to add here.
- OrderedChunkByID: Exists in Laravel and is an enabler to the previous two methods.