tp: add IndexedFilterIn bytecode for In on indexed columns #5158
Open
LalitMaganti wants to merge 17 commits into main
Add SetFilterValueListUnchecked to TypedCursor, allowing callers to pass a pointer+size array of FilterValue for In filters without allocation. Plumb this through the codegen'd ConstCursor/Cursor.

Optimize the In bytecode by pre-building lookup structures during CastFilterValueList instead of rebuilding on every Execute():
- For dense Id/Uint32: BitVector (built once, not per-call)
- For large sparse integer/string lists: FlatHashMapV2 for O(1) lookup
- For small lists (<=16): linear scan (cache-friendly)

The lookup is stored as a variant in CastFilterValueListResult, replacing the separate value_list field.

Migrate experimental_slice_layout to use In filter on track_id.
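For readers skimming the commit, the strategy selection described above can be sketched roughly as follows. This is a minimal illustration and not the Perfetto code: `std::vector<bool>` stands in for BitVector, `std::unordered_set` stands in for FlatHashMapV2, and the names `InLookup`, `BuildLookup`, `Contains` and the density heuristic `kDenseFactor` are all hypothetical. Only the small-list threshold of 16 comes from the description.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the three lookup strategies.
struct SmallList { std::vector<uint32_t> values; };      // linear scan
struct Bitmap { std::vector<bool> bits; };               // stand-in for BitVector
struct HashSet { std::unordered_set<uint32_t> values; }; // stand-in for FlatHashMapV2
using InLookup = std::variant<SmallList, Bitmap, HashSet>;

constexpr size_t kSmallListMax = 16;  // threshold taken from the description
constexpr uint64_t kDenseFactor = 4;  // assumption: what counts as "dense"

// Built once when the filter values are cast, not on every execution.
InLookup BuildLookup(const std::vector<uint32_t>& values) {
  if (values.size() <= kSmallListMax) {
    return SmallList{values};
  }
  uint32_t max_value = *std::max_element(values.begin(), values.end());
  if (static_cast<uint64_t>(max_value) < values.size() * kDenseFactor) {
    Bitmap bm;
    bm.bits.assign(static_cast<size_t>(max_value) + 1, false);
    for (uint32_t v : values) {
      bm.bits[v] = true;
    }
    return bm;
  }
  HashSet hs;
  hs.values.insert(values.begin(), values.end());
  return hs;
}

// Membership test dispatching on whichever structure was pre-built.
bool Contains(const InLookup& lookup, uint32_t v) {
  if (const auto* s = std::get_if<SmallList>(&lookup)) {
    return std::find(s->values.begin(), s->values.end(), v) != s->values.end();
  }
  if (const auto* b = std::get_if<Bitmap>(&lookup)) {
    return v < b->bits.size() && b->bits[v];
  }
  return std::get<HashSet>(lookup).values.count(v) != 0;
}
```

The point of the variant is that the choice is made once per filter, so the per-row check only pays for the dispatch and the lookup itself.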
When a column has an index and the query uses an In filter, the planner now emits IndexedFilterIn instead of the generic In bytecode. For each value in the list, IndexedFilterIn binary-searches the index permutation vector (O(log N) per value) and concatenates the matching ranges. This reduces In filter cost from O(N) to O(k log N + matches) where k is the number of values and N is the table size.
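The per-value binary search can be illustrated with a small standalone sketch. It assumes the index is a permutation of row numbers ordered by column value. The function name `IndexedFilterInSketch` and the plain `std::vector` types are hypothetical and do not reflect the actual bytecode interpreter.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// `permutation` holds row numbers ordered by `column[row]` (the index).
// For each In value, find the run of equal entries with two binary searches
// and append the matching rows to the output.
std::vector<uint32_t> IndexedFilterInSketch(
    const std::vector<int64_t>& column,
    const std::vector<uint32_t>& permutation,
    const std::vector<int64_t>& in_values) {
  std::vector<uint32_t> out;
  for (int64_t v : in_values) {
    // First position whose column value is >= v.
    auto lo = std::lower_bound(
        permutation.begin(), permutation.end(), v,
        [&](uint32_t row, int64_t value) { return column[row] < value; });
    // First position after `lo` whose column value is > v.
    auto hi = std::upper_bound(
        lo, permutation.end(), v,
        [&](int64_t value, uint32_t row) { return value < column[row]; });
    out.insert(out.end(), lo, hi);  // O(matches) for this value
  }
  return out;
}
```

In this sketch the output follows the order of the In values rather than row order, and a value repeated in the list would yield repeated rows. How the real opcode handles ordering and duplicates is not stated in the description.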
PrefixPopcount was emitted after IndexedFilterEq/In because alloc_popcount() was called inside the AddOpcode block. For SparseNull columns, this meant the popcount register was uninitialized when the filter executed, causing a SIGSEGV on LEFT JOINs over indexed columns. Move alloc_popcount() before AddOpcode so PrefixPopcount is emitted first.
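A toy model may help show why the call order matters. It assumes only that allocating the popcount register emits a PrefixPopcount opcode on first use. `ToyBuilder` and its methods are hypothetical and much simpler than the real query planner.

```cpp
#include <string>
#include <vector>

// Toy bytecode builder: opcodes run in the order they are appended.
struct ToyBuilder {
  std::vector<std::string> ops;
  bool popcount_allocated = false;

  // Assumption: the first allocation emits the opcode that fills the register.
  void AllocPopcount() {
    if (!popcount_allocated) {
      ops.push_back("PrefixPopcount");
      popcount_allocated = true;
    }
  }

  void AddOpcode(const std::string& name) { ops.push_back(name); }
};

// Before the fix: the filter opcode is appended first, so it would run
// before the popcount register is populated.
std::vector<std::string> EmitBuggy() {
  ToyBuilder b;
  b.AddOpcode("IndexedFilterIn");
  b.AllocPopcount();
  return b.ops;
}

// After the fix: allocate (and emit PrefixPopcount) before the filter.
std::vector<std::string> EmitFixed() {
  ToyBuilder b;
  b.AllocPopcount();
  b.AddOpcode("IndexedFilterIn");
  return b.ops;
}
```

In the fixed ordering, PrefixPopcount always precedes the filter that reads its register.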
Summary
- IndexedFilterIn bytecode that uses binary search on index permutation vectors for In filters
- When a column has an index and the query uses In, the planner now emits IndexedFilterIn instead of the generic In bytecode
- Reduces In filter cost from O(N) to O(k log N + matches) where k is the number of values
Test plan
- IndexedFilterIn_Uint32_NonNull_MultipleValues, _NoMatch, _SingleValue, _String_SparseNull_MultipleValues
- PlanQuery_SingleColIndex_InFilter_NonNullInt
- TypedCursorInFilterWithIndex