Rework set_suffix_array_mem method of SufrFile to avoid redundant re-loading #9
Merged
georgeglidden merged 5 commits into TravisWheelerLab:main on Apr 3, 2025
…it not triggering?
…max_query_len to avoid None values causing re-loading on every call to set_suffix_array_mem
Optimize lowmem false
Reworked the `set_suffix_array_mem` method of SufrFile (https://github.com/TravisWheelerLab/sufr/blob/main/libsufr/src/sufr_file.rs#L514C8-L514C28) to avoid redundant loading of the suffix array into memory:

- When `max_query_len` is `None`, it defaults to `self.text_len.to_usize()`, which will be equal to `built_max_query_len` if the suffix array was also built with `max_query_len` set to `None`. This means that every time `set_suffix_array_mem` is called, the condition on line 539 is triggered, and the entire suffix array is read from disk. If multiple queries are made with the Options `max_query_len: None`, the suffix array will be read every single time.
- The function only skips the reload when `self.suffix_array_mem_mql` matches the one from the Options. However, `self.suffix_array_mem_mql` is `None` by default and only gets set when the suffix array gets read from a non-cached file.
- After a call to `set_suffix_array_mem`, the suffix array in memory will have a max query len matching the Options max query len, so the change I made sets `self.suffix_array_mem_mql = Some(max_query_len);` whenever the function does not return early.
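The caching behaviour described above can be illustrated with a minimal sketch. This is not the actual libsufr code: `SufrFileSketch`, `load_from_disk`, and the `disk_reads` counter are hypothetical stand-ins, and the real method handles more cases (for example, cached files). The sketch only shows why recording `suffix_array_mem_mql` on every non-early-return path stops repeated loads when `max_query_len` is `None`.

```rust
/// Hypothetical, simplified stand-in for `SufrFile`.
/// Only the caching logic discussed in this PR is modeled.
struct SufrFileSketch {
    text_len: usize,
    /// In-memory suffix array (simulated).
    suffix_array_mem: Vec<usize>,
    /// Max query length the in-memory suffix array corresponds to.
    suffix_array_mem_mql: Option<usize>,
    /// Hypothetical counter of simulated disk reads, for illustration only.
    disk_reads: usize,
}

impl SufrFileSketch {
    fn new(text_len: usize) -> Self {
        SufrFileSketch {
            text_len,
            suffix_array_mem: Vec::new(),
            suffix_array_mem_mql: None,
            disk_reads: 0,
        }
    }

    /// Simulated expensive load of the suffix array (assumption: the real
    /// method reads from the `.sufr` file instead).
    fn load_from_disk(&mut self) {
        self.disk_reads += 1;
        self.suffix_array_mem = (0..self.text_len).collect();
    }

    fn set_suffix_array_mem(&mut self, max_query_len: Option<usize>) {
        // `None` falls back to the full text length, as in the PR description.
        let max_query_len = max_query_len.unwrap_or(self.text_len);

        // Early return: what is in memory already matches the request.
        if self.suffix_array_mem_mql == Some(max_query_len) {
            return;
        }

        self.load_from_disk();

        // The fix: always record the max query length after a load, so the
        // next call with the same options hits the early return.
        self.suffix_array_mem_mql = Some(max_query_len);
    }
}
```

With this structure, two consecutive calls with `None` trigger a single load, and a call with a different `max_query_len` triggers exactly one more.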