Improve memory access pattern and histogram clipping for CLAHE #81
Merged
Contributor
Author
To be on the safe side, I added a test (as a BeanShell script). @axtimwalde does this PR look good to you? I'm also happy to port all existing tests to JUnit (which adds an additional dependency).
Owner
Looks great. JUnit tests would be wonderful. Thanks!
Contributor
Author
Done. I also ported the ring buffer test, since it was the only one that didn't rely heavily on manual exploration and a GUI to test things.
Owner
Thanks!
While working on #80, I saw some potential for performance improvements in the "original" CLAHE method and made the following changes:
The changes introduce slight differences in the actual output values, but I verified that they stay within 1 intensity unit in a uint8 image. The following table shows the run times before and after the changes (best of 3 runs). The numbers are `width x height, blockRadius`.
In principle, the improved histogram clipping also affects the "fast" option. Since the histogram is not computed that often in this case, I saw only improvements of about 20% for small values of `blockRadius` and large images (i.e., many histograms to compute), but no significant speedup for the other cases.
That being said, the "fast" option is still orders of magnitude faster and should definitely be the go-to method unless it's verifiably insufficient for the use case.
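For reference, the "within 1 intensity unit" check mentioned above could look roughly like the following. This is only a minimal sketch, not the test that was added in this PR: the class name `OutputTolerance` and the method `maxAbsDiff` are hypothetical, and it assumes both outputs are available as flat `byte[]` arrays of uint8 pixels.

```java
public class OutputTolerance {

    /**
     * Largest absolute per-pixel difference between two uint8 images
     * stored as byte arrays (hypothetical helper, for illustration only).
     */
    static int maxAbsDiff(final byte[] a, final byte[] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("images differ in size");
        int max = 0;
        for (int i = 0; i < a.length; ++i) {
            // Java bytes are signed; mask to recover the 0..255 intensity.
            final int d = Math.abs((a[i] & 0xff) - (b[i] & 0xff));
            if (d > max)
                max = d;
        }
        return max;
    }

    public static void main(final String[] args) {
        // Made-up pixel values standing in for the old and new outputs.
        final byte[] before = {0, 127, (byte) 200, (byte) 255};
        final byte[] after = {1, 127, (byte) 199, (byte) 254};
        System.out.println("within tolerance: " + (maxAbsDiff(before, after) <= 1));
    }
}
```

Masking with `0xff` matters here: without it, intensities above 127 would be read as negative values and the difference would be wrong.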
Let me know what you think @axtimwalde!