[refactor] [misc] Refactoring benchmark code for performance monitoring #3269
Conversation
❌ Deploy Preview for jovial-fermat-aa59dc failed.
🔨 Explore the source changes: be2d045
🔍 Inspect the deploy log: https://app.netlify.com/sites/jovial-fermat-aa59dc/deploys/6177d9b574070d0007f19b4e
```python
self.data_size = dsize
self.min_time_in_us = []
self.results_evaluation = results_evaluation
self.data_size = dsize  #list
```
Using type annotations in Python can save you from these comments and (some) manual checks ;)
https://docs.python.org/3/library/typing.html
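To illustrate the suggestion, here is a minimal sketch of how the attributes from the diff could carry type annotations instead of a `#list` comment. The class name and constructor shape are assumptions for illustration; only the attribute names come from the snippet above.

```python
from typing import Callable, List


class MemoryBound:
    # Annotating the attributes documents the expected types directly,
    # making the "#list" comment from the diff unnecessary.
    def __init__(self, dsize: List[int],
                 results_evaluation: Callable) -> None:
        self.data_size: List[int] = dsize
        self.min_time_in_us: List[float] = []
        self.results_evaluation: Callable = results_evaluation
```

A type checker such as mypy can then flag a caller that passes, say, a single int where a list of sizes is expected.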
```python
return pow(product, 1.0 / len(data_array))
```
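The `return pow(product, 1.0 / len(data_array))` line suggests the code computes a geometric mean of the benchmark results. A self-contained sketch of what the surrounding function likely looks like (the function name and the reduction over `data_array` are assumptions):

```python
from functools import reduce
from typing import Sequence


def geometric_mean(data_array: Sequence[float]) -> float:
    """Geometric mean: the n-th root of the product of n values.

    Often preferred over the arithmetic mean for aggregating
    benchmark timings, since it weights relative changes equally.
    """
    product = reduce(lambda x, y: x * y, data_array)
    return pow(product, 1.0 / len(data_array))
```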
```python
def repeat_times(arch, datasize, repeat=1):
```
This function name isn't very informative ;) It's unclear what to expect when I pass a `repeat` to a `repeat_times` function :P
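One way to address the naming comment is a verb-phrase name plus annotations that make the return value explicit. The name and body below are hypothetical; only the original parameter list `(arch, datasize, repeat=1)` comes from the diff.

```python
import time


def run_kernel_repeatedly(arch: str, datasize: int,
                          repeat: int = 1) -> float:
    """Run the benchmarked kernel `repeat` times and return the
    minimum elapsed time in microseconds.

    Hypothetical rename of `repeat_times`: the name now says what
    is repeated and what comes back.
    """
    timings_us = []
    for _ in range(repeat):
        start = time.perf_counter()
        # ... launch the kernel for `arch` on `datasize` elements here ...
        timings_us.append((time.perf_counter() - start) * 1e6)
    return min(timings_us)
```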
```python
self.min_time_in_us.append(
    self.func(self.arch, self.test_dtype, test_dsize,
              MemoryBound.basic_repeat_times))
time.sleep(0.2)
```
Just for my own understanding ;) why do we need to sleep 0.2 s here?
The idea was to give the device a cooling-off period, but the 0.2 s here is quite arbitrary.
Theoretically, we just need to make sure this condition is consistent for each benchmark test.
Perhaps we can remove the sleep(0.2) to avoid performance fluctuations caused by subsequent changes.
ah, ok, good to know. Thanks. The time is probably HW and workload dependent, hard to quantify.
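One way to keep the condition consistent per benchmark, as discussed above, is to make the cooldown an explicit parameter instead of a hard-coded `sleep(0.2)`. This helper is hypothetical, not from the PR:

```python
import time
from typing import Callable, List


def benchmark_sizes(func: Callable[[int], float],
                    sizes: List[int],
                    cooldown_s: float = 0.0) -> List[float]:
    """Run `func` once per data size, optionally pausing between runs.

    Hypothetical helper: exposing the cooldown as a parameter
    (default 0) documents the choice and makes it trivial to drop
    the arbitrary 0.2 s pause entirely.
    """
    results = []
    for size in sizes:
        results.append(func(size))
        if cooldown_s > 0:
            time.sleep(cooldown_s)
    return results
```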
qiao-bo left a comment:
Thanks, looks good. I wonder how memory-bound cases are defined for ti.cpu? Are operations like fill and saxpy still memory bound as they are on ti.cuda?
Good idea! We should be careful when dividing the benchmark cases into suites. After all, devices differ in computational intensity.
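Dividing cases into suites could be as simple as a registry keyed by suite name. The suite and case names below are illustrative only, not taken from the Taichi repository:

```python
from typing import Dict, List

# Hypothetical registry: cases grouped by arithmetic intensity,
# so each backend can be evaluated against the appropriate suite.
benchmark_suites: Dict[str, List[str]] = {
    "memory_bound": ["fill", "saxpy", "reduction"],
    "compute_bound": ["mat_mul", "stencil"],
}


def cases_for(suite: str) -> List[str]:
    """Return the benchmark cases registered under a suite."""
    return benchmark_suites.get(suite, [])
```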
Related issue = #3220