Hi,
I am opening this issue to start a discussion about the settings and computational limits of the methods. I have included some of the comments from the email thread; feel free to edit!
Evaluation budget
There will be a fixed evaluation budget that each method can expend in its own way. Suggested budget: 500,000 evaluations.
Things to consider:
- implementations take different amounts of time (depending on tree size as well)
- some methods use local search
- some methods use minibatch sampling
- some methods aren't EC-based
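It seems reasonable to count local search iterations toward the budget and to adjust the count for minibatch sampling, since an evaluation on a minibatch costs less than one on the full training set.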
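To make the accounting concrete, here is a minimal sketch of one possible way to charge evaluations against the budget. The class name `EvaluationBudget`, its parameters, and the accounting rules (a minibatch costs its fraction of the data, each local search iteration costs one extra evaluation) are all assumptions for the sake of discussion, not something any method currently implements.

```python
from dataclasses import dataclass


@dataclass
class EvaluationBudget:
    """Hypothetical budget tracker; names and accounting rules are assumptions."""

    limit: float = 500_000  # suggested budget from this issue
    used: float = 0.0

    def charge(self, n_individuals: int, batch_fraction: float = 1.0,
               local_search_iters: int = 0) -> None:
        # One full evaluation = one individual scored on the whole training set.
        # Assumption: a minibatch covering a fraction of the data costs that fraction.
        # Assumption: each local search iteration re-evaluates the individual once.
        per_individual = batch_fraction * (1 + local_search_iters)
        self.used += n_individuals * per_individual

    @property
    def exhausted(self) -> bool:
        return self.used >= self.limit


budget = EvaluationBudget()
budget.charge(n_individuals=1000)                         # plain generation
budget.charge(n_individuals=1000, batch_fraction=0.1)     # minibatch of 10% of the data
budget.charge(n_individuals=100, local_search_iters=5)    # local search on 100 individuals
print(budget.used, budget.exhausted)
```

Under these rules the three calls cost 1000, 100, and 600 evaluations respectively. Methods that aren't EC-based would still need their own definition of what one evaluation is.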
It might also be interesting to monitor the load average on the cluster, if it can be isolated on a per-method basis, and perhaps combine it with other measurements such as memory usage. This would give a more general measure of each method's computational requirements.
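Since the load average is system-wide, a per-process measurement may be easier to isolate. Below is a rough sketch using Python's standard `resource` module (Unix only); the helper name `measure` and the choice of metrics are assumptions, and it only covers the current process, not child processes or jobs spread across nodes.

```python
import resource
import time


def measure(fn, *args, **kwargs):
    """Run fn and report wall time, CPU time, and peak memory of this process."""
    before = resource.getrusage(resource.RUSAGE_SELF)
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    wall = time.perf_counter() - t0
    after = resource.getrusage(resource.RUSAGE_SELF)
    # CPU time = user + system time consumed while fn was running.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    # ru_maxrss is the peak resident set size of the whole process so far
    # (kilobytes on Linux, bytes on macOS), not just of this call.
    stats = {"wall_s": wall, "cpu_s": cpu, "peak_rss": after.ru_maxrss}
    return result, stats


# Stand-in workload; a real run would wrap a method's fit call instead.
result, stats = measure(sum, range(1000))
print(result, stats)
```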
Hyper-parameters
- Six combinations in total, for consistency with the first benchmarks paper and to keep computational costs reasonable.
Model complexity
Not much to comment on here, just a minor nitpick: I noticed some methods also use the "AQ" (analytical quotient) symbol, which can be decomposed into basic math operations: aq(a,b) = a / sqrt(1 + b^2). What, then, is the complexity of the AQ symbol?
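To illustrate why the answer matters, here is a small sketch that compares the two options, assuming complexity is measured as the number of nodes in the expression tree (an assumption; the actual measure is still to be decided). The tuple-based tree encoding and the function names `size` and `expand_aq` are made up for this example.

```python
def size(expr):
    """Count nodes in an expression tree encoded as nested tuples (op, arg1, ...)."""
    if not isinstance(expr, tuple):
        return 1  # leaf: a variable or a constant
    return 1 + sum(size(arg) for arg in expr[1:])


def expand_aq(expr):
    """Rewrite every aq(a, b) node as a / sqrt(1 + b*b) using basic operations."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [expand_aq(arg) for arg in args]
    if op == "aq":
        a, b = args
        return ("div", a, ("sqrt", ("add", 1, ("mul", b, b))))
    return (op, *args)


expr = ("aq", "x1", "x2")
atomic = size(expr)               # AQ counted as a single symbol
expanded = size(expand_aq(expr))  # AQ counted via its decomposition
print(atomic, expanded)
```

With AQ treated as one symbol, the expression has 3 nodes; once decomposed, it has 8. Methods that use AQ would therefore look simpler than equivalent models built from basic operations unless the symbol is expanded or weighted.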