This repository was archived by the owner on Oct 30, 2023. It is now read-only.

Fix bug in memory estimation #49

Closed
dlogothetis wants to merge 2 commits into apache:trunk from dlogothetis:fix_mem_est

Conversation

@dlogothetis
Contributor

Method MemoryEstimatorOracle.calculateRegression() exits if the number of valid columns to use for the regression differs from the total number of columns. This is wrong: the regression can still run on only the valid columns. As a result, memory estimation is never used in practice, and OOC (out-of-core) starts spilling only when memory usage gets very high.
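To illustrate the idea (this is not the actual Giraph code), here is a minimal, self-contained sketch of a regression that runs on the valid columns only. The class name, the method signatures, and the definition of a "valid" column (at least one non-zero value) are assumptions made for the example. It also uses a plain normal-equations solve without an intercept, which may differ from what the real implementation does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names and logic are illustrative, not Giraph's. */
class ValidColumnRegressionSketch {

  /** Assumption: a column is "valid" if it has at least one non-zero value. */
  static List<Integer> validColumns(double[][] x) {
    List<Integer> valid = new ArrayList<>();
    for (int c = 0; c < x[0].length; c++) {
      for (double[] row : x) {
        if (row[c] != 0.0) {
          valid.add(c);
          break;
        }
      }
    }
    return valid;
  }

  /**
   * Least-squares fit (no intercept) restricted to the valid columns.
   * Coefficients of invalid columns are set to 0. Returns false only when
   * no column is usable or the reduced system is singular.
   */
  static boolean calculateRegression(double[][] x, double[] y,
                                     double[] coefficients) {
    List<Integer> valid = validColumns(x);
    // The behavior described as buggy would bail out here whenever
    // valid.size() != x[0].length. Instead, give up only if nothing is valid.
    if (valid.isEmpty()) {
      return false;
    }
    int k = valid.size();
    // Augmented normal equations [X^T X | X^T y] over the valid columns.
    double[][] a = new double[k][k + 1];
    for (int r = 0; r < x.length; r++) {
      for (int i = 0; i < k; i++) {
        double xi = x[r][valid.get(i)];
        for (int j = 0; j < k; j++) {
          a[i][j] += xi * x[r][valid.get(j)];
        }
        a[i][k] += xi * y[r];
      }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int p = 0; p < k; p++) {
      int best = p;
      for (int r = p + 1; r < k; r++) {
        if (Math.abs(a[r][p]) > Math.abs(a[best][p])) {
          best = r;
        }
      }
      double[] tmp = a[p];
      a[p] = a[best];
      a[best] = tmp;
      if (Math.abs(a[p][p]) < 1e-12) {
        return false; // valid columns are linearly dependent
      }
      for (int r = 0; r < k; r++) {
        if (r == p) {
          continue;
        }
        double factor = a[r][p] / a[p][p];
        for (int c = p; c <= k; c++) {
          a[r][c] -= factor * a[p][c];
        }
      }
    }
    Arrays.fill(coefficients, 0.0);
    for (int i = 0; i < k; i++) {
      coefficients[valid.get(i)] = a[i][k] / a[i][i];
    }
    return true;
  }
}
```

With a data matrix whose middle column is all zeros, this sketch still produces coefficients for the other two columns instead of giving up, which is the behavior the fix aims for.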

This is fixed in #34 too, but I want to make these changes one-by-one so that we can test in isolation.

Tests:

  • mvn clean install
  • Snapshot tests, including a snapshot test that uses OOC.
  • Ran 3 production jobs and verified that this reduces data spills and that the jobs finish faster. The max % spilled is reduced by more than 40%.

JIRA: https://issues.apache.org/jira/browse/GIRAPH-1160


3 participants