Conversation
Force-pushed from ca30d2e to 17f6893.
Force-pushed from b290697 to ebd1b41.
…ter all puts finish
storeManifest bypassed putBlockInternal and wrote directly to repoDs, missing the onBlockStored callback. The block exchange engine never learned about the manifest block, causing remote downloads to time out.
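The fix described above can be sketched as follows. This is an illustrative Python model, not the actual archivist code: `put_block_internal`, `on_block_stored`, and `store_manifest` are stand-ins for the real Nim procedures. The point is only that every write path must go through the shared put so the block exchange engine hears about new blocks.

```python
class BlockStore:
    """Toy model of a block store that notifies the exchange engine on writes."""

    def __init__(self, on_block_stored):
        self.blocks = {}
        self.on_block_stored = on_block_stored  # hook the exchange engine listens on

    def put_block_internal(self, cid, data):
        self.blocks[cid] = data
        # Without this callback, remote peers never learn the block exists
        # and their downloads time out.
        self.on_block_stored(cid)

    def store_manifest(self, cid, data):
        # Bug: writing straight to the datastore skipped the callback.
        # Fix: route the manifest through the shared put path instead.
        self.put_block_internal(cid, data)


announced = []
store = BlockStore(on_block_stored=announced.append)
store.store_manifest("manifest-cid", b"...")
assert announced == ["manifest-cid"]
```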
…lot data, to avoid dropping slots while proving is in progress
Force-pushed from 86b49fa to 8316dd0.
markspanbroek
left a comment
Are all your changes going to be this big from now on? 😅
```nim
# Add buffer past contract end so overlay data survives through the
# last proof window. Buffer covers period (last proof window) +
# proofTimeout (challenge window). Without this, maintenance can drop
# the overlay while the proving loop is still running its last proof.
```
I don't think that this is necessary. When the request has ended, submitting the proof will not work anyway. So it doesn't matter that generating the proof fails.
Yeah, as per our discussion, this is mostly harmless on the marketplace side, but it will trigger re-downloads of data that is potentially already gone, so it will either download something that is no longer needed or get stuck in retries.
```diff
 let periodAtFinish = periodicity.periodOf(StorageTimestamp.init(clock.now()))
 if periodAtFinish != provingPeriod:
-  warn "Failed to generate initial proof in time",
+  warn "Initial proof generated after period rollover",
```
I would like to keep the word "failed" in there. If the proof is not generated in time, it will submit it in the next period, and it will therefore be rejected by the smart contract, because the challenge has changed.
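The rejection the comment above describes can be illustrated with a small Python sketch. The 60-second period length and the `period_of` helper are assumptions for illustration, not the real `periodicity.periodOf` implementation; the point is that the on-chain challenge is derived from the period, so a proof generated in period N but submitted in period N+1 answers a stale challenge.

```python
PERIOD_SECONDS = 60  # assumed period length, for illustration only


def period_of(timestamp):
    """Map a unix timestamp to its proving period (illustrative)."""
    return timestamp // PERIOD_SECONDS


proving_period = period_of(1000)         # period when proving started
period_at_finish = period_of(1000 + 70)  # proof finished after a rollover

if period_at_finish != proving_period:
    # The smart contract's challenge changed with the period, so this
    # proof would be rejected on submission in the next period.
    print("Failed to generate initial proof in time")
```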
```nim
trace "Deleting leaf and block metadata"
```

```nim
await self.deletingLock.drain(treeCid)
```
It looks as though you're trying to ensure that no-one is writing to the overlay by draining the deletingLock barrier. As far as I can tell, this is susceptible to a race condition.
In putLeafBlockMetaImpl(), you're first checking whether the overlay status equals Deleting before entering the barrier.
But here you first drain the barrier, then call several async operations, and only towards the end do you set the overlay status to Deleting. In the meantime, putLeafBlockMetaImpl() could have entered the barrier again.
What the "barrier/inverse-semaphore" is doing here, is coordinating concurrent puts and deletes, the deleting flag is just an early exit marker for subsequent callers. But you're right that it might be safer to set it early.
```nim
import pkg/archivist/marketplace/purchasing
import pkg/archivist/marketplace/purchasing/purchase
import pkg/archivist/marketplace/sales
import pkg/archivist/marketplace/availability/store
import pkg/archivist/marketplace/storageinterface
import pkg/archivist/marketplace/node {.all.}
```
This depends on a lot of internal details of the marketplace, which makes these tests rather fragile and complex. The marketplace interface should be relatively stable, and it seems that you only need a call to marketplace.purchasing.purchase() to succeed, so I think it would make sense to introduce a simple mock for MarketplaceNode.
Not really, this is the only place where we can test every component in unison; introducing a mock defeats the purpose of this test, which has in practice proven extremely useful at identifying issues early.
```nim
proc start*(builder: NodeBuilder): Future[Node] {.async.} =
  var arguments: seq[string]
  arguments.add("--disc-port=" & $(await builder.discoveryPortResolved))
  arguments.add("--nat=none") # don't need nat for integration tests
```
Best not make this change here, I handled this in #110.
This shaves off around 10 minutes from the integration tests; otherwise it tries to initialize NAT on every startup.
Force-pushed from bbecc05 to 1c000f6.
Co-authored-by: markspanbroek <mark@spanbroek.net>
This PR fixes many issues related to quota handling and dataset cleanup, in addition to bumping the block size to 512 KiB.