cardano-testnet | Refactor EpochStateView to use TVar instead of IORef#6538
Conversation
1d2609d to
0b7d0eb
Compare
palas
left a comment
There was a problem hiding this comment.
Nice, we don't have to worry about the polling interval anymore 👍
| state <- readTVar epochStateView | ||
| case state of | ||
| Right _ -> pure state | ||
| Left (EpochStateFoldError _) -> pure state | ||
| Left EpochStateNotInitialised -> do | ||
| timedOut <- readTVar timedOutVar | ||
| guard timedOut | ||
| pure state |
There was a problem hiding this comment.
| state <- readTVar epochStateView | |
| case state of | |
| Right _ -> pure state | |
| Left (EpochStateFoldError _) -> pure state | |
| Left EpochStateNotInitialised -> do | |
| timedOut <- readTVar timedOutVar | |
| guard timedOut | |
| pure state | |
| state <- readTVar epochStateView | |
| timedOut <- readTVar timedOutVar | |
| guard (state /= Left LeftEpochStateNotInitialised || timedOut) | |
| pure state |
I am just showing off
There was a problem hiding this comment.
nice! We'd need some new Eq instances for that, but we'd have to write them manually for some ledger types.
0b7d0eb to
f872c06
Compare
There was a problem hiding this comment.
Pull request overview
Refactors cardano-testnet’s EpochStateView to use STM (TVar) so testnet wait/retry helpers can block on state updates instead of polling, improving responsiveness and reducing unnecessary wakeups.
Changes:
- Replaced
EpochStateView’sIORefstorage withTVarand introduced a monotonic version counter for update notification. - Updated retry logic to await epoch-state updates (STM) rather than
threadDelaypolling. - Added changelog fragments documenting the refactor; normalized JSON fixture formatting (newline at EOF).
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
cardano-testnet/src/Testnet/Components/Query.hs |
Switches epoch state storage to STM and adds STM-based update waiting used by retry loops. |
cardano-testnet/test/cardano-testnet-test/files/calculatePlutusScriptCost.json |
Formatting-only change (newline/EOF normalization). |
cardano-testnet/changelog.d/20260416_160000_mgalazyn_stm_epoch_state_view.md |
Documents the STM refactor and change-notification approach. |
cardano-testnet/changelog.d/20260416_124922_mgalazyn_refactor_retry_functions.md |
Updates changelog wording to match the refactor history. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| awaitStateUpdateTimeout timeout EpochStateView{epochStateVersion, epochStateView} = liftIO $ do | ||
| version <- readTVarIO epochStateVersion | ||
| timedOutVar <- registerDelay . round $ timeout * 1_000_000 | ||
| atomically $ do | ||
| v <- readTVar epochStateVersion | ||
| timedOut <- readTVar timedOutVar | ||
| case (v /= version, timedOut) of | ||
| (True, _) -> Just <$> readTVar epochStateView | ||
| (_, True) -> pure Nothing | ||
| _ -> STM.retry |
There was a problem hiding this comment.
Thanks. Flagging this as investigated but not acted on, with reasoning:
Mechanism: registerDelay doesn't spawn a thread per call.
It registers an entry in GHC's IO-manager timer queue (one long-lived manager per capability), so the accumulating cost is a priority-queue entry plus a TVar Bool per pending timeout, not a forked thread.
Scale: in steady-state retry loops (awaitStateUpdate fires roughly once per epoch; fallback is 300s), we'd have tens of pending entries at any time, low KB of memory.
Negligible at testnet-test scale.
Load-bearing: the 300s fallback here isn't redundant with retryUntilRightM's outer deadline.
That outer deadline is computed from getCurrentEpochNo/getSlotNumber/getBlockNumber, which read the same EpochStateView TVar that the fallback guards.
If the background fold thread hangs, the outer deadline stops advancing and the retry loop would block forever without this inner timeout.
Cancellable alternative: wrapping atomically in System.Timeout.timeout would cancel the timer on early return, but adds forkIO/killThread per call.
Trading one small overhead for another without a clear win.
Addressed the fast-path timer in getEpochStateDetails (line 323 comment) since that one accumulates without bound.
f872c06 to
e839e00
Compare
6bd49c2 to
82f0f20
Compare
82f0f20 to
75aece7
Compare
d9ddb0e to
2a7327a
Compare
There was a problem hiding this comment.
Pull request overview
Refactors EpochStateView in cardano-testnet to use STM (TVar) and introduces an STM-based change-notification mechanism so retry loops can block on state updates rather than polling.
Changes:
- Replaced
IORef-backedEpochStateViewwithTVarplus a monotonic version counter, and updated the background fold thread to write via STM. - Updated retry logic (
retryUntilRightM) and epoch-state getters to block via STM (awaitStateUpdate) instead ofthreadDelaypolling. - Added changelog entries and normalized a JSON test fixture (likely newline/formatting).
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cardano-testnet/src/Testnet/Components/Query.hs | Core refactor: EpochStateView now uses TVar + version counter; retry loops block on STM updates. |
| cardano-testnet/changelog.d/20260416_160000_mgalazyn_stm_epoch_state_view.md | Documents the STM-based EpochStateView refactor and version counter rationale. |
| cardano-testnet/changelog.d/20260416_124922_mgalazyn_refactor_retry_functions.md | Adjusts changelog wording to reflect the new EpochStateView shape. |
| cardano-testnet/test/cardano-testnet-test/files/calculatePlutusScriptCost.json | Formatting-only tweak (appears to fix final newline/line accounting). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cdf2031 to
35413fc
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors cardano-testnet’s EpochStateView to use STM (TVar) instead of IORef, enabling retry loops to block on state changes rather than polling, and updates documentation/changelog artifacts accordingly.
Changes:
- Replaced
IORef-backedEpochStateViewwithTVarplus a monotonic version counter for STM-based change notification. - Updated
retryUntilRightMto wait via STM (awaitStateUpdateTimeout) instead ofthreadDelaypolling. - Updated changelog entries and a JSON test fixture formatting/newline.
Reviewed changes
Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cardano-testnet/src/Testnet/Components/Query.hs | Refactors EpochStateView to TVar, adds versioning + STM wait helpers, and updates retry behavior to block on state updates. |
| cardano-testnet/test/cardano-testnet-test/files/calculatePlutusScriptCost.json | Minor formatting/newline adjustment in a test fixture. |
| cardano-testnet/changelog.d/20260416_160000_mgalazyn_stm_epoch_state_view.md | Documents the STM-based EpochStateView refactor and new notification mechanism. |
| cardano-testnet/changelog.d/20260416_124922_mgalazyn_refactor_retry_functions.md | Updates a prior changelog entry to reflect the finalized EpochStateView field removal wording. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
35413fc to
7db627e
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors cardano-testnet’s EpochStateView to use STM primitives (TVar) rather than IORef, enabling retry/wait loops to block on state changes instead of polling with threadDelay.
Changes:
- Replaced
EpochStateView’s underlying mutable storage fromIOReftoTVar, and added a monotonicepochStateVersionTVar. - Updated
retryUntilRightMto await STM-based state updates (with a fallback timeout) instead of polling. - Added/updated changelog entries documenting the maintenance refactor.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cardano-testnet/src/Testnet/Components/Query.hs | Migrates epoch-state tracking to STM (TVar) with a version counter and STM-based wakeups in retry loops. |
| cardano-testnet/changelog.d/20260416_160000_mgalazyn_stm_epoch_state_view.md | Documents the STM refactor and the new change-notification mechanism. |
| cardano-testnet/changelog.d/20260416_124922_mgalazyn_refactor_retry_functions.md | Updates prior changelog entry to reflect removal of unused EpochStateView fields. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
010681d to
4499504
Compare
75aece7 to
47aa730
Compare
Jimbo4350
left a comment
There was a problem hiding this comment.
LGTM, just two comments to address. I am opening a PR to document this a bit more as the complexity increases with using STM.
| -> AnyNewEpochState | ||
| -> Maybe EpochNo | ||
| maybeExtractGovernanceActionExpiry txid (AnyNewEpochState sbe newEpochState _) = | ||
| caseShelleyToBabbageOrConwayEraOnwards |
There was a problem hiding this comment.
Can we remove the use of caseShelleyToBabbageOrConwayEraOnwards?
There was a problem hiding this comment.
will make a follow-up PR
47aa730 to
344c313
Compare
7bcedd3 to
5cbd71a
Compare
5cbd71a to
a205cac
Compare
Checklist
See Running tests for more details
CHANGELOG.mdfor affected package.cabalfiles are updatedhlint. See.github/workflows/check-hlint.ymlto get thehlintversionstylish-haskell. See.github/workflows/stylish-haskell.ymlto get thestylish-haskellversionghc-9.6andghc-9.12Note on CI
If your PR is from a fork, the necessary CI jobs won't trigger automatically for security reasons.
You will need to get someone with write privileges. Please contact IOG node developers to do this
for you.