Skip to content

Refactor Run modules to use modern record extensions#830

Open
mwihoti wants to merge 6 commits intoIntersectMBO:mainfrom
mwihoti:feature/811-modern-record-extensions
Open

Refactor Run modules to use modern record extensions#830
mwihoti wants to merge 6 commits intoIntersectMBO:mainfrom
mwihoti:feature/811-modern-record-extensions

Conversation

@mwihoti
Copy link
Copy Markdown
Contributor

@mwihoti mwihoti commented Mar 25, 2026

Description

Following the successful pilot with Internal.Arena, this PR refactors the Run module and several related components to use the modern record language extensions.

changes

  • Record Modernization: Refactored the Run record fields to use shorter, prefix-free names (e.g., runIndex - index, runFilter - bloomFilter), which significantly improves the developer experience with dot notation.
  • Test Suite Cleanup: Extensively updated the StateMachine and Lookup test suites to use dot notation instead of traditional field accessors.

cabal build lsm-tree:lib:core -- Builds successfully
cabal test lsm-tree-test --pattern "Run" -- 129/129 tests pass
cabal test all -- Verified for compilation and linking

link
to the issue.

Checklist

  • Read our contribution guidelines at CONTRIBUTING.md, and make sure that this PR complies with the guidelines.

@mwihoti
Copy link
Copy Markdown
Contributor Author

mwihoti commented Mar 25, 2026

Hi @jorisdral , I just pushed an update for Run module with the new record language extensions, I'm moving next to Index

Copy link
Copy Markdown
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the contribution, I like the new field names for the Run and RunReader module 😄.

Two high-level comments:

  • In modules where we refactor field names (Run/RunReader), we should enable DuplicateRecordFields, NoFieldSelectors, and OverloadedRecordDot. In other modules where we only want to use the .field syntax, it should be fine to only enable OverloadedRecordDot, so let's change that so that we only use the .field syntax on values of type Run and RunReader. That probably makes the diff of this PR also smaller
  • There are some residual parentheses here and there. In particular, we don't need parentheses around syntax like value.field

Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Run.hs Outdated
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Run.hs Outdated
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Run.hs Outdated
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Run.hs
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Run.hs Outdated
Comment thread lsm-tree/test/Test/Database/LSMTree/StateMachine.hs Outdated
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/MergeSchedule.hs Outdated
Comment thread lsm-tree/src-core/Database/LSMTree/Internal/Config/Override.hs Outdated
Comment thread lsm-tree/src/Database/LSMTree/Simple.hs Outdated
Comment thread lsm-tree/test/Database/LSMTree/Class.hs Outdated
@mwihoti
Copy link
Copy Markdown
Contributor Author

mwihoti commented Mar 26, 2026

Thanks, I will work on the changes

mwihoti added 2 commits April 3, 2026 09:00
  extension scope

  - Remove unnecessary parentheses around dot-access expressions (value.field)
  - Only enable DuplicateRecordFields/NoFieldSelectors in Run and RunReader
    where field names were actually refactored; other modules get
  OverloadedRecordDot only
@mwihoti mwihoti force-pushed the feature/811-modern-record-extensions branch from 343017b to 5823d35 Compare April 3, 2026 06:01
Snapshot.hs: use exported runFsPaths accessor instead of internal r.fsPaths
Unsafe.hs: restore parens around function applications
 Config/Override.hs: fix corrupted override instance for MergeBatchSize
Benchmarks: update to dot syntax with renamed fields (bloomFilter, index,
    kOpsFile) and add OverloadedRecordDot pragma to both benchmark files
@mwihoti
Copy link
Copy Markdown
Contributor Author

mwihoti commented Apr 3, 2026

Hi @jorisdral I've updated changes based on the feedback
image

@mwihoti
Copy link
Copy Markdown
Contributor Author

mwihoti commented Apr 15, 2026

Hello, @jorisdral I'would like to proceed next to index, could you check if what I've done aligns well

Copy link
Copy Markdown
Collaborator

@jorisdral jorisdral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mwihoti thanks very much! I think the changes look good now. I have some remaining comments but they are only about formatting. Once those are resolved, we can merge this PR 😃

Comment on lines +388 to +390
let blooms = V.map (\(DeRef r) -> r.bloomFilter ) runs
indexes = V.map (\(DeRef r) -> r.index ) runs
handles = V.map (\(DeRef r) -> r.kOpsFile ) runs
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let blooms = V.map (\(DeRef r) -> r.bloomFilter ) runs
indexes = V.map (\(DeRef r) -> r.index ) runs
handles = V.map (\(DeRef r) -> r.kOpsFile ) runs
let blooms = V.map (\(DeRef r) -> r.bloomFilter) runs
indexes = V.map (\(DeRef r) -> r.index ) runs
handles = V.map (\(DeRef r) -> r.kOpsFile ) runs

Comment on lines +95 to +97
env ( pure ( V.map (\(DeRef r) -> r.bloomFilter ) rs
, V.map (\(DeRef r) -> r.index ) rs
, V.map (\(DeRef r) -> r.kOpsFile ) rs
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
env ( pure ( V.map (\(DeRef r) -> r.bloomFilter ) rs
, V.map (\(DeRef r) -> r.index ) rs
, V.map (\(DeRef r) -> r.kOpsFile ) rs
env ( pure ( V.map (\(DeRef r) -> r.bloomFilter) rs
, V.map (\(DeRef r) -> r.index ) rs
, V.map (\(DeRef r) -> r.kOpsFile ) rs

rnf numEntries `seq` rwhnf refCounter `seq` rnf fsPaths `seq`
rnf bloomFilter `seq` rnf index `seq` rnf kOpsFile `seq`
rnf blobFile `seq` rnf dataCaching `seq` rwhnf hasFS `seq` rwhnf hasBlockIO

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment on lines +236 to +238
numEntries = runNumEntries
, refCounter = refCounter
, fsPaths = runRunFsPaths
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
numEntries = runNumEntries
, refCounter = refCounter
, fsPaths = runRunFsPaths
numEntries = runNumEntries
, refCounter = refCounter
, fsPaths = runRunFsPaths

-> m ()
close RunReader{..} = do
when (readerRunDataCaching == Run.NoCacheRunData) $
close r = do -- Use 'r' instead of RunReader{..}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not resolved yet: remove the comment


{-# INLINE tableSessionSalt #-}
-- | Inherited from session for ease of access.
-- | Inherited from session for easenve of access.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-- | Inherited from session for easenve of access.
-- | Inherited from session for ease of access.

!wbblobs = tableWriteBufferBlobs content
wbblobs' <- withRollback reg (dupRef wbblobs) releaseRef
let runs = cachedRuns (tableCache content)
let runs = cachedRuns(tableCache content)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let runs = cachedRuns(tableCache content)
let runs = cachedRuns (tableCache content)

traceWith childTableTracer $ TraceIncrementalUnions (NE.map tableId ts)
childTableId <- uniqueToTableId <$> incrUniqCounter sesh.sessionUniqCounter
let childTableTracer = TraceTable childTableId `contramap` sesh.lsmTreeTracer
traceWith childTableTracer $ TraceIncrementalUnions (NE.map (\t -> t.tableId) ts)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
traceWith childTableTracer $ TraceIncrementalUnions (NE.map (\t -> t.tableId) ts)
traceWith childTableTracer $ TraceIncrementalUnions (NE.map (.tableId) ts)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are redundant parentheses around record-dot syntax in this module

Comment on lines +1222 to +1224
(V.mapStrict (\(DeRef r) -> r.bloomFilter) runs)
(V.mapStrict (\(DeRef r) -> r.index ) runs)
(V.mapStrict (\(DeRef r) -> r.kOpsFile) runs)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(V.mapStrict (\(DeRef r) -> r.bloomFilter) runs)
(V.mapStrict (\(DeRef r) -> r.index ) runs)
(V.mapStrict (\(DeRef r) -> r.kOpsFile) runs)
(V.mapStrict (\(DeRef r) -> r.bloomFilter) runs)
(V.mapStrict (\(DeRef r) -> r.index ) runs)
(V.mapStrict (\(DeRef r) -> r.kOpsFile ) runs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants