Skip to content

Add snapshot fixes from polioland#360

Open
jonathanhhb wants to merge 9 commits into
mainfrom
add_snapshot_fixes_from_polioland
Open

Add snapshot fixes from polioland#360
jonathanhhb wants to merge 9 commits into
mainfrom
add_snapshot_fixes_from_polioland

Conversation

@jonathanhhb

@jonathanhhb jonathanhhb commented Mar 2, 2026

Copy link
Copy Markdown
Collaborator

Fix snapshot continuity bugs in LaserFrame and add continuity tests
Multi-segment simulations (save snapshot, reload, continue) had three bugs
in the core snapshot infrastructure that caused visible discontinuities:

  1. date_of_death offset: absolute-timestep death schedules were not adjusted
    when reloading into a new segment starting at t=0, causing agents to live
    too long (their deaths were scheduled far in the future relative to the
    new timeline). save_snapshot now accepts t= to
    automatically subtracts time steps elapsed from date_of_death.

  2. pop_final: results.pop[0] was reset to init_pop on reload, ignoring all
    births and deaths from the first segment and causing a population jump at
    the boundary. save_snapshot now accepts pop_final= (per-node population at
    snapshot time); load_snapshot returns it via pars["pop_final"] so callers
    can restore results.pop[0, :].

  3. keep_mask: save_snapshot now accepts an optional keep_mask boolean array
    to squash terminal-state agents (e.g. fully-recovered) before writing,
    reducing snapshot size without requiring callers to mutate the frame first.

All three params are optional and backwards-compatible. t_snap and pop_final
are surfaced in the returned pars dict for model-specific use.

Also adds:

  • 8 unit tests in test_laserframe.py covering round-trips, offset math,
    clamping, and absent-field behaviour
  • tests/test_snapshot_continuity.py: a model-agnostic continuity test built
    on a minimal deterministic ABM (fixed lifespan + constant birth rate) that
    directly exercises all three bugs and doubles as a runnable visual tool

Jonathan Bloedow and others added 2 commits March 2, 2026 09:10
Multi-segment simulations (save snapshot, reload, continue) had three bugs
in the core snapshot infrastructure that caused visible discontinuities:

1. date_of_death offset: absolute-timestep death schedules were not adjusted
   when reloading into a new segment starting at t=0, causing agents to live
   too long (their deaths were scheduled far in the future relative to the
   new timeline). save_snapshot now accepts t= to record t_snap; load_snapshot
   automatically subtracts t_snap from date_of_death and clamps to >= 1.

2. pop_final: results.pop[0] was reset to init_pop on reload, ignoring all
   births and deaths from the first segment and causing a population jump at
   the boundary. save_snapshot now accepts pop_final= (per-node population at
   snapshot time); load_snapshot returns it via pars["pop_final"] so callers
   can restore results.pop[0, :].

3. keep_mask: save_snapshot now accepts an optional keep_mask boolean array
   to squash terminal-state agents (e.g. fully-recovered) before writing,
   reducing snapshot size without requiring callers to mutate the frame first.

All three params are optional and backwards-compatible. t_snap and pop_final
are surfaced in the returned pars dict for model-specific use.

Also adds:
- 8 unit tests in test_laserframe.py covering round-trips, offset math,
  clamping, and absent-field behaviour
- tests/test_snapshot_continuity.py: a model-agnostic continuity test built
  on a minimal deterministic ABM (fixed lifespan + constant birth rate) that
  directly exercises all three bugs and doubles as a runnable visual tool

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 2, 2026 17:17

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates LaserFrame snapshot save/load to support multi-segment simulations without discontinuities, and adds regression tests to catch the previously observed boundary issues.

Changes:

  • Extend LaserFrame.save_snapshot() to optionally persist t_snap, pop_final, and apply an optional keep_mask.
  • Extend LaserFrame.load_snapshot() to surface t_snap/pop_final via pars and offset/clamp date_of_death when t_snap is present.
  • Add unit and integration-style tests to validate snapshot round-trips and continuity across a save/reload boundary.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
src/laser/core/laserframe.py Adds optional snapshot metadata (t_snap, pop_final) and continuity adjustment logic on load; introduces keep_mask handling.
tests/test_laserframe.py Adds unit tests for t_snap, pop_final, keep_mask, and date_of_death offset/clamping behavior.
tests/test_snapshot_continuity.py Adds a deterministic staged-vs-flat continuity test to detect boundary discontinuities.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/laser/core/laserframe.py
Comment thread src/laser/core/laserframe.py
Comment thread src/laser/core/laserframe.py Outdated
Comment thread tests/test_snapshot_continuity.py Outdated
jonathanhhb and others added 3 commits March 2, 2026 10:01
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test_snapshot_continuity.py Outdated
Comment thread src/laser/core/laserframe.py Outdated
Comment on lines +296 to +302
to >= 1). The live frame is not mutated.
pop_final (np.ndarray, optional): 1-D array of final population counts per node
at the snapshot boundary. Returned via pars["pop_final"] on load so the
caller can restore results.pop[0, :] for a continuous population time-series.
keep_mask (np.ndarray, optional): Boolean array of shape (count,). If provided,
squash(keep_mask) is called before writing to compact terminal-state agents
out of the snapshot (e.g. fully-recovered agents that no longer affect dynamics).

Copilot AI Mar 2, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring says the live frame is not mutated, but when keep_mask is provided this method calls self.squash(keep_mask), which mutates the frame (shrinks count and compacts scalar arrays). Please document this side effect explicitly (or clarify that only the date_of_death offset is non-mutating) so callers don’t assume save_snapshot() is safe to call mid-simulation without changing in-memory state.

Suggested change
to >= 1). The live frame is not mutated.
pop_final (np.ndarray, optional): 1-D array of final population counts per node
at the snapshot boundary. Returned via pars["pop_final"] on load so the
caller can restore results.pop[0, :] for a continuous population time-series.
keep_mask (np.ndarray, optional): Boolean array of shape (count,). If provided,
squash(keep_mask) is called before writing to compact terminal-state agents
out of the snapshot (e.g. fully-recovered agents that no longer affect dynamics).
to >= 1). This offsetting is applied only to the data written to disk;
the in-memory 'date_of_death' property on the live frame is not mutated.
pop_final (np.ndarray, optional): 1-D array of final population counts per node
at the snapshot boundary. Returned via pars["pop_final"] on load so the
caller can restore results.pop[0, :] for a continuous population time-series.
keep_mask (np.ndarray, optional): Boolean array of shape (count,). If provided,
``squash(keep_mask)`` is called on this LaserFrame before writing, which
mutates the live frame in-place by shrinking ``_count`` and compacting
1-D scalar numpy arrays. Callers should not assume that the frame remains
unchanged when a ``keep_mask`` is supplied.

Copilot uses AI. Check for mistakes.
Comment thread src/laser/core/laserframe.py Outdated
Comment thread src/laser/core/laserframe.py Outdated
Comment on lines +310 to +319
self._save(f, "people")

# Write date_of_death already offset to the new segment's timeline so
# load_snapshot receives clean, ready-to-use values with no post-load fixup.
# We overwrite the dataset written by _save without touching the live frame.
if t is not None and "date_of_death" in self._properties:
dod = self._properties["date_of_death"]
adjusted = np.maximum(dod[: self._count].astype(np.int64) - t, 1).astype(dod.dtype)
del f["people"]["date_of_death"]
f["people"].create_dataset("date_of_death", data=adjusted)

Copilot AI Mar 2, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

save_snapshot() writes the full date_of_death dataset via _save(), then immediately deletes and recreates it when t is provided. For large frames this doubles I/O and memory for that column. Consider avoiding the initial write for date_of_death (e.g., allow _save() to skip/override specific properties) so the adjusted values are written once.

Copilot uses AI. Check for mistakes.
Comment thread tests/test_snapshot_continuity.py Outdated
Comment on lines +121 to +124
# Restore pop[0] of seg2 from pop_final so the stitched series is continuous
if "pop_final" in pars:
pop_seg2[0] = int(pars["pop_final"].sum())

Copilot AI Mar 2, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_pop_final_continuity() currently doesn’t actually validate the pop_final plumbing: pop_seg2[0] is initialized from frame2.count, which already equals frame1.count at the boundary, so the assertion would still pass even if pop_final were never saved/loaded/used. To exercise the bug described in the PR, simulate a caller that resets the segment-2 population baseline incorrectly (e.g., set pop_seg2[0] to INIT_POP/0 before applying pop_final), and assert that continuity is only restored when pars["pop_final"] is present and applied.

Copilot uses AI. Check for mistakes.
Jonathan Bloedow and others added 4 commits March 2, 2026 11:20
Several improvements following code review:

- save_snapshot: eliminate double-write of date_of_death by computing the
  t-offset array before opening the HDF5 file and passing it as an override
  to _save(), which writes each field exactly once. Previously the raw value
  was written by _save() then immediately deleted and rewritten, doubling I/O
  for that column on large frames.

- save_snapshot docstring: clarify that the t= offset touches only the data
  written to disk (in-memory frame unchanged), and explicitly document that
  keep_mask mutates the frame via squash() so callers are not surprised.

- test_snapshot_continuity: replace vacuous test_pop_final_continuity with one
  that actually exercises the plumbing — simulates a model incorrectly resetting
  its pop baseline to INIT_POP on reload, asserts that is wrong, then shows
  pars["pop_final"] corrects it.

- test_snapshot_continuity: add test_keep_mask_staged_run to cover the third
  save_snapshot feature that was missing from continuity-level testing.

- test_snapshot_continuity: expand module docstring with background on the
  three bugs being guarded against (previously referenced an external file
  not present in the repo), update coverage summary, fix stale comment that
  attributed the date_of_death offset to load_snapshot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jonathanhhb jonathanhhb requested a review from clorton March 3, 2026 18:52

@clorton clorton left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like us to implement a base/core save/load to HDF5 which we then wrap with all the special cases (date_of_death offset, patches restore along with people, etc.).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants