Fix StatisticsPanel pickling: serialize metrics by registry name, atomic writes #1
Resolves the long-standing Bug A in the spatialtissuepy MCP panel
infrastructure: `summary_create_panel(..., preset=...)` and
`summary_add_metric(...)` failed with
PicklingError: Can't pickle <function cell_counts at 0x...>:
it's not the same object as spatialtissuepy.summary.population.cell_counts
and, under parallel tool calls, a follow-on
UnpicklingError: Ran out of input
The two symptoms share one subsystem but have independent root causes.
## 1. `MetricInfo.func` is not its own module attribute
`register()` in registry.py stores the raw `func` passed to the decorator
on the `MetricInfo` object, but installs a `functools.wraps`-wrapped
function as the module attribute. Pickle serializes functions by
reference: at dump time it looks the function up by `__qualname__` in its
module to confirm it will be resolvable at unpickle time, finds the
wrapper at e.g. `spatialtissuepy.summary.population.cell_counts`, sees
that it is not the same object as the stored raw func, and refuses to
serialize.
Fix: teach `MetricInfo` to pickle by *name* via `__reduce__`. On dump we
store only the metric name; on load we call
`_resolve_metric_for_pickle(name)`, which fetches the live registry entry.
That is the durable object across processes, not the captured function
reference at registration time. Built-in metrics register at import, so
all built-in panels round-trip without user action. Registered custom
metrics round-trip as long as the consumer re-registers them before
loading; if they don't, unpickle raises a descriptive `RuntimeError`
naming the missing metric rather than a cryptic identity mismatch.
Inline metrics added via `StatisticsPanel.add_custom_function()` are
per-panel only and correctly refuse to pickle with a `TypeError` pointing
users at `register_custom_metric`.
## 2. Non-atomic writes produce truncated pickles under concurrency
`SessionManager.store_panel` and `store_data` used plain
`open(path, 'wb'); pickle.dump(...)`. Two concurrent MCP tool calls
(typical when a client issues parallel `summary_add_metric` calls) could
open the same file simultaneously -- whichever writer finished second
with less data truncated the file the first had already filled, and the
next read saw "Ran out of input".
Fix: `_atomic_pickle_dump` writes to a per-call temp file (named with
pid + thread id + uuid slug, so in-process threading doesn't collide
either) and then moves it into place with `os.replace`, which is atomic
on POSIX and on Windows (`ReplaceFile`), so readers always see either
the previous file or a complete new one -- never a torn write. On write
failure the temp file is cleaned up so we don't leak partial bytes.
## Tests
Adds `tests/test_summary_panel_pickle.py` with 11 cases covering:
* All four preset panels (basic / spatial / neighborhood / comprehensive)
  round-trip through pickle.
* A registered custom metric round-trips.
* An inline custom function refuses to pickle with a clear `TypeError`.
* A custom metric removed from the registry between dump and load
  raises `RuntimeError` naming the missing metric.
* Eight concurrent threads writing different payloads to the same path
  via `_atomic_pickle_dump` always leave a valid, complete pickle on disk
  (no truncation).
* A failed pickle attempt cleans up the temp file and leaves no partial
  state behind.
* End-to-end: `SessionManager.store_panel` + `load_panel` round-trips a
  basic built-in panel through a temp session directory.
Full test suite: 273 passed, 2 skipped, 0 regressions.
tests/test_summary_panel_pickle.py, 11 cases:basic,spatial,neighborhood,comprehensive) round-trip through pickle.TypeError.RuntimeErrornaming the missing metric._atomic_pickle_dumpalways leave a valid complete pickle on disk.SessionManager.store_panel+load_panelround-trips a basic built-in panel.Test plan
pytest tests/test_summary_panel_pickle.py— 11/11 pass.__reduce__semantics forMetricInfoand the atomic-write helper.Scope
No public API changes.
MetricInfo's call / repr / field surface is unchanged;SessionManager.store_panel/store_datakeep the same signature. The only new public callable is the module-level_resolve_metric_for_pickle(underscore-prefixed, used only as the pickle reconstruction target).