WSI inference crashes at ~76-100% of predicted tiles due to SlideLoader subprocess teardown destabilizing the multiprocessing Manager

**Environment**
Python 3.13.13
Platform: Linux HPC (SLURM)
Single GPU inference

**Description**
When running classpose-predict-wsi on large WSI files (~8600 tiles), the process consistently crashes after tile loading completes but while the worker is still processing tiles. The crash produces no Python traceback and exits with code 143 (SIGTERM) or 1, always at the same relative progress point (~76-85% of predicted tiles).

**Root cause**
The SlideLoader runs as a subprocess sharing a multiprocessing.Manager server with the main process and PostProcessor. When SlideLoader finishes filling the queue and its subprocess exits naturally, Python cleanup of the Manager proxies it holds destabilizes the shared Manager server. The worker (running in the main process) is still actively using Manager-proxied objects (slide.n, predicted_tiles_value, pp.q) at this point, causing a silent crash.

**Workaround**
Pre-loading all tiles into a local list before starting inference eliminates the concurrency between SlideLoader and the worker:

```python
# After slide initialization, drain the queue fully before starting inference
all_tiles = []
while True:
    item = slide.q.get()
    if item[0] is None:
        break
    all_tiles.append(item)
slide.p.join()  # SlideLoader is fully done before worker starts
```
Then feed all_tiles into a plain queue.Queue for the worker. This ensures the SlideLoader subprocess is completely finished before inference begins, so its teardown cannot affect the Manager.
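
A minimal sketch of that second step, for illustration only. The helper name `to_local_queue` and the `(None,)` sentinel shape are assumptions; the real sentinel should match whatever the worker checks for (above, an item whose first element is None).

```python
import queue


def to_local_queue(tiles, sentinel=(None,)):
    """Copy pre-loaded tiles into an in-process queue.Queue.

    `sentinel` is a hypothetical end-of-stream marker; it is re-appended
    because the drain loop above consumed the original one.
    """
    local_q = queue.Queue()
    for tile in tiles:
        local_q.put(tile)
    local_q.put(sentinel)  # lets the worker detect the end of the stream
    return local_q
```

The worker can then read from the returned queue instead of slide.q, so no Manager proxy is involved in tile delivery during inference. The trade-off is memory: all tiles are held in the main process at once.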

**Additional notes**

The pp.polygons.empty() check in the polygon collection loop is also unreliable for managed queues on large slides (can return True prematurely with ~250k cells). Draining by count (pp.value.value) or writing directly to a file from the PostProcessor subprocess is more robust.
Reproducible across multiple different SVS files and hardware configurations.
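
A rough sketch of draining by count rather than polling empty(). The function name `drain_by_count` and the timeout value are assumptions, not part of the project; the expected count would come from pp.value.value as noted above.

```python
import queue


def drain_by_count(q, expected, timeout=5.0):
    """Collect up to `expected` items from `q` without relying on empty().

    Stops early if no item arrives within `timeout` seconds, so the caller
    should compare len(result) against `expected` to detect a short read.
    """
    items = []
    while len(items) < expected:
        try:
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            break  # producer stalled or finished short of the expected count
    return items
```

Assuming pp.polygons supports get(timeout=...) like a standard queue, usage would look like `polygons = drain_by_count(pp.polygons, pp.value.value)`, followed by a check that the returned length matches the expected count.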
