
Memory error from hdfeos5_2json_mbtiles.py for large files #4

@falkamelung

Description


As described below, I get memory errors while trying to ingest big data files. Is there any way to reduce the memory requirements? If not, it would be good to document the limitations, e.g. how the memory requirements are calculated and, for a system with 64 GB RAM, what the maximum supported file size is.

The number of dates may be important. I believe I have previously successfully ingested larger files, but with fewer dates.
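For reference, the in-memory footprint should scale roughly as rows × cols × dates × bytes-per-value (one float32 sample per pixel per date). A back-of-envelope sketch — the grid dimensions below are hypothetical, not read from the actual .he5 file:

```python
# Rough estimate of the raw size of a displacement time-series cube.
# Assumes one float32 value per pixel per date; actual usage will be
# higher because of intermediate copies during conversion.

def timeseries_bytes(rows: int, cols: int, dates: int, bytes_per_value: int = 4) -> int:
    """Raw size in bytes of a dates x rows x cols value cube."""
    return rows * cols * dates * bytes_per_value

# e.g. a hypothetical 5000 x 5000 grid over 205 dates:
size_gb = timeseries_bytes(5000, 5000, 205) / 1024**3
print(f"{size_gb:.1f} GiB")  # ~19.1 GiB for the raw cube alone
```

If memory requirements follow this shape, doubling the number of dates doubles the footprint, which would explain why files with fewer dates ingested successfully.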

I tried to ingest a 21GB data set (205 dates) using

hdfeos5_2json_mbtiles.py miaplpy_201505_202409_0.5/network_delaunay_4/S1_IW12_120_1183_1185_20150505_20240926_N00600_N00890_W078090_W077800_filtDel4DS.he5 miaplpy_201505_202409_0.5/network_delaunay_4/JSON_filtDS2 --num-workers 8

but got a memory error:

cat insarmaps_1131392.e
Process ForkPoolWorker-3:
Traceback (most recent call last):
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/queues.py", line 367, in get
    return _ForkingPickler.loads(res)
MemoryError
Process ForkPoolWorker-4:
Traceback (most recent call last):
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/work2/05861/tg851601/stampede2/code/rsmas_insar/tools/miniforge3/lib/python3.10/multiprocessing/queues.py", line 367, in get
    return _ForkingPickler.loads(res)
MemoryError

I was on a really big machine (250 GB RAM). It got pretty far before the error occurred:

tail -5 insarmaps_1131392.o
converted chunk 889
converted chunk 894
converted chunk 899
converted chunk 905
converted chunk 91
               total        used        free      shared  buff/cache   available
Mem:           250Gi       207Gi        44Gi        20Gi        21Gi        43Gi
Swap:             0B          0B          0B


Labels: bug (Something isn't working), enhancement (New feature or request)
