So I'm trying to use AWRY again to include it in an FM-index benchmark, but I must say that it's rather annoying to use and I'm close to giving up... Here is my user journey for trying it on a human genome:
- The input must be from disk, so I have to write my data to a file first.
- Actually, it must be a fasta file, so I have to prepend it with `>\n`.
- Ah but it must be ASCII, not u8 integer values 0/1/2/3. (ok that's on me though)
- Ah but `libsufr` writes in `$TMPDIR` which is `/tmp` which is tmpfs/RAM for me, and libsufr needs more than the 32GB available there.
- Ok I create a local directory `tmp` and put `export TMPDIR=tmp`, but now this causes compile time errors because rust tries creating `sufr/libsufr/tmp/...` and `AWRY/tmp/...` instead of `my_crate/tmp/...`.
- So I build without the env var, and run with the env var.
- It then uses ~40GB disk, and 17GB RAM for most of the time. That's fine.
- Then, after 11 minutes, I think after suffix array construction is done, RAM suddenly spikes to 32 GB and it OOMs.
- I do have 64 GB of RAM, but it turns out 32GB of that is taken by temporary files that are still lingering in `/tmp` (so #6 is still not really fixed apparently?).
- So now I'm going to have to wait 11min again for a new attempt at construction, which may or may not end up failing.
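
For reference, the first three steps above (write to disk, add a FASTA header, convert 0/1/2/3 to ASCII) can be sketched roughly as follows. This uses only the Rust standard library and makes no AWRY or libsufr calls. The function names `codes_to_ascii` and `write_fasta` are hypothetical, and the code order A, C, G, T is an assumption that may not match your own encoding.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Map integer codes 0/1/2/3 to ASCII bases.
/// Assumption: 0 = A, 1 = C, 2 = G, 3 = T. Panics on any code above 3.
fn codes_to_ascii(codes: &[u8]) -> Vec<u8> {
    const BASES: [u8; 4] = *b"ACGT";
    codes.iter().map(|&c| BASES[c as usize]).collect()
}

/// Write a single-record FASTA file: a `>` header line (the name may be
/// empty), then the whole sequence on one line.
fn write_fasta(path: &Path, name: &str, seq: &[u8]) -> io::Result<()> {
    let mut out = BufWriter::new(File::create(path)?);
    writeln!(out, ">{name}")?;
    out.write_all(seq)?;
    out.write_all(b"\n")?;
    out.flush()
}
```

Calling `write_fasta(path, "", &codes_to_ascii(&codes))` would produce the bare `>` header line followed by the sequence. Whether AWRY accepts an empty record name or a single very long sequence line is not something this sketch verifies.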
In fact, I already have the BWT on disk, but I don't think there's a way for me to directly give that to AWRY?
That would have saved me a lot of pain here. It's also not quite clear if this option to keep/remove the temporary suffix array actually would reuse it between runs.