Read mapping is very slow on diploid human genome assembly

I tried to use VerityMap to validate a diploid human genome assembly using HiFi reads, but on my data it was too slow to be practical. I let it run for more than 3 weeks on 16 threads, and it only mapped up to about 4x. Is this speed expected? Are there any tweaks I can make to speed it up?

The command I ran was

```
python3 main.py --reads reads.fastq.gz -o verity_map_output -t 16 -d hifi-diploid \
    assembly.haplotype1.fasta assembly.haplotype2.fasta
```
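To put that throughput in perspective, here is a rough back-of-the-envelope sketch. It assumes mapping time scales linearly with coverage, which may not hold, and the target depths are illustrative values rather than the actual depth of my dataset. `weeks_to_reach` is a hypothetical helper, not part of VerityMap.

```python
def weeks_to_reach(target_cov, observed_cov=4.0, observed_weeks=3.0):
    """Linearly extrapolate wall-clock weeks needed to map `target_cov`x.

    Defaults come from the run described above: about 4x mapped after
    3 weeks. The run actually took more than 3 weeks, so the result is
    a lower bound under the (assumed) linear-scaling model.
    """
    if observed_cov <= 0:
        raise ValueError("observed_cov must be positive")
    return target_cov * observed_weeks / observed_cov


if __name__ == "__main__":
    # Illustrative target depths only (assumptions, not measured values).
    for target in (10, 20, 30):
        print(f"{target}x: at least ~{weeks_to_reach(target):.1f} weeks")
```

Under these assumptions, even a moderate-depth HiFi dataset would take several months to map on 16 threads, which is why I am asking whether this is expected.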

Another question/request: I understand from the paper that VerityMap also includes analysis modules to detect the location of misassemblies. As far as I can see, these can only be accessed after read mapping concludes (I believe the relevant code is [here](https://github.com/ablab/VerityMap/blob/master/veritymap/py_src/mapper.py#L150-L154)). Is this correct? It would be useful if the interface allowed a more modular option that could be run independently of mapping, especially since it seems like I will need to troubleshoot the mapping stage.
