This repository was archived by the owner on Nov 2, 2021. It is now read-only.

shorten qname in output bam file in --rmdup-only mode #22

@frankyan


Hi, I found that in "--rmdup-only" mode, the qname (read name) of every record in the output bam file is shortened by (len_dupindex + 1) characters. For example:

# original STAR mapping bam:
MN00727:7:000H2HJHJ:1:12110:12180:16802 16      chr10   63493   255     71M
# the dedup output:
MN00727:7:000H2HJHJ:1:12110:12  16      chr10   63493   255     71M

I checked the source code and found the bug. It is on lines 307 and 308, which call _get_sam_str on sam_row twice, so UMIStripFunc is applied to the same read name two times:

sam_str = self._get_sam_str(sam_row)
self._torm.stdin.write(self._get_sam_str(sam_row))

I have opened pull request #23 to fix this problem. I only tested with single-end sequencing data; I didn't test paired-end data.
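
To illustrate the mechanism, here is a minimal, self-contained sketch. It is not the project's actual code: `SamRow`, `strip_dupindex`, `get_sam_str`, the `_` separator, and the 8-base index are hypothetical stand-ins, and it assumes that building the SAM string strips the index from the row in place. Under those assumptions, the second call strips again and eats (len_dupindex + 1) characters of the real read name, while reusing the already-built string does not. This may differ from the change made in #23.

```python
# Hypothetical stand-ins for illustration only; not the project's real classes.
DUPINDEX = "ACGTACGT"  # assumed 8-base dup index appended to the read name


def strip_dupindex(qname, len_dupindex=len(DUPINDEX)):
    """Drop the trailing separator plus dup index from a read name."""
    return qname[:-(len_dupindex + 1)]


class SamRow:
    def __init__(self, qname, rest):
        self.qname = qname
        self.rest = rest


def get_sam_str(row):
    # Assumption: the strip function modifies the row in place before serialising.
    row.qname = strip_dupindex(row.qname)
    return "\t".join([row.qname] + row.rest)


original = "MN00727:7:000H2HJHJ:1:12110:12180:16802"
fields = ["16", "chr10", "63493", "255", "71M"]

# Buggy pattern: the string is built twice, so the name is stripped twice.
row = SamRow(original + "_" + DUPINDEX, fields)
sam_str = get_sam_str(row)
written_buggy = get_sam_str(row)

# One possible fix: build the string once and reuse it.
row = SamRow(original + "_" + DUPINDEX, fields)
sam_str = get_sam_str(row)
written_fixed = sam_str

print(written_buggy.split("\t")[0])  # MN00727:7:000H2HJHJ:1:12110:12
print(written_fixed.split("\t")[0])  # MN00727:7:000H2HJHJ:1:12110:12180:16802
```

With an 8-character index, the buggy path removes 9 extra characters, which matches the truncated name in the dedup output above.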
