In the code:
// read another part
memcpy((char *)dst+have_read, bip->data+reader->start, size - have_read);
have_read = size;
reader->start += size;
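reader->start is incremented by size, but it should be incremented by (size - have_read), the number of bytes the memcpy() actually copied. Because have_read is set to size on the line before, that difference has to be computed before have_read is overwritten.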
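
A minimal sketch of what the corrected read could look like. The struct layouts, the capacity field and the function name bip_read are assumptions made for illustration, since the report only shows the four lines above; the real buffer may track its regions differently.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified types: the real buffer and reader structures
   are not shown in the report. */
struct bip { char *data; size_t capacity; };
struct bip_reader { size_t start; };

/* Copy `size` bytes out of the buffer, wrapping to the beginning when the
   read reaches the end. Assumes size <= capacity and that the caller has
   already checked that `size` bytes are available. */
static void bip_read(struct bip *bip, struct bip_reader *reader,
                     void *dst, size_t size)
{
    size_t have_read = 0;
    size_t tail = bip->capacity - reader->start;

    if (size >= tail) {
        /* read the part up to the end of the buffer */
        memcpy(dst, bip->data + reader->start, tail);
        have_read = tail;
        reader->start = 0;
    }

    /* read another part */
    size_t rest = size - have_read;   /* computed before have_read changes */
    memcpy((char *)dst + have_read, bip->data + reader->start, rest);
    reader->start += rest;            /* advance by the bytes just copied */
    have_read = size;
    (void)have_read;                  /* kept only to mirror the original */
}
```

With an 8-byte buffer and reader->start at 6, reading 4 bytes copies 2 bytes from the end and 2 from the beginning, leaving reader->start at 2. The original code would leave it at 4 and skip two unread bytes.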