Underscore (ASCII 95) should be encoded to maintain reversibility #4

@swhiteman

Description

It appears that a literal underscore (ASCII 95) _ is left as-is, meaning that a space (ASCII 32) and underscore are indistinguishable in the Q-encoded result.

Example:

var q = require("q-encoding");
var utf8 = require("utf8");
q.encode(utf8.encode("❤️ Hello _it's _me"));

Expected result:

"=E2=9D=A4=EF=B8=8F_Hello_=5Fit=27s_=5Fme"

Observed result:

"=E2=9D=A4=EF=B8=8F_Hello__it=27s__me"

Further example:

utf8.decode(q.decode(
  q.encode(utf8.encode("❤️ Hello _it's _me"))
));

Result:

"❤️ Hello  it's  me"

RFC 2047, section 4.2, rule (3) notes:

(3) 8-bit values which correspond to printable ASCII characters other
than "=", "?", and "_" (underscore), MAY be represented as those
characters.

Because _ is a special-purpose character in the encoding (it represents a space), a literal _ must be hex-escaped as =5F, the same way a non-ASCII byte is.
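To make the requested behavior concrete, here is a minimal sketch of a reversible Q-encoder and decoder for Node.js. The names `qEncode` and `qDecode` are hypothetical and are not the q-encoding library's API. The set of characters left unescaped (letters, digits, and `!*+-/`) is an assumption chosen to match the expected result above, and the sketch takes a plain JavaScript string and does the UTF-8 conversion itself through `Buffer`.

```javascript
// Hypothetical helpers for illustration; not the q-encoding library's API.

// Encode a JavaScript string as Q-encoded text over its UTF-8 bytes.
function qEncode(text) {
  var bytes = Buffer.from(text, "utf8");
  var out = "";
  for (var i = 0; i < bytes.length; i++) {
    var b = bytes[i];
    var ch = String.fromCharCode(b);
    if (b === 0x20) {
      // A space is the only byte that may be written as "_".
      out += "_";
    } else if (/[A-Za-z0-9!*+\-\/]/.test(ch)) {
      // Assumed safe set: letters, digits, and ! * + - /
      out += ch;
    } else {
      // Everything else, including a literal "_" (0x5F), is hex-escaped.
      out += "=" + (b < 16 ? "0" : "") + b.toString(16).toUpperCase();
    }
  }
  return out;
}

// Reverse the encoding: "_" becomes a space, "=XX" becomes the byte 0xXX.
function qDecode(encoded) {
  var bytes = [];
  for (var i = 0; i < encoded.length; i++) {
    var ch = encoded[i];
    if (ch === "_") {
      bytes.push(0x20);
    } else if (ch === "=") {
      bytes.push(parseInt(encoded.substr(i + 1, 2), 16));
      i += 2; // skip the two hex digits just consumed
    } else {
      bytes.push(encoded.charCodeAt(i));
    }
  }
  return Buffer.from(bytes).toString("utf8");
}

var input = "\u2764\uFE0F Hello _it's _me";
console.log(qEncode(input));
console.log(qDecode(qEncode(input)) === input);
```

With the underscore escaped, spaces and literal underscores stay distinct in the encoded form, so decoding restores the original string.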
