Rewrite of the Authentication-Results header parser for a complete RFC 8601 implementation. Ignore any header comments in parentheses. Allow escape sequences and semicolons in comments and in quoted strings used as values. Fixes: emersion#32
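To illustrate why semicolons inside comments and quoted strings need special handling, here is a minimal sketch of a splitter that only breaks on top-level semicolons. `splitTopLevel` is a hypothetical helper written for this illustration and is not the code from the PR; it assumes an already unfolded header value and does no validation.

```go
package main

import "fmt"

// splitTopLevel splits s on semicolons that are neither inside a
// quoted-string nor inside a (possibly nested) comment.
// Hypothetical helper for illustration; not part of the PR.
func splitTopLevel(s string) []string {
	var parts []string
	var cur []byte
	depth := 0       // current comment nesting depth
	inQuote := false // true while inside a quoted-string
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '\\' && (inQuote || depth > 0) && i+1 < len(s):
			// quoted-pair: copy the backslash and the escaped byte verbatim
			cur = append(cur, c, s[i+1])
			i++
			continue
		case c == '"' && depth == 0:
			inQuote = !inQuote
		case c == '(' && !inQuote:
			depth++
		case c == ')' && !inQuote && depth > 0:
			depth--
		case c == ';' && !inQuote && depth == 0:
			// top-level separator: finish the current part
			parts = append(parts, string(cur))
			cur = cur[:0]
			continue
		}
		cur = append(cur, c)
	}
	return append(parts, string(cur))
}

func main() {
	header := `example.org; spf=pass (a;b) smtp.mailfrom="x;y"; dkim=fail`
	for _, part := range splitTopLevel(header) {
		fmt.Printf("%q\n", part)
	}
}
```

A plain `strings.Split(header, ";")` would cut this example header into five pieces instead of three, which is the kind of failure the PR description refers to.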
    case '\\':
        c, _ = p.r.ReadByte()
        comment += "\\" + string(c)
    case '(':
the CFWS referenced by https://datatracker.ietf.org/doc/html/rfc8601#section-2.2 is defined in https://www.rfc-editor.org/rfc/rfc5322#section-3.2.2 (internet message / email):
    FWS      = ([*WSP CRLF] 1*WSP) / obs-FWS
                               ; Folding white space
    ctext    = %d33-39 /       ; Printable US-ASCII
               %d42-91 /       ;  characters not including
               %d93-126 /      ;  "(", ")", or "\"
               obs-ctext
    ccontent = ctext / quoted-pair / comment
    comment  = "(" *([FWS] ccontent) [FWS] ")"
    CFWS     = (1*([FWS] comment) [FWS]) / FWS
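As a rough illustration of the `comment` production above, the following sketch reads one comment, tracking nesting depth and skipping the byte after each backslash (the quoted-pair case). `readComment` and `errUnterminated` are hypothetical names chosen for this example; this is not the PR's implementation, and it does not check `ctext` or handle folding.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnterminated is a hypothetical sentinel error for this sketch.
var errUnterminated = errors.New("unterminated comment")

// readComment expects s to start with '(' and returns the raw comment,
// including its parentheses and backslashes, plus the remaining input.
// Nested comments are followed by counting parentheses.
func readComment(s string) (comment, rest string, err error) {
	if len(s) == 0 || s[0] != '(' {
		return "", s, errors.New("comment must start with '('")
	}
	depth := 0
	for i := 0; i < len(s); i++ {
		switch s[i] {
		case '\\':
			// quoted-pair: the next byte is escaped, so skip it
			i++
		case '(':
			// start of the outer comment or of a nested one
			depth++
		case ')':
			depth--
			if depth == 0 {
				return s[:i+1], s[i+1:], nil
			}
		}
	}
	return "", s, errUnterminated
}

func main() {
	comment, rest, err := readComment(`(a (b\)) c) tail`)
	fmt.Printf("%q %q %v\n", comment, rest, err)
}
```

Returning the raw text leaves unescaping and stricter validation to the caller, which keeps the reader small at the cost of accepting some input the grammar would reject.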
this code is not entirely future-proof, though right now, i welcome everything that helps me handle my emails..
even net/mail doesn't go the whole way:
> The full range of spacing (the CFWS syntax element) is not supported, such as breaking addresses across lines.
how about including the parentheses in the output string (doing `comment += string(c)` for both `(` and `)`) and leaving parsing of comments up to other or future implementations?
in that case, the backslashes should also be kept.
i can think of invalid comments that would be accepted by this parser, in particular ones that violate the restrictions on `ctext`.
the implementation before this pr was more restrictive than the grammar. this replacement is more liberal, which could be understood as an interface change.
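To make the `ctext` concern concrete, here is a sketch of a strict check that rejects comments a permissive reader would accept. All function names are hypothetical. The sketch assumes an unfolded header, treats only space and tab as white space, and ignores the obsolete `obs-ctext` and `obs-FWS` forms, so it is stricter than the full RFC 5322 grammar.

```go
package main

import "fmt"

// isCtext reports whether c may appear unescaped inside a comment
// (RFC 5322 ctext, without obs-ctext).
func isCtext(c byte) bool {
	return (c >= 33 && c <= 39) || (c >= 42 && c <= 91) || (c >= 93 && c <= 126)
}

// isWSP reports whether c is a space or a horizontal tab.
func isWSP(c byte) bool { return c == ' ' || c == '\t' }

// isVchar reports whether c is a visible US-ASCII character.
func isVchar(c byte) bool { return c >= 33 && c <= 126 }

// validComment reports whether s is exactly one well-formed comment.
func validComment(s string) bool {
	depth := 0
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case c == '(':
			depth++
		case depth == 0:
			return false // byte outside the outermost parentheses
		case c == ')':
			depth--
			if depth == 0 {
				return i == len(s)-1 // nothing may follow the comment
			}
		case c == '\\':
			// quoted-pair: backslash followed by VCHAR or WSP
			i++
			if i >= len(s) || !(isVchar(s[i]) || isWSP(s[i])) {
				return false
			}
		case isCtext(c) || isWSP(c):
			// ordinary comment content
		default:
			return false // control or non-ASCII byte
		}
	}
	return false // ran out of input before the comment closed
}

func main() {
	for _, s := range []string{"(ok (nested) \\) fine)", "(bad \x01 byte)", "(open"} {
		fmt.Printf("%q valid=%v\n", s, validComment(s))
	}
}
```

Whether a parser should reject such input or pass it through is the interface question raised above; the sketch only shows what the stricter reading would look like.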
On Wed Aug 28, 2024 at 1:32 PM CEST, Leon Busch-George wrote:
> this code is not entirely future-proof, though right now, i welcome everything that helps me handle my emails..
> even [net/mail](https://pkg.go.dev/net/mail#pkg-overview) doesn't go the whole way:
> > The full range of spacing (the CFWS syntax element) is not supported, such as breaking addresses across lines.
> how about including the parentheses in the output string (doing `comment += string(c)` for both `(` and `)`) and leaving parsing of comments up to other or future implementations?
> in that case, the backslashes should also be kept.
> i can think of invalid comments that would be accepted by this parser. particularly, the restrictions for `ctext`.
> the implementation before this pr was more restrictive than the grammar. this replacement is more liberal.
> that change could be understood as a changing interface.

Thanks for your review! Happy to work on this again.
It would be good, though, to get an indication from @emersion whether he'd
accept a PR for this at all (and/or whether there are other ideas regarding
the existing parser).
Just tested and I can confirm that it also fixes my issue: #74