Fix string escapes. (issue #37) #54
Open
ntdaley wants to merge 2 commits into zaach:master
Previously the different escape sequences were applied one after the other, so that `"\\n"` in the JSON was turned into a newline instead of `"\n"`.

Previously Unicode escapes were not converted, so stringifying the parse result turned `"\u20AC"` into `"\\u20AC"`.

Previously `"\/"` was not converted, so stringifying the parse result turned it into `"\\/"`.

Note: I removed handling of `\v` because the lexer would not accept it anyway, and it is not part of the JSON standard.

Note: Because Unicode escapes are now converted, strings in the input like `"\u20AC"` become their Unicode equivalent after parsing (in this case, the euro character).

Also changed the command-line use of `JSON.stringify` to further process the result, converting characters that are not printable ASCII to Unicode escapes. While not strictly necessary according to the JSON standard, ASCII output is safer for some parsers, and now that the parser processes Unicode escapes there is more chance of non-ASCII characters appearing in the parser output.

I would suggest that it would be better to always use formatter.js instead of `JSON.stringify`, because that way the choice between Unicode-escaped values and Unicode characters would always be the same for input and output. Similar formatting changes should probably be made in the web version.
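To illustrate the difference described above, here is a minimal sketch and not the code from this pull request. The helper names `unescapeJsonString`, `unescapeSequentially`, and `toAsciiJson` are hypothetical, and the exact character range escaped on output is an assumption. It contrasts chained replacements with a single-pass decode, and shows one way to post-process `JSON.stringify` output into ASCII.

```javascript
// Characters that may follow a backslash in a JSON string, mapped to what
// they denote. "\v" is deliberately absent: it is not a JSON escape.
const SIMPLE_ESCAPES = {
  '"': '"',
  "\\": "\\",
  "/": "/",
  b: "\b",
  f: "\f",
  n: "\n",
  r: "\r",
  t: "\t",
};

// Hypothetical helper: decode the body of a JSON string literal (without the
// surrounding quotes) in one left-to-right pass, so each backslash is
// consumed exactly once.
function unescapeJsonString(body) {
  return body.replace(
    /\\(?:u([0-9a-fA-F]{4})|(["\\\/bfnrt]))/g,
    (match, hex, simple) =>
      hex !== undefined
        ? String.fromCharCode(parseInt(hex, 16))
        : SIMPLE_ESCAPES[simple]
  );
}

// Illustration of the failure mode only: chained replacements let the output
// of one step be reinterpreted by the next.
function unescapeSequentially(body) {
  return body.replace(/\\\\/g, "\\").replace(/\\n/g, "\n");
}

// Hypothetical helper: stringify, then escape DEL and every non-ASCII UTF-16
// code unit as \uXXXX (the chosen range is an assumption).
function toAsciiJson(value) {
  return JSON.stringify(value).replace(
    /[\u007f-\uffff]/g,
    (ch) => "\\u" + ch.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

const input = String.raw`\\n`; // three characters: backslash, backslash, n
console.log(JSON.stringify(unescapeSequentially(input))); // "\n"  (a newline)
console.log(JSON.stringify(unescapeJsonString(input)));   // "\\n" (backslash + n)
console.log(toAsciiJson(unescapeJsonString(String.raw`\u20AC`))); // "\u20ac"
```

Because the single-pass version matches the backslash together with the character that follows it, an escaped backslash can never combine with the next character to form a second escape.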
@zaach any chance to look at this please? Is the project dead?
russaa added a commit to mmig/jsonlint-pos that referenced this pull request on Jun 24, 2020
zaach#54 from ntdaley:master with commits: #ecf1830f21634f2b711b4fd840789ec8ddf01649 #aaf81b140f12cfa20ba9411770fa26f665ba6010