Steps to reproduce
For the following code, tsgo and TypeScript generate different token text:
"🦀\ud7ff\ud800\ud801\uD83E\uDD80"
It seems tsgo uses a Go string to store the code points of a JS string:
func (f *NodeFactory) NewStringLiteral(text string) *Node {
However, a JS string is not guaranteed to be well-formed UTF-16 and may contain lone surrogates. Converting it to a Go string replaces each lone surrogate with U+FFFD, which is a lossy conversion that discards the original information.
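The loss can be illustrated with a minimal sketch, assuming the conversion behaves like Go's standard `unicode/utf16` decoding. Whether tsgo takes exactly this path is an assumption, and `decodeUnits` is a hypothetical helper, not part of tsgo.

```go
package main

import (
	"fmt"
	"unicode/utf16"
)

// decodeUnits is a hypothetical helper: it turns UTF-16 code units into a
// Go string via utf16.Decode, which maps every unpaired surrogate to U+FFFD.
func decodeUnits(units []uint16) string {
	return string(utf16.Decode(units))
}

func main() {
	// Code units of the JS string "\ud7ff\ud800\ud801\uD83E\uDD80".
	// 0xD800 and 0xD801 are lone high surrogates; 0xD83E 0xDD80 is a valid pair.
	units := []uint16{0xD7FF, 0xD800, 0xD801, 0xD83E, 0xDD80}

	s := decodeUnits(units)
	for _, r := range s {
		fmt.Printf("U+%04X ", r)
	}
	fmt.Println()

	// Re-encoding does not restore the original units: both lone
	// surrogates have already collapsed into the same replacement character.
	fmt.Printf("%04X\n", utf16.Encode([]rune(s)))
}
```

After decoding, the two distinct lone surrogates can no longer be told apart, so the original token text cannot be reconstructed from the Go string alone.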
Behavior with typescript@5.8
🦀\ud7ff\ud800\ud801\uD83E\uDD80
https://ts-ast-viewer.com/#code/ESPg3AG7A6CuAmDsAzB0YA4AM6UYIzQCKoDMAogYesEA
Behavior with tsgo
https://rslint.rs/playground/?tab=ast&code=%22%F0%9F%A6%80%5Cud7ff%5Cud800%5Cud801%5CuD83E%5CuDD80%22
Referenced code: typescript-go/internal/ast/ast.go, line 5813 at commit 0216862