Point at invalid utf-8 span on user's source code #135557
bors merged 1 commit into rust-lang:master
r? @fee1-dead
rustbot has assigned @fee1-dead.
Some changes occurred in src/tools/compiletest
cc @jieyouxu
    rdr.lines()
        .enumerate()
        // We want to ignore utf-8 failures in tests during collection of annotations.
        .filter(|(_, line)| line.is_ok())
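The filter above relies on `BufRead::lines()` yielding an `Err` for a line that is not valid UTF-8 and then continuing with the next line. A minimal sketch of that behaviour follows; the function name `utf8_lines` is hypothetical and this is not the compiletest code itself:

```rust
use std::io::{BufRead, BufReader, Read};

/// Collects `(line_number, text)` pairs for every line that decodes as UTF-8.
/// `enumerate` runs before the filter, so the surviving lines keep their
/// original 1-based line numbers even when earlier lines were dropped.
fn utf8_lines<R: Read>(reader: R) -> Vec<(usize, String)> {
    BufReader::new(reader)
        .lines()
        .enumerate()
        // Drop lines that failed to decode (or hit another I/O error).
        .filter_map(|(idx, line)| line.ok().map(|text| (idx + 1, text)))
        .collect()
}
```

Enumerating first matters here: annotations are tied to line numbers, so a skipped line must not shift the numbering of the lines after it.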
      let mut expected_revisions = BTreeSet::new();

    - let contents = std::fs::read_to_string(test).unwrap();
    + let Ok(contents) = std::fs::read_to_string(test) else { continue };
...this, and another line in md-doc, blow up if a test has non-utf-8 bytes.
I ended up removing a test of ".rs file with a few invalid utf-8 bytes" because md-docs is in a separate repo. I feel we should make the entire test suite more resilient to these, but I think I can put the test I added and then removed in run-make...
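As a rough illustration of the "skip instead of panic" change discussed here, the sketch below reads a list of files and silently skips any that cannot be read as UTF-8. The function name `readable_tests` is hypothetical, not part of the PR:

```rust
use std::fs;
use std::path::PathBuf;

/// Returns each path whose contents could be read as UTF-8, paired with the text.
fn readable_tests(paths: &[PathBuf]) -> Vec<(PathBuf, String)> {
    let mut out = Vec::new();
    for test in paths {
        // `read_to_string` returns an error on non-UTF-8 input;
        // skip the file instead of unwrapping and panicking.
        let Ok(contents) = fs::read_to_string(test) else { continue };
        out.push((test.clone(), contents));
    }
    out
}
```

Note that `let ... else { continue }` also skips files that fail for other reasons (missing file, permissions), which may or may not be what a test tool wants.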
      let source_file = psess.source_map().load_file(path).unwrap_or_else(|e| {
    -     let msg = format!("couldn't read {}: {}", path.display(), e);
    +     let msg = format!("couldn't read `{}`: {}", path.display(), e);
The error here could be an enum of an io error or a utf8 error that stores the span and message to be reported; that would probably help with deduplicating this code.
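One possible shape for such an enum, as a sketch only: `LoadError`, `load_source`, and `first_invalid_offset` are hypothetical names, and a real rustc version would carry a span rather than a raw byte offset.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::str::Utf8Error;

/// Hypothetical error type describing why a source file could not be loaded.
#[derive(Debug)]
enum LoadError {
    /// The OS could not open or read the file.
    Io(io::Error),
    /// The file was read, but its bytes are not valid UTF-8.
    InvalidUtf8 { bytes: Vec<u8>, error: Utf8Error },
}

impl LoadError {
    /// Byte offset of the first invalid byte, if this is a UTF-8 error.
    fn first_invalid_offset(&self) -> Option<usize> {
        match self {
            LoadError::Io(_) => None,
            LoadError::InvalidUtf8 { error, .. } => Some(error.valid_up_to()),
        }
    }
}

/// Reads the file as bytes first, then attempts the UTF-8 conversion,
/// so the two failure modes end up in different variants.
fn load_source(path: &Path) -> Result<String, LoadError> {
    let bytes = fs::read(path).map_err(LoadError::Io)?;
    String::from_utf8(bytes).map_err(|e| {
        let error = e.utf8_error();
        LoadError::InvalidUtf8 { bytes: e.into_bytes(), error }
    })
}
```

With both cases in one type, each call site can match once and decide how to report, instead of repeating the read-bytes-then-decode fallback.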
    @@ -210,8 +210,34 @@ pub(crate) fn expand_include_str(
         MacEager::expr(cx.expr_str(cx.with_def_site_ctxt(bsp), interned_src))
maybe worth looking into unifying the logic here and the rustc_parse logic somehow?
I moved the logic to a separate function and called it from both places.
```
error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8
  --> $DIR/not-utf8-2.rs:6:5
   |
LL |     include!("not-utf8-bin-file.rs");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: `[193]` is not valid utf-8
  --> $DIR/not-utf8-bin-file.rs:2:14
   |
LL |     let _ = "�|�␂!5�cc␕␂��";
   |              ^
   = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info)
```
When we attempt to load a Rust source code file and there is an OS file failure, we try reading the file as bytes. If that succeeds, we try to turn it into UTF-8. If *that* fails, we provide additional context about *where* the file has its first invalid UTF-8 byte.
Fix rust-lang#76869.
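The fallback described above can be sketched outside the compiler with `std::str::from_utf8` and `Utf8Error::valid_up_to`. This is an illustration under assumptions, not the code in this PR: the names `InvalidUtf8` and `first_invalid_utf8` are hypothetical, and columns are counted in characters.

```rust
/// Location of the first byte that breaks UTF-8 decoding (hypothetical type).
#[derive(Debug, PartialEq)]
struct InvalidUtf8 {
    /// 1-based line number.
    line: usize,
    /// 1-based column, counted in characters.
    column: usize,
    /// The offending byte itself.
    byte: u8,
}

/// Returns where `bytes` first stops being valid UTF-8, or `None` if it is all valid.
fn first_invalid_utf8(bytes: &[u8]) -> Option<InvalidUtf8> {
    let err = std::str::from_utf8(bytes).err()?;
    let offset = err.valid_up_to();
    // Everything before `offset` is guaranteed to be valid UTF-8.
    let valid = std::str::from_utf8(&bytes[..offset]).expect("prefix is valid UTF-8");
    let line = valid.matches('\n').count() + 1;
    let line_start = valid.rfind('\n').map_or(0, |i| i + 1);
    let column = valid[line_start..].chars().count() + 1;
    // An error implies `offset < bytes.len()`, so indexing cannot panic.
    Some(InvalidUtf8 { line, column, byte: bytes[offset] })
}
```

For a file whose second line is `    let _ = "` followed by the byte `0xC1`, this reports line 2, column 14, byte 193, which is the position the `note:` in the output above points at.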
```
error: couldn't read `$DIR/not-utf8-bin-file.rs`: stream did not contain valid UTF-8
  --> $DIR/not-utf8-2.rs:6:5
   |
LL |     include!("not-utf8-bin-file.rs");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |
note: byte `193` is not valid utf-8
  --> $DIR/not-utf8-bin-file.rs:2:14
   |
LL |     let _ = "�|�␂!5�cc␕␂��";
   |              ^
   = note: this error originates in the macro `include` (in Nightly builds, run with -Z macro-backtrace for more info)
```
CC rust-lang#76869, rust-lang#135557.
@bors r=fee1-dead
Rollup of 8 pull requests

Successful merges:

- rust-lang#135557 (Point at invalid utf-8 span on user's source code)
- rust-lang#135596 (Properly note when query stack is being cut off)
- rust-lang#135638 (Make it possible to build GCC on CI)
- rust-lang#135648 (support wasm inline assembly in `naked_asm!`)
- rust-lang#135826 (Misc. `rustc_resolve` cleanups)
- rust-lang#135827 (CI: free disk with in-tree script instead of GitHub Action)
- rust-lang#135850 (Update the `wasm-component-ld` tool)
- rust-lang#135855 (Only assert the `Parser` size on specific arches)

r? `@ghost`
`@rustbot` modify labels: rollup
…iaskrgr

Rollup of 10 pull requests

Successful merges:

- rust-lang#132983 (Edit dangling pointers)
- rust-lang#133154 (Reword resolve errors caused by likely missing crate in dep tree)
- rust-lang#135409 (Fix ICE-133117: multiple never-pattern arm doesn't have false_edge_start_block)
- rust-lang#135557 (Point at invalid utf-8 span on user's source code)
- rust-lang#135596 (Properly note when query stack is being cut off)
- rust-lang#135794 (Detect missing fields with default values and suggest `..`)
- rust-lang#135814 (ci: use ghcr buildkit image)
- rust-lang#135826 (Misc. `rustc_resolve` cleanups)
- rust-lang#135837 (Remove test panic from File::open)
- rust-lang#135856 (Library: Finalize dyn compatibility renaming)

r? `@ghost`
`@rustbot` modify labels: rollup
oops wrong pr
You had me scared for a second there "oh, no! what fresh hell did this utf-8 handling cause during rollup?" XD
…iaskrgr

Rollup of 9 pull requests

Successful merges:

- rust-lang#132983 (Edit dangling pointers)
- rust-lang#135409 (Fix ICE-133117: multiple never-pattern arm doesn't have false_edge_start_block)
- rust-lang#135557 (Point at invalid utf-8 span on user's source code)
- rust-lang#135596 (Properly note when query stack is being cut off)
- rust-lang#135794 (Detect missing fields with default values and suggest `..`)
- rust-lang#135814 (ci: use ghcr buildkit image)
- rust-lang#135826 (Misc. `rustc_resolve` cleanups)
- rust-lang#135837 (Remove test panic from File::open)
- rust-lang#135856 (Library: Finalize dyn compatibility renaming)

r? `@ghost`
`@rustbot` modify labels: rollup