Document C string literal tokens. #1423
Note: this feature is being stabilized in rust-lang/rust#117472 -- CI will fail until run with I ran
ehuss
left a comment
Thanks!
Can you also include a section that indicates that C-string literals are only available in Edition 2021 or newer? Edition differences are specified in blockquotes (search for "Edition Differences" for the format).
I believe the Reserved prefixes section needs to be updated with `c` and `cr` being excluded.
I believe Literal patterns will need to be updated since C-strings are accepted there syntactically. (They can't really be used since `CStr` doesn't implement `Eq`/`PartialEq`, though.)
5d19507 to 2481014
Thank you for the quick review! I've made the recommended changes, fixed the ASCII vs Unicode misunderstanding, and tried to clarify the wording around
A _C string literal_ is a sequence of Unicode characters and _escapes_,
preceded by the characters `U+0063` (`c`) and `U+0022` (double-quote), and
followed by the character `U+0022`. If the character `U+0022` is present within
the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a C string literal can be a _raw C string literal_, defined
below. The type of a C string literal is [`&core::ffi::CStr`][CStr].
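As a small illustration of the definition quoted above, the following sketch shows a C string literal containing an escaped double-quote. The function name `quoted` is made up for this example, and it assumes edition 2021 or newer on a toolchain where the feature is stable.

```rust
use core::ffi::CStr;

/// Hypothetical helper: returns a C string literal whose body contains
/// a double-quote, which must be escaped with a backslash.
fn quoted() -> &'static CStr {
    c"say \"hi\""
}

fn main() {
    // The literal has type `&CStr`.
    let s: &CStr = quoted();
    // `to_bytes` returns the contents without the trailing NUL.
    assert_eq!(s.to_bytes(), b"say \"hi\"");
    println!("{:?}", s);
}
```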
Whilst below it is mentioned that code point escapes are encoded as UTF-8, nowhere is it stated how the Unicode characters contained within the C string literal are encoded in the ensuing CStr: I presume also UTF-8? Perhaps this should be stated explicitly for the avoidance of any doubt.
Done -- I added a section about encoding after the escapes.
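The encoding question raised above can be checked with a short sketch. It assumes that characters and Unicode escapes in a C string literal are encoded as UTF-8, as this thread discusses, and that byte escapes are stored as-is. The helper name `encodings` is invented for this example.

```rust
use core::ffi::CStr;

/// Hypothetical helper: three spellings of the same character,
/// written literally, as a Unicode escape, and as two byte escapes.
fn encodings() -> [&'static CStr; 3] {
    [c"æ", c"\u{00E6}", c"\xC3\xA6"]
}

fn main() {
    // U+00E6 is encoded in UTF-8 as the two bytes 0xC3 0xA6.
    let expected: &[u8] = &[0xC3, 0xA6];
    for s in encodings() {
        assert_eq!(s.to_bytes(), expected);
    }
}
```

If the assumption holds, all three literals produce the same `CStr` contents, so mixing Unicode characters and byte escapes in one literal is a matter of spelling rather than of encoding.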
2481014 to c86fa19
…ilstrieb

Stabilize C string literals

RFC: https://rust-lang.github.io/rfcs/3348-c-str-literal.html

Tracking issue: rust-lang#105723

Documentation PR (reference manual): rust-lang/reference#1423

# Stabilization report

Stabilizes C string and raw C string literals (`c"..."` and `cr#"..."#`), which are expressions of type [`&CStr`](https://doc.rust-lang.org/stable/core/ffi/struct.CStr.html). Both new literals require Rust edition 2021 or later.

```rust
const HELLO: &core::ffi::CStr = c"Hello, world!";
```

C strings may contain any byte other than `NUL` (`b'\x00'`), and their in-memory representation is guaranteed to end with `NUL`.

## Implementation

Originally implemented by PR rust-lang#108801, which was reverted due to unintentional changes to lexer behavior in Rust editions < 2021.

The current implementation landed in PR rust-lang#113476, which restricts C string literals to Rust edition >= 2021.

## Resolutions to open questions from the RFC

* Adding C character literals (`c'.'`) of type `c_char` is not part of this feature.
  * Support for `c"..."` literals does not prevent `c'.'` literals from being added in the future.
* C string literals should not be blocked on making `&CStr` a thin pointer.
  * It's possible to declare constant expressions of type `&'static CStr` in stable Rust (as of v1.59), so C string literals are not adding additional coupling on the internal representation of `CStr`.
* The unstable `concat_bytes!` macro should not accept `c"..."` literals.
  * C strings have two equally valid `&[u8]` representations (with or without terminal `NUL`), so allowing them to be used in `concat_bytes!` would be ambiguous.
* Adding a type to represent C strings containing valid UTF-8 is not part of this feature.
  * Support for a hypothetical `&Utf8CStr` may be explored in the future, should such a type be added to Rust.
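To make the NUL-termination guarantee and the "two `&[u8]` representations" point from the report concrete, here is a minimal sketch. The constant `RAW` and the function `byte_views` are invented for this illustration, and it assumes edition 2021 or newer on a toolchain where the feature is stable.

```rust
use core::ffi::CStr;

// Same constant as in the stabilization report.
const HELLO: &CStr = c"Hello, world!";
// Hypothetical raw C string literal: the backslash is not an escape here.
const RAW: &CStr = cr#"a\n"#;

/// Hypothetical helper: returns both byte views of a C string,
/// first without and then with the terminating NUL.
fn byte_views(s: &CStr) -> (&[u8], &[u8]) {
    (s.to_bytes(), s.to_bytes_with_nul())
}

fn main() {
    let (without_nul, with_nul) = byte_views(HELLO);
    assert_eq!(without_nul, b"Hello, world!");
    assert_eq!(with_nul, b"Hello, world!\0");
    // In the raw literal, `\n` stays as two bytes: a backslash and `n`.
    assert_eq!(RAW.to_bytes(), b"a\\n");
}
```

Both slices are equally reasonable answers to "what are the bytes of this C string", which is the ambiguity the report cites as the reason for keeping `c"..."` out of `concat_bytes!`.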
c86fa19 to ae1eb71
The stabilization PR has merged and this PR's CI build is now green.
Update books

## rust-lang/nomicon

1 commits in 1842257814919fa62e81bdecd5e8f95be2839dbb..83d015105e6d490fc30d6c95da1e56152a50e228
2023-11-22 15:35:31 UTC to 2023-11-22 15:35:31 UTC

- Reword the section on general race conditions (rust-lang/nomicon#431)

## rust-lang/reference

5 commits in cd8193e972f61b92117095fc73b67af767b4d6bc..692d216f5a1151e8852ddb308ba64040e634c876
2023-12-04 09:45:06 UTC to 2023-11-21 17:57:18 UTC

- Fix note on `self` coercion (rust-lang/reference#1431)
- Document C string literal tokens. (rust-lang/reference#1423)
- type-layout.md: Warn about repr(align)/repr(packed) and field order (rust-lang/reference#1430)
- Lone `self` in a method body resolves to the self parameter (rust-lang/reference#1427)
- Reference wildcard patterns from underscore expr (rust-lang/reference#1428)

## rust-lang/rust-by-example

4 commits in a6581246f96837113968c02187db24f742af3908..da0a06aada31a324ae84a9eaee344f6a944b9683
2023-11-27 12:50:49 UTC to 2023-11-21 11:58:19 UTC

- fix tiny typo in string conversion docs (rust-lang/rust-by-example#1776)
- fix(arg): Remove reference to Rust Cookbook in arg parsing (rust-lang/rust-by-example#1775)
- fix:typo error (rust-lang/rust-by-example#1774)
- Remove space between `&` and `self` (rust-lang/rust-by-example#1772)

## rust-lang/rustc-dev-guide

5 commits in ddb8b13..904bb5a
2023-11-28 13:13:36 UTC to 2023-11-22 06:13:00 UTC

- Update how-to-build-and-run.md (rust-lang/rustc-dev-guide#1828)
- notification groups: add information about how to ping them (rust-lang/rustc-dev-guide#1818)
- Add explanations on how to run rustc_codegen_gcc tests (rust-lang/rustc-dev-guide#1821)
- Add back the `canonicalization` chapter. (rust-lang/rustc-dev-guide#1532)
- Emphasize that the experts map is not up to date (rust-lang/rustc-dev-guide#1826)
Rollup merge of rust-lang#118614 - rustbot:docs-update, r=ehuss

Update books
This reverts commit 21a27e1, reversing changes made to 01a12f2. This is being reverted in rust-lang/rust#119528