-
Notifications
You must be signed in to change notification settings - Fork 1.7k
repr(ordered_fields) #3845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
repr(ordered_fields) #3845
Conversation
|
Not to add too many extra colours to the list, but (Note: those three things should cover every case I've seen that uses Whereas Also, while it may be more technical than most users need to understand, it would be helpful if the RFC reiterated the current issues with |
|
Just as a small point of style the Guide Level Explanation is usually "what would be written in the rust tutorial book", and the Reference Level Explanation is "what could be written into the Rust Reference". This isn't a strict requirement, but personally I'd like to see the Reference Level part written out. Using the present tense, as if the RFC was accepted and implemented. |
add more unresolved questions
Rework struct layout description.
text/3845-repr-ordered-fields.md
Outdated
| # Guide-level explanation | ||
| [guide-level-explanation]: #guide-level-explanation | ||
|
|
||
| `repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| `repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in memory layout. | |
| `repr(ordered_fields)` is a new representation that can be applied to `struct`, `enum`, and `union` to give them a consistent, cross-platform, and predictable in-memory layout. |
"cross-platform" -- the layout will differ when there are different layouts for struct members' types, in particular primitive types can have different alignments which changes the amount of padding.
e.g., #[repr(ordered_fields)] struct S(u8, f64); doesn't have the same layout on x86_64 and i686
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, this will need to be documented as a hazard in the ordered_fields docs. However, the repr itself will be cross-platform. For example, #[repr(ordered_fields)] struct Cross([u8; 3], SomeEnum); will be truly cross-platform (given that SomeEnum is!).
Co-authored-by: Jacob Lifshay <programmerjake@gmail.com>
Just voicing support for |
|
Nominating this so that we can do a preliminary vibe-check on it in a lang triage meeting. |
text/3845-repr-ordered-fields.md
Outdated
| Currently `repr(C)` serves two roles | ||
| 1. Provide a consistent, cross-platform, predictable layout for a given type | ||
| 2. Match the target C compiler's struct/union layout algorithm and ABI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Big fan of doing this split, especially for structs. (It's less obvious what choices to make for other things, IMHO, but at least for structs this is something I've wanted for ages, so that for example Layout::extend can talk about it instead of C.)
Pondering the bikeshed: declaration_order or something could also be used to directly say what you're getting.
(This could be contrasted with other potential reprs that I wouldn't expect this RFC to add, but could consider as future work, like a deterministic_by_size_and_alignment where some restricted set of optimizations are allowed but you can be sure that usize and NonNull<String> can be mixed between different types while still getting the "same" field offsets, for example.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is also useful for unions, so we don't need to rely on repr(C) to ensure that all fields of a union are at offset 0.
This could be contrasted with other potential reprs that I wouldn't expect this RFC to add...
This also works as an argument against names like repr(consistent), since there are multiple consistent and useful repr, making it not descriptive enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think that declaration_order or ordered_fields is a bit weird on a union, because of course they're not really in any "order".
It makes me ponder whether we should just have repr(offset_zero) for unions to be explicit about it, or something.
(Which makes me think of other things like addressing rust-lang/unsafe-code-guidelines#494 by having a different constructs for "bag of maybeuninit stuff that overlap" vs "distinct options with an active-variant rule for enum-but-without-stored-discriminant". But those are definitely not this RFC.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind spelling it as repr(offset_zero) for unions if that helps get this RFC accepted 😄. However, I have a sneaking suspicion that this isn't the contentious part of this RFC.
I know the name isn't optimal (intentionally). This can be hashed out after the RFC is accepted (or even give a different name for all of struct, union, and enum).
The most important bit for me is just that we do the split (for all of struct, union, and enum, to be consistent).
|
I've updated how enums's tags are specified, now they just defer to whatever |
text/3845-repr-ordered-fields.md
Outdated
|
|
||
| `repr(C)` in edition <= 2024 is an alias for `repr(ordered_fields)` and in all other editions, it matches the default C compiler for the given target for structs, unions, and field-less enums. Enums with fields will be laid out as if they are a union of structs with the corresponding fields. | ||
|
|
||
| Using `repr(C)` in editions <= 2024 triggers a lint to use `repr(ordered_fields)` as a future compatibility lint with a machine-applicable fix. If you are using `repr(C)` for FFI, then you may silence this lint. If you are using `repr(C)` for anything else, please switch over to `repr(ordered_fields)` so updating to future editions doesn't change the meaning of your code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is too noisy. Most code out there using repr(C) is probably fine - IIUC, if you're not targeting Windows or AIX, maybe definitely fine? - and having a bunch of allow(...) across a bunch of projects seems unfortunate.
Maybe we can either (a) only enable the lint for migration, i.e., the next edition's cargo fix would add allows for you or (b) we find some new name... C2 for the existing repr(C) usage to avoid allows. But (b) also seems too noisy to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it could just be an optional edition compatibility lint, so if someone enables e.g. rust_20xx_compatibility it shows up but otherwise not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(a) only enable the lint for migration
That was the intention, hence the name edition_2024_repr_c. I'll make this more clear, that this is intended to be a migration lint.
Rustfix would update to #[repr(ordered_fields)] to preserve the current behavior. For the FFI crates, #![allow(edition_2024_repr_c)] at the top of lib.rs would suffice. If you have a mix of FFI and non-FFI uses of repr(C), then you'll have to do the work to figure out which is which, no matter what option is chosen to update repr(C) - even adding repr(C2), since then the FFI use case would need to update all their reprs to repr(C2).
Overall, I think this scheme only significantly burdens those who have a mix of FFI and non-FFI uses of repr(C). But they were going to be burdened no matter what option was chosen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the new wording/lints too noisy still?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the new wording is still too noisy. We shouldn't assume that most people using repr(C) are using it for ordering rather than FFI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wasn't my intention, but I don't see another way to do all of the following in the next edition:
- make
repr(C)mean - same layout/ABI as what the standard C compiler does - make
repr(ordered_fields)- the same algorithm that's listed forrepr(C)in the Rust reference - ensure that everyone who upgrades to the next edition gets the layout they need (as long as they read the warnings and follow the given advice)
- make it as painless as possible for people who don't mix FFI and stable ordering cases (which I suspect is the vast majority of people). In other words, each crate currently uses
repr(C)either exclusively for FFI or exclusively for some stable layout. - for people who do mix FFI and stable ordering cases in one crate, at least the warning should give them all the places they need to double-check, and they can silence the warning on a case-by-case basis.
I'm open to suggestions on how to handle the diagnostics. Within these constraints, I think my solution is the only real option we have. If there are some objections to these constraints, I would like to hear those too, maybe I missed the mark with these constraints, and missed a potential solution because of it.
Co-authored-by: Ralf Jung <post@ralfj.de>
This post and its neighbors discuss some of that. |
…enkov repr(transparent): do not consider repr(C) types to be 1-ZST Context: rust-lang/unsafe-code-guidelines#552 This experiments with a [suggestion](rust-lang/rfcs#3845 (comment)) by `@RustyYato` to stop considering repr(C) types as 1-ZST for the purpose of repr(transparent). If we go with rust-lang/rfcs#3845 (or another approach for fixing repr(C)), they will anyway not be ZST on all targets any more, so this removes a portability hazard. Furthermore, zero-sized repr(C) structs [may have to be treated](rust-lang/unsafe-code-guidelines#552 (comment)) as non-ZST for the win64 ABI (at least that's what gcc/clang do), so allowing them to be ignored in repr(transparent) types is not entirely coherent. Turns out we already have an FCW for repr(transparent), namely rust-lang#78586. This extends that lint to also check for repr(C).
…enkov repr(transparent): do not consider repr(C) types to be 1-ZST Context: rust-lang/unsafe-code-guidelines#552 This experiments with a [suggestion](rust-lang/rfcs#3845 (comment)) by ``@RustyYato`` to stop considering repr(C) types as 1-ZST for the purpose of repr(transparent). If we go with rust-lang/rfcs#3845 (or another approach for fixing repr(C)), they will anyway not be ZST on all targets any more, so this removes a portability hazard. Furthermore, zero-sized repr(C) structs [may have to be treated](rust-lang/unsafe-code-guidelines#552 (comment)) as non-ZST for the win64 ABI (at least that's what gcc/clang do), so allowing them to be ignored in repr(transparent) types is not entirely coherent. Turns out we already have an FCW for repr(transparent), namely rust-lang#78586. This extends that lint to also check for repr(C).
…enkov repr(transparent): do not consider repr(C) types to be 1-ZST Context: rust-lang/unsafe-code-guidelines#552 This experiments with a [suggestion](rust-lang/rfcs#3845 (comment)) by ```@RustyYato``` to stop considering repr(C) types as 1-ZST for the purpose of repr(transparent). If we go with rust-lang/rfcs#3845 (or another approach for fixing repr(C)), they will anyway not be ZST on all targets any more, so this removes a portability hazard. Furthermore, zero-sized repr(C) structs [may have to be treated](rust-lang/unsafe-code-guidelines#552 (comment)) as non-ZST for the win64 ABI (at least that's what gcc/clang do), so allowing them to be ignored in repr(transparent) types is not entirely coherent. Turns out we already have an FCW for repr(transparent), namely rust-lang#78586. This extends that lint to also check for repr(C).
Rollup merge of #147185 - RalfJung:repr-c-not-zst, r=petrochenkov repr(transparent): do not consider repr(C) types to be 1-ZST Context: rust-lang/unsafe-code-guidelines#552 This experiments with a [suggestion](rust-lang/rfcs#3845 (comment)) by ```@RustyYato``` to stop considering repr(C) types as 1-ZST for the purpose of repr(transparent). If we go with rust-lang/rfcs#3845 (or another approach for fixing repr(C)), they will anyway not be ZST on all targets any more, so this removes a portability hazard. Furthermore, zero-sized repr(C) structs [may have to be treated](rust-lang/unsafe-code-guidelines#552 (comment)) as non-ZST for the win64 ABI (at least that's what gcc/clang do), so allowing them to be ignored in repr(transparent) types is not entirely coherent. Turns out we already have an FCW for repr(transparent), namely #78586. This extends that lint to also check for repr(C).
repr(transparent): do not consider repr(C) types to be 1-ZST Context: rust-lang/unsafe-code-guidelines#552 This experiments with a [suggestion](rust-lang/rfcs#3845 (comment)) by ```@RustyYato``` to stop considering repr(C) types as 1-ZST for the purpose of repr(transparent). If we go with rust-lang/rfcs#3845 (or another approach for fixing repr(C)), they will anyway not be ZST on all targets any more, so this removes a portability hazard. Furthermore, zero-sized repr(C) structs [may have to be treated](rust-lang/unsafe-code-guidelines#552 (comment)) as non-ZST for the win64 ABI (at least that's what gcc/clang do), so allowing them to be ignored in repr(transparent) types is not entirely coherent. Turns out we already have an FCW for repr(transparent), namely rust-lang/rust#78586. This extends that lint to also check for repr(C).
|
@RustyYato So, an chance you could incorporate that kind of motivation into the RFC? Also let us know if you'd like help with editing the RFC. |
clarify that we are describing the behavior of repr(ordered_fields) clarify differences with `repr(iN)`
teor2345
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some typo fixes, and an idea for a repr name
Co-authored-by: teor <teor@riseup.net> Co-authored-by: Jacob Lifshay <programmerjake@gmail.com>
Co-authored-by: Jules Bertholet <jules.bertholet@gmail.com>
| * This is only necessary for those who are updating to the new edition. Which is as little churn as we can make it | ||
| * If we don't end up switching `repr(C)` to mean the system layout/ABI, then we will have two identical reprs, which may cause confusion. | ||
|
|
||
| # Rationale and alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issues that this fixes are with ZSTs over FFI (basically never done), an AIX bug (tier 3 target), and might help with an x86_32 MSVC bug. That seems... Kinda weak IMO, compared to like a million #[repr(C)]s out there that currently work.
Maybe instead of planning to change #[repr(C)], an alternative could be to introduce both #[repr(ordered_fields)] and #[repr(bikeshed_broken_but_actual_C_struct_repr)], and have #[repr(C)] guaranteed to be an alias of #[repr(ordered_fields)]? And then we deprecate #[repr(C)] once those two are in everyone's MSRV and such.
This would avoid the hard issues of "when do we break this" and "oops, things act differently in edition 2027". Having previously fine unsafe code become unsound over an edition is dangerous territory.
Building on top of that, another alternative would be to not deprecate at all, and instead leave #[repr(C)] as an alias of #[repr(ordered_fields)]. (#[repr(C)] is what the C standard says, it's just certain compilers that don't really follow the standard).
And then we instead introduce more specific checks like "type is #[repr(C)] + type is ZST or has 8-byte fields + type is used in FFI" -> "warn and suggest #[repr(bikeshed_broken_but_actual_C_struct_repr)]".
This would allow #[repr(C)] struct Point { x: c_int, y: c_int } and such to continue working just fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the problem of aligned structs inside packed structs.
(#[repr(C)] is what the C standard says, it's just certain compilers that don't really follow the standard).
I don't think this is correct. The C standard says nothing about struct layout. Rust just made incorrect assumptions. We should fix that mistake instead of doubling down on it.
And then we instead introduce more specific checks like "type is #[repr(C)] + type is ZST or has 8-byte fields + type is used in FFI" -> "warn and suggest #[repr(bikeshed_broken_but_actual_C_struct_repr)]".
Lints won't work when generics are involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would avoid the hard issues of "when do we break this" and "oops, things act differently in edition 2027". Having previously fine unsafe code become unsound over an edition is dangerous territory.
So should edition migration change repr(C) in 2024 to repr(ordered_fields) in 202x to ensure the meaning is preserved?
Note that for the same reason that most repr(C) out there work, also most unsafe code working with repr(C) won't become unsound.
Or maybe yet another option: we do introduce repr(psABI) for "the (C) layout defined by the psABI of the current target". And then for all repr(C) we check whether for this type, repr(ordered_fields) and repr(psABI) agree for all targets (or just for the current target?). If yes, all is good, if not
- on old editions: warn and continue with repr(ordered_fields)
- on new editions: error
Note that pretty much every repr(C) struct with a generic field would trigger the warning/error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also the problem of aligned structs inside packed structs.
Sorry, didn't fully grok that issue, so I assigned it a lower priority in my head. I'm not certain here, but I don't think that's incompatible with the alternatives I proposed?
I don't think this is correct. The C standard says nothing about struct layout. Rust just made incorrect assumptions. We should fix that mistake instead of doubling down on it.
Perhaps. In any case, I'm not convinced that there isn't a bunch of C code that is broken by this as well because they make the same assumption Rust does, and thus the argument that these are bugs in the C compilers instead.
Lints won't work when generics are involved.
The lint would probably have to happen on the actual usage in extern "C" fns (either declarations or definitions) a la improper_ctypes*, and generics are fairly rare in those places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
, I'm not convinced that there isn't a bunch of C code that is broken by this as well because they make the same assumption Rust does, and thus the argument that these are bugs in the C compilers instead.
IMO it's fairly clear that such code is buggy or at least non-portable C code. I see no basis for arguing that these C compilers are wrong. They are just strange.
The lint would probably have to happen on the actual usage in extern "C" fns (either declarations or definitions) a la improper_ctypes*, and generics are fairly rare in those places.
That's best-effort since not all types that cross the ABI boundary might be visible there (passing around void* and casting later is a common idiom after all).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe yet another option: we do introduce
repr(psABI)[...]
I like this idea a lot more than the current RFC.
Also unsure how strict such a check should be, maybe:
- If affected platform:
- on old editions: warn and continue with repr(ordered_fields)
- on new editions: error
- If not affected platform:
- on old editions: continue with repr(ordered_fields)
- on new editions: warn and continue with repr(ordered_fields)
That'd probably make the change more gradual.
IMO it's fairly clear that such code is buggy or at least non-portable C code. I see no basis for arguing that these C compilers are wrong. They are just strange.
I'll trust you to know better on that subject, I haven't actually read the C standard or anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo repr(psABI) is a terrible name, since that's way more obscure than C, I wouldn't expect even decent C and Rust programmers to know what it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only have two options:
- Change what
repr(C)means. Can be done over an edition, but will still break copying any old material. - Consider
repr(C)"burnt" at least for all cases where the C layout is different fromrepr(ordered_fields), and come up with a different name, which will be terrible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I saw someone propose repr(system) somewhere, but I don’t remember who exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was meant to have a slightly different meaning, just like extern "systems" (which already exists) is not the same as extern "C".
| * match layouts of two different types (or even, two different monomorphizations of the same generic type) | ||
| * see [here](https://github.com/rust-lang/rust/pull/68099), where in `alloc` this is done for `Rc` and `Arc` to give a consistent layout for all `T` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This last kind of use isn't actually broken by "just fixing repr(C)", right? Layout is still just a function of the field types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it's not broken on it's own. But changing the layout of these types is a little suspicious, since usually if you want them to have a consistent layout, then you are also expoiting that fact somewhere else.
edit motivation section
Add
repr(ordered_fields)and provide a migration path to switch users fromrepr(C)torepr(ordered_fields), then change the meaning ofrepr(C)in the next edition.This RFC is meant to be an MVP, and any extensions (for example, adding more
reprs) are not in scope. This is done to make it as easy as possible to accept this RFC and make progress on the issue ofrepr(C)serving two opposing roles.Rendered
To avoid endless bikeshedding, I'll make a poll if this RFC is accepted with all the potential names for the new
repr. If you have a new name, I'll add it to the list of names in the unresolved questions section, and will include it in the poll.