Fix variable deallocation order in panic unwinding paths #149435
sladyn98 wants to merge 6 commits into rust-lang:main
r? @wesleywiser
rustbot has assigned @wesleywiser.
r? @dianne
It looks like there's a bug causing an assertion failure when building the standard library. I've given it a look and offered a guess at what's causing it below. There's still work to do here beyond fixing that, though.
First, could you add a ui test to demonstrate that this fixes #147875? It looks like it might not yet, since the code for scheduling unwind drops on calls panicking looks unchanged.
Second, after verifying that this results in the correct borrow-checking behavior, we need to make sure that this change doesn't negatively affect codegen. Per the old comment on needs_cleanup, at least at the time it was written, LLVM didn't handle the unnecessary cleanup blocks and StorageDeads particularly well. If you can demonstrate with codegen tests that that's not an issue anymore, and perf isn't too bad, that might be all that's needed. But my expectation is that we'll have to get rid of or ignore the StorageDeads later in compilation (sometime after they serve their purpose in borrowck). Unless there's a reason to keep the StorageDeads around longer, my gut feeling is that this cleanup would be best as a post-borrowck MIR pass (maybe as part of CleanupPostBorrowck?), since then optimization passes can be done on cleaner MIR and we can test it works with MIR tests rather than codegen tests. Could you also add a test for this not affecting later stages of compilation? If you accomplish that by removing the unwind-path StorageDeads as part of a MIR pass, that'd be a mir-opt test.
Before you push again, you'll probably want to run the codegen and mir-opt tests to make sure the former is clean and to bless the latter. Regardless of what approach we take here, if we're changing how the MIR is built, there should be differences in the MIR building test output (part of the mir-opt suite).
      fn needs_cleanup(&self) -> bool {
    -     self.drops.iter().any(|drop| match drop.kind {
    -         DropKind::Value | DropKind::ForLint => true,
    -         DropKind::Storage => false,
    -     })
    +     !self.drops.is_empty()
Could you please explain this change? My understanding at least is that the ordering of StorageDeads only matters relative to actual drops, since that's when it can affect borrow-checking; reordering StorageDeads amongst each other won't do anything, but a StorageDead before a drop terminator will cause a borrow-checking failure if the destructor could reference the dead memory. As such, I'd expect we'd only need a cleanup block when there's actual drops (which is what the old version of this was checking for). Is there an edge case where we'd need a cleanup block with only StorageDeads in it? Otherwise, could you reinstate the comment about avoiding creating landing pads when there's no actual destructors?
That said, from what I can tell this method is only used for determining whether cleanup blocks are required for unwinding from panics in destructors, so could you make sure there's a MIR test checking that we don't create unnecessary cleanup blocks in other cases too, particularly for calls?
Also, if the comment about LLVM is still true and we need to get rid of the StorageDeads before codegen, we should probably keep some updated version of those comments around.
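To make the difference between the two versions of the check concrete, here is a minimal sketch. `Scope` is a simplified, hypothetical stand-in that stores only the kind of each scheduled drop; it is not the compiler's actual type. Only the `DropKind` variants and the two method bodies are taken from the diff.

```rust
/// Drop kinds as named in the diff.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DropKind {
    Value,
    Storage,
    ForLint,
}

/// Hypothetical, pared-down scope: just the kinds of the scheduled drops.
struct Scope {
    drops: Vec<DropKind>,
}

impl Scope {
    /// Old check: a cleanup block is needed only if some drop may run a
    /// destructor (or is tracked for the lint).
    fn needs_cleanup_old(&self) -> bool {
        self.drops.iter().any(|kind| match kind {
            DropKind::Value | DropKind::ForLint => true,
            DropKind::Storage => false,
        })
    }

    /// New check from the PR: any scheduled drop at all, including a bare
    /// `StorageDead`, requests a cleanup block.
    fn needs_cleanup_new(&self) -> bool {
        !self.drops.is_empty()
    }
}
```

The two versions disagree only for a scope whose drops are all `DropKind::Storage`, which is the case the question above is asking about.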
Reminder, once the PR becomes ready for a review, use
Also, could you change the PR description? #147875 on its own doesn't allow destructors to access freed memory, it doesn't allow for the creation of dangling references, and I'm at least not aware of a safety guarantee that it violates. You should only get unsoundness out of it if you write unsafe code on the assumption that the borrow checker will enforce the relative drop order of locals that may have destructors and those that definitely don't. Of course, per language team decision, consistent drop order is a promise Rust would like to make. But it's not quite the same as the borrow-checker failing to ensure places outlive their references.
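For reference, the part of drop order that is observable from safe code can be checked with a small program. This is only a sketch: `Noisy` and `drop_order_on_unwind` are hypothetical names, and it assumes the default `panic=unwind` strategy. It shows that locals with destructors are dropped in reverse declaration order while unwinding; it cannot observe where storage for locals without destructors ends, which is the part this PR is about.

```rust
use std::cell::RefCell;
use std::panic::{self, AssertUnwindSafe};

/// Records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Panics after declaring two locals and returns the order in which their
/// destructors ran during unwinding.
fn drop_order_on_unwind() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    // The explicit result type pins the closure's return type to `()`.
    let result: std::thread::Result<()> = panic::catch_unwind(AssertUnwindSafe(|| {
        let _first = Noisy { name: "first", log: &log };
        let _second = Noisy { name: "second", log: &log };
        panic!("unwind through both locals");
    }));
    assert!(result.is_err());
    log.into_inner()
}
```

Calling `drop_order_on_unwind()` yields `["second", "first"]`. The panic message is still printed to stderr by the default panic hook.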
So what I did was write a simple Rust program, panic drop.rs, compile it to LLVM IR, and inspect the result. I could not find any llvm.lifetime.end calls in the IR, which suggests that on master the StorageDead statements are missing. As I understand it, that means the borrow checker does not know when the storage becomes invalid. Let me now write the UI test to see what is going on.
edit: adjusted wording
5afe7c2 to 59a7e56
This still needs CI to pass before I can review it properly. I've left a few comments on obvious things, but I don't think reviewing the code changes would be helpful at this point. Please test your changes locally. You don't have to run the whole test suite yourself, but for this change, you'll at least want to make sure that the mir-opt and codegen tests all pass, that any relevant ui tests pass, and that tidy passes as well.
Could you rebase onto a more recent commit, also? I don't expect there will be conflicts in the MIR building part of this, but I'm not sure about the rest.
I don't mean to be harsh, but this is a relatively complex and nuanced change. If you're not familiar with what's being changed, why it's being changed, the consequences/needs of that, and general contribution procedure, I'd recommend gaining familiarity with easier issues instead.
59a7e56 to 44fbdb3
Some changes occurred to MIR optimizations
cc @rust-lang/wg-mir-opt
8df38cd to 0f688eb
I think I have a good grasp of the problem and how we want to solve it. I started by adding StorageDead to unwind paths in both […]. I also needed to make sure StorageDead gets removed after borrow-checking so it doesn't affect codegen, so I then modified […]. I also had to figure out the right place to remove StorageDead; you suggested a post-borrowck MIR pass, so I added it to […].
Now StorageDead is emitted on unwind paths for all functions (not just coroutines), which makes the borrow checker stricter and more consistent. The borrow checker now treats variables as dead at the same point on all paths, which is exactly what #147875 needed. StorageDead is also removed from cleanup blocks after borrow-checking, so it doesn't affect codegen.
Everything else is implemented and tested. The main question is whether the comments need more precision about where StorageDead gets removed.
1363caa to 92d28c2
rust-analyzer is developed in its own repository. If possible, consider making this change to rust-lang/rust-analyzer instead.
cc @rust-lang/rust-analyzer
This commit fixes several issues related to StorageDead and ForLint drops:

1. Add StorageDead and ForLint drops to unwind_drops for all functions
   - Updated diverge_cleanup_target to include StorageDead and ForLint drops
     in the unwind_drops tree for all functions (not just coroutines), but only
     when there's a cleanup path (i.e., when there are Value or ForLint drops)
   - This ensures proper drop ordering for borrow-checking on panic paths

2. Fix break_for_tail_call to handle StorageDead and ForLint drops
   - Don't skip StorageDead drops for non-drop types
   - Adjust unwind_to pointer for StorageDead and ForLint drops, matching
     the behavior in build_scope_drops
   - Only adjust unwind_to when it's valid (not DropIdx::MAX)
   - This prevents debug assert failures when processing drops in tail calls

3. Fix index out of bounds panic when unwind_to is DropIdx::MAX
   - Added checks to ensure unwind_to != DropIdx::MAX before accessing
     unwind_drops.drop_nodes[unwind_to]
   - Only emit StorageDead on unwind paths when there's actually an unwind path
   - Only add entry points to unwind_drops when unwind_to is valid
   - This prevents panics when there's no cleanup needed

4. Add test for explicit tail calls with StorageDead drops
   - Tests that tail calls work correctly when StorageDead and ForLint drops
     are present in the unwind path
   - Verifies that unwind_to is correctly adjusted for all drop kinds

These changes make the borrow-checker stricter and more consistent by ensuring
that StorageDead statements are emitted on unwind paths for all functions when
there's a cleanup path, allowing unsafe code to rely on drop order being enforced
consistently.
- Add StorageDead to unwind paths for all functions (not just coroutines)
- Modify CleanupPostBorrowck to remove StorageDead from cleanup blocks
- Add tests for the fix and StorageDead removal
When processing drops in reverse order, unwind_to might not point to the current drop. Only adjust unwind_to when the drop matches what unwind_to is pointing to, rather than asserting they must match.
e94d041 to 67a9bdb
This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed. Rebasing is a normal part of keeping PRs up to date, so no action is needed; this note is just to help reviewers.
    //! - [`StorageDead`] statements (these are only needed for borrow-checking and are removed
    //! after borrowck completes to ensure they don't affect later optimization passes or codegen)
This isn't true in general. Being more specific about which StorageDeads are removed may be helpful.
    // Only remove StorageDead from cleanup blocks (unwind paths).
    // StorageDead on normal paths is still needed for MIR validation
    // and will be removed later by RemoveStorageMarkers during optimization.
I don't think this is true. RemoveStorageMarkers removes storage markers specifically when we don't want them to be emitted by codegen. It's only enabled when sess.mir_opt_level() > 0 && !sess.emit_lifetime_markers(). Storage markers can be used by codegen backends for their analyses as well. If we weren't emitting them during codegen ever, we wouldn't need to bother removing them here; we could just wait until RemoveStorageMarkers.
    // Only remove StorageDead from cleanup blocks (unwind paths).
    // StorageDead on normal paths is still needed for MIR validation
    // and will be removed later by RemoveStorageMarkers during optimization.
    let is_cleanup = basic_block.is_cleanup;
Nit: is there a reason this needs to be assigned to a variable outside of the loop?
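As a rough illustration of the behavior under discussion, here is a self-contained sketch of a pass that strips `StorageDead` from cleanup blocks only and reads `is_cleanup` directly in the loop. `Statement` and `BasicBlock` are toy types invented for this example, not the `rustc_middle` MIR types, and the real pass may well replace statements with no-ops instead of removing them.

```rust
/// Toy statement type; the payload is a local's index.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Statement {
    StorageLive(usize),
    StorageDead(usize),
    Assign(usize),
}

/// Toy basic block carrying only what this sketch needs.
#[derive(Debug, Clone, PartialEq, Eq)]
struct BasicBlock {
    statements: Vec<Statement>,
    is_cleanup: bool,
}

/// Removes `StorageDead` statements from cleanup (unwind-path) blocks and
/// leaves every other block untouched.
fn strip_unwind_storage_dead(blocks: &mut [BasicBlock]) {
    for block in blocks.iter_mut().filter(|b| b.is_cleanup) {
        block
            .statements
            .retain(|s| !matches!(s, Statement::StorageDead(_)));
    }
}
```

Filtering on `is_cleanup` up front means no per-block flag has to be carried into the statement loop.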
    // These StorageDead statements are removed by the `RemoveStorageMarkers` MIR
    // transform pass before codegen, so they don't affect LLVM output.
Suggested change:
    - // These StorageDead statements are removed by the `RemoveStorageMarkers` MIR
    - // transform pass before codegen, so they don't affect LLVM output.
    + // These StorageDead statements are removed by the `CleanupPostBorrowck` MIR
    + // transform pass, so they don't affect codegen.
    // are Value or ForLint drops present, because:
    // 1. StorageDead is only relevant for borrow-checking when there are destructors
    // that might reference the dead variable
    // 2. If there are no drops, there's no unwind path to emit StorageDead on
This may be a bit misleading. We still unwind when there's no drops. We just don't need a cleanup block when there's no destructors to run before popping the stack frame.
    debug_assert_eq!(unwind_drops.drop_nodes[unwind_to].data.local, drop_data.local);
    debug_assert_eq!(unwind_drops.drop_nodes[unwind_to].data.kind, drop_data.kind);
    unwind_to = unwind_drops.drop_nodes[unwind_to].next;
    if unwind_to != DropIdx::MAX {
What's this check against DropIdx::MAX for? It doesn't seem to have been necessary previously, and if there's a DropKind::Value drop, we really should have provided a proper unwind_to. Plus if a check is needed, I really think it should be something self-evident or at least commented; checking for DropIdx::MAX feels magical and non-obvious.
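One way such a check could be made self-evident, if it turns out to be needed at all, is to model "no unwind path" as an `Option` instead of a sentinel index. The sketch below is an assumption for illustration only: `DropTree`, `DropNode`, and `advance_unwind` are toy definitions, not the types in the MIR builder, and nothing in this thread proposes this exact shape.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DropIdx(usize);

struct DropNode {
    /// `None` only for the root node, which drops nothing.
    local: Option<u32>,
    next: DropIdx,
}

struct DropTree {
    nodes: Vec<DropNode>,
}

impl DropTree {
    const ROOT: DropIdx = DropIdx(0);

    fn new() -> Self {
        DropTree { nodes: vec![DropNode { local: None, next: Self::ROOT }] }
    }

    /// Adds a drop of `local` that continues to `next` afterwards.
    fn add_drop(&mut self, local: u32, next: DropIdx) -> DropIdx {
        self.nodes.push(DropNode { local: Some(local), next });
        DropIdx(self.nodes.len() - 1)
    }
}

/// Moves the unwind cursor past the drop of `local`.
/// `None` means "this scope has no unwind path", so there is nothing to do.
fn advance_unwind(tree: &DropTree, unwind_to: Option<DropIdx>, local: u32) -> Option<DropIdx> {
    let idx = unwind_to?;
    let node = &tree.nodes[idx.0];
    // The cursor is expected to point at the drop being processed.
    debug_assert_eq!(node.local, Some(local));
    Some(node.next)
}
```

With this shape the "is there an unwind path?" question is answered by the type, and the `?` documents the early exit without a magic constant.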
    }

    unwind_drops.add_entry_point(block, unwind_to);
    if unwind_to != DropIdx::MAX {
As above, what's this check for?
    // Only adjust if the drop matches what unwind_to is pointing to (since we process
    // drops in reverse order, unwind_to might not match the current drop).
    if storage_dead_on_unwind
        && unwind_to != DropIdx::MAX
        && unwind_drops.drop_nodes[unwind_to].data.local == drop_data.local
        && unwind_drops.drop_nodes[unwind_to].data.kind == drop_data.kind
    {
This feels suspicious. I think this would need a lot more justification to turn those asserts into conditions, so I have a feeling something is wrong and this is papering over it.
    // Only adjust if the drop matches what unwind_to is pointing to (since we process
    // drops in reverse order, unwind_to might not match the current drop).
    if storage_dead_on_unwind
        && unwind_to != DropIdx::MAX
        && unwind_drops.drop_nodes[unwind_to].data.local == drop_data.local
        && unwind_drops.drop_nodes[unwind_to].data.kind == drop_data.kind
    {
As above, this feels suspicious. I think this would need a lot more justification to turn those asserts into conditions, so I have a feeling something is wrong and this is papering over it.
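The concern can be illustrated with a toy model of the lockstep walk. Everything here is a simplified assumption (`Node`, `build_unwind_chain`, and `walk_in_lockstep` are hypothetical names): the unwind chain is built from a list of drops, and the scope's drops are then walked in reverse while a cursor advances along the chain. If the chain was built from the same drops, every step matches; a mismatch means the chain and the scope disagree, which is what the original asserts would flag and what a silent `if` would hide.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DropKind {
    Value,
    Storage,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DropData {
    local: u32,
    kind: DropKind,
}

/// Chain node; index 0 is a root that drops nothing and points to itself.
struct Node {
    data: Option<DropData>,
    next: usize,
}

/// Builds an unwind chain mirroring `drops` (given in scheduling order) and
/// returns the nodes plus the entry index for the innermost drop.
fn build_unwind_chain(drops: &[DropData]) -> (Vec<Node>, usize) {
    let mut nodes = vec![Node { data: None, next: 0 }];
    let mut entry = 0;
    for &data in drops {
        nodes.push(Node { data: Some(data), next: entry });
        entry = nodes.len() - 1;
    }
    (nodes, entry)
}

/// Walks `drops` in reverse while advancing the cursor through the chain.
/// Returns `false` as soon as the cursor does not point at the current drop.
fn walk_in_lockstep(drops: &[DropData], nodes: &[Node], mut unwind_to: usize) -> bool {
    for data in drops.iter().rev() {
        if nodes[unwind_to].data != Some(*data) {
            return false; // the invariant the asserts were guarding
        }
        unwind_to = nodes[unwind_to].next;
    }
    unwind_to == 0
}
```

In this model, a chain built without the `Storage` entries fails the walk at the first `Storage` drop, so skipping the adjustment there would leave the cursor pointing at the wrong node for every later drop.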
    let unwind_drop = self
        .scopes
        .unwind_drops
        .add_drop(drop_node.data, unwind_indices[drop_node.next]);
    unwind_indices.push(unwind_drop);
I don't think this should be unconditional. We only want to add StorageDeads if there's a real (or for-fcw) drop, and maybe ideally only before the last of those (if it's not too complicated). Not sure, but this might be what's causing the problem building r-a, since that involves breaking from a loop.
Ideally there should be some way to write all of this without requiring duplicated logic all over the place. It feels really fragile to have to know in several different parts of that file exactly when unwind drops should be added.
Another option to keep it simple would be to always include StorageDeads when building the MIR, then get rid of empty cleanup paths in a post-borrowck pass. I'm not sure how much worse for perf that would be, but we could try profiling it.
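A minimal sketch of that alternative, under the assumption that a cleanup path can be modeled as a flat list of operations: after borrow-checking, drop the `StorageDead` entries and, if nothing is left, stop routing the unwind edge through a cleanup block at all. `CleanupOp`, `Unwind`, and `prune_after_borrowck` are hypothetical names for this illustration and say nothing about how the real pass would be structured or how it would perform.

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
enum CleanupOp {
    StorageDead(u32),
    Drop(u32),
}

#[derive(Clone, Debug, PartialEq, Eq)]
enum Unwind {
    /// No cleanup block: unwinding continues straight into the caller.
    Continue,
    /// Run these operations, in order, before continuing to unwind.
    Cleanup(Vec<CleanupOp>),
}

/// Post-borrowck pruning: strip `StorageDead`, then drop cleanup paths that
/// no longer do anything.
fn prune_after_borrowck(unwind: Unwind) -> Unwind {
    match unwind {
        Unwind::Continue => Unwind::Continue,
        Unwind::Cleanup(mut ops) => {
            ops.retain(|op| matches!(op, CleanupOp::Drop(_)));
            if ops.is_empty() {
                Unwind::Continue
            } else {
                Unwind::Cleanup(ops)
            }
        }
    }
}
```

The appeal of this shape is that MIR building never has to decide when a `StorageDead` is worth emitting; the cost is the extra blocks that exist between MIR building and the pruning pass.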
Fix lifetime issues in rust-analyzer where automaton doesn't live long enough for op.union(). Move op declaration inside each match arm to ensure proper lifetime scope. This fixes compilation errors that are blocking CI, though these are pre-existing issues unrelated to the StorageDead changes.
When processing drops in reverse order, unwind_to might not point to the current drop. Make the unwind_to adjustment conditional on the drop matching, matching the behavior in build_scope_drops. This prevents assertion failures when unwind_to points to a different drop than the one being processed.
From your commit message: "This fixes compilation errors that are blocking CI, though these are pre-existing issues unrelated to the StorageDead changes."
I guess this breakage comes from this PR rather than being pre-existing, as this isn't failing in other PRs?
☔ The latest upstream changes (presumably #152035) made this pull request unmergeable. Please resolve the merge conflicts.
This PR fixes a soundness bug where local variables are deallocated out of order during panic unwinding, allowing destructors to access freed memory. This violates Rust's safety guarantees and has caused real-world unsoundness in crates like generatively.
This PR removes the is_generator check and unconditionally emits StorageDead statements during unwinding for ALL functions, bringing non-generator behavior in line with generators. It ensures that during unwinding, when a local variable goes out of scope, its storage is properly marked as dead via StorageDead, allowing the borrow checker to enforce the invariant that values must outlive their references even in panic paths.
Fixes #147875