Small tweaks in ToOwned::clone_into#70201
Conversation
|
r? @dtolnay (rust_highfive has picked a reviewer for you, use r? to override) |
|
To justify the slice optimization, here's output from compiler explorer. They are nearly the same, but note the smaller prologue and epilogue in the new example::clone_into_orig:
push r15
push r14
push r12
push rbx
push rax
mov r14, rdx
mov r12, rsi
mov r15, rdi
mov rbx, qword ptr [rdx + 16]
cmp rbx, rsi
jb .LBB1_2
mov qword ptr [r14 + 16], r12
mov rbx, r12
.LBB1_2:
test rbx, rbx
je .LBB1_4
mov rdi, qword ptr [r14]
lea rdx, [4*rbx]
mov rsi, r15
call qword ptr [rip + memcpy@GOTPCREL]
.LBB1_4:
lea rsi, [r15 + 4*rbx]
sub r12, rbx
mov rdi, r14
mov rdx, r12
add rsp, 8
pop rbx
pop r12
pop r14
pop r15
jmp alloc::vec::Vec<T>::extend_from_slice
example::clone_into_split:
push r15
push r14
push rbx
mov r14, rdx
mov rbx, rsi
mov rsi, rdi
mov rdx, qword ptr [rdx + 16]
cmp rdx, rbx
jb .LBB2_2
mov qword ptr [r14 + 16], rbx
mov rdx, rbx
.LBB2_2:
lea r15, [rsi + 4*rdx]
sub rbx, rdx
test rdx, rdx
je .LBB2_4
mov rdi, qword ptr [r14]
shl rdx, 2
call qword ptr [rip + memcpy@GOTPCREL]
.LBB2_4:
mov rdi, r14
mov rsi, r15
mov rdx, rbx
pop rbx
pop r14
pop r15
jmp alloc::vec::Vec<T>::extend_from_slice |
|
@bors try @rust-timer queue |
|
Awaiting bors try build completion |
|
⌛ Trying commit 28791f6 with merge c462c8500772945a0f68a265dc1f8decf6b42e99... |
|
☀️ Try build successful - checks-azure |
|
Queued c462c8500772945a0f68a265dc1f8decf6b42e99 with parent 1057dc9, future comparison URL. |
|
Finished benchmarking try commit c462c8500772945a0f68a265dc1f8decf6b42e99, comparison URL. |
|
Seems neutral to compiler performance. I think |
|
@bors r+ |
|
📌 Commit 28791f6 has been approved by |
Small tweaks in ToOwned::clone_into - `<[T]>::clone_into` is slightly more optimized. - `CStr::clone_into` is new, letting it reuse its allocation. - `OsStr::clone_into` now forwards to the underlying slice/`Vec`.
|
Failed in rollup #70237 (comment) @bors r- |
|
The new test in c_str.rs fails on Windows x86_64-msvc-1: ---- ffi::c_str::tests::test_c_str_clone_into stdout ----
thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
left: `0x210c5e05800`,
right: `0x210c5e05830`', src\libstd\ffi\c_str.rs:1527:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
failures:
ffi::c_str::tests::test_c_str_clone_into
test result: FAILED. 743 passed; 1 failed; 2 ignored; 0 measured; 0 filtered out |
|
Hmm, maybe that test is assuming too much about the allocator. I could try it with equal lengths, but I'll confirm on Windows before I resubmit. |
|
It could just be there is #[cfg(windows)] codepath that is missing a clone_into function. |
It appears to codegen slightly more efficiently with `split_at` taking two slices at once, rather than slicing across different calls.
It can try to keep its allocation by converting the inner `Box` to `Vec`, using `clone_into` on the bytes, then convert back to `Box`.
Despite OS differences, they're all just `Vec<u8>` inside, so we can just forward `clone_into` calls to that optimized implementation.
|
OK, I was able to confirm that failure on Windows, and now it passes when the strings are equal length. |
|
@bors r+ |
|
📌 Commit f854070 has been approved by |
Small tweaks in ToOwned::clone_into - `<[T]>::clone_into` is slightly more optimized. - `CStr::clone_into` is new, letting it reuse its allocation. - `OsStr::clone_into` now forwards to the underlying slice/`Vec`.
|
⌛ Testing commit f854070 with merge 83177ff73c809ab897cbc450cf19aee95d48f94a... |
|
@bors retry (yield) |
…ievink Rollup of 5 pull requests Successful merges: - rust-lang#70201 (Small tweaks in ToOwned::clone_into) - rust-lang#70762 (Miri leak check: memory reachable through globals is not leaked) - rust-lang#70846 (Keep codegen units unmerged when building compiler builtins) - rust-lang#70854 (Use assoc int submodules) - rust-lang#70857 (Don't import integer and float modules, use assoc consts 2) Failed merges: r? @ghost
<[T]>::clone_intois slightly more optimized.CStr::clone_intois new, letting it reuse its allocation.OsStr::clone_intonow forwards to the underlying slice/Vec.