refactor(fs): refactor path traversal handling #2172
Conversation
Benchmark Results
| Benchmark | Current: b1dec3a | Previous: af77eb8 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 100.26 s | 96.76 s | 1.04 ❗ |
| startup_benchmark File Size | 0.86 MB | 0.82 MB | 1.06 ❗ |
| Startup Time - 1 core | 0.93 s (±0.03 s) | 0.93 s (±0.03 s) | 1.00 |
| Startup Time - 2 cores | 0.95 s (±0.03 s) | 0.93 s (±0.03 s) | 1.02 |
| Startup Time - 4 cores | 0.96 s (±0.03 s) | 0.95 s (±0.04 s) | 1.02 |
| multithreaded_benchmark Build Time | 97.67 s | 94.96 s | 1.03 ❗ |
| multithreaded_benchmark File Size | 0.94 MB | 0.96 MB | 0.98 ❗ |
| Multithreaded Pi Efficiency - 2 Threads | 88.56 % (±6.97 %) | 88.34 % (±8.83 %) | 1.00 |
| Multithreaded Pi Efficiency - 4 Threads | 43.85 % (±3.68 %) | 43.96 % (±2.87 %) | 1.00 |
| Multithreaded Pi Efficiency - 8 Threads | 25.13 % (±1.88 %) | 25.67 % (±1.88 %) | 0.98 |
| micro_benchmarks Build Time | 107.12 s | 107.03 s | 1.00 ❗ |
| micro_benchmarks File Size | 0.95 MB | 0.97 MB | 0.98 ❗ |
| Scheduling time - 1 thread | 67.68 ticks (±2.22 ticks) | 68.40 ticks (±2.80 ticks) | 0.99 |
| Scheduling time - 2 threads | 37.00 ticks (±4.58 ticks) | 36.75 ticks (±3.77 ticks) | 1.01 |
| Micro - Time for syscall (getpid) | 2.86 ticks (±0.23 ticks) | 3.72 ticks (±0.30 ticks) | 0.77 ❗ |
| Memcpy speed - (built_in) block size 4096 | 66828.35 MByte/s (±47877.34 MByte/s) | 65936.06 MByte/s (±46962.77 MByte/s) | 1.01 |
| Memcpy speed - (built_in) block size 1048576 | 29708.25 MByte/s (±24486.41 MByte/s) | 29711.39 MByte/s (±24468.80 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 16777216 | 28205.93 MByte/s (±23478.11 MByte/s) | 28572.31 MByte/s (±23801.85 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 4096 | 67725.05 MByte/s (±48504.76 MByte/s) | 66224.66 MByte/s (±47148.82 MByte/s) | 1.02 |
| Memset speed - (built_in) block size 1048576 | 30499.28 MByte/s (±24944.92 MByte/s) | 30455.87 MByte/s (±24906.53 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 16777216 | 28995.09 MByte/s (±23937.42 MByte/s) | 29305.64 MByte/s (±24204.32 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 4096 | 61019.62 MByte/s (±44977.96 MByte/s) | 59811.41 MByte/s (±43763.94 MByte/s) | 1.02 |
| Memcpy speed - (rust) block size 1048576 | 29566.78 MByte/s (±24444.76 MByte/s) | 29629.02 MByte/s (±24504.10 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 16777216 | 28204.62 MByte/s (±23468.87 MByte/s) | 28401.51 MByte/s (±23669.15 MByte/s) | 0.99 |
| Memset speed - (rust) block size 4096 | 62083.81 MByte/s (±45708.12 MByte/s) | 60851.37 MByte/s (±44540.62 MByte/s) | 1.02 |
| Memset speed - (rust) block size 1048576 | 30351.75 MByte/s (±24896.14 MByte/s) | 30426.48 MByte/s (±24959.19 MByte/s) | 1.00 |
| Memset speed - (rust) block size 16777216 | 28986.80 MByte/s (±23926.95 MByte/s) | 29175.93 MByte/s (±24111.52 MByte/s) | 0.99 |
| alloc_benchmarks Build Time | 103.86 s | 102.72 s | 1.01 ❗ |
| alloc_benchmarks File Size | 0.94 MB | 0.89 MB | 1.05 ❗ |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 10955.66 Ticks (±172.18 Ticks) | 8547.70 Ticks (±113.80 Ticks) | 1.28 ❗ |
| Allocations - Average Allocation time (no fail) | 10955.66 Ticks (±172.18 Ticks) | 8547.70 Ticks (±113.80 Ticks) | 1.28 ❗ |
| Allocations - Average Deallocation time | 1097.60 Ticks (±709.84 Ticks) | 1117.79 Ticks (±513.75 Ticks) | 0.98 |
| mutex_benchmark Build Time | 104.58 s | 111.82 s | 0.94 ❗ |
| mutex_benchmark File Size | 0.95 MB | 0.97 MB | 0.98 ❗ |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.72 ns (±0.72 ns) | 13.38 ns (±0.85 ns) | 0.95 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 15.50 ns (±0.67 ns) | 15.68 ns (±0.79 ns) | 0.99 |
This comment was automatically generated by workflow using github-action-benchmark.
```diff
-    prefix: Option<String>,
     attr: FileAttr,
+    original_prefix: Arc<str>,
+    prefix: String,
```
It might be a good idea to instead only store the part in `prefix` that is actually "beyond" `original_prefix`.
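A minimal sketch of that idea (hypothetical; `full_path` is not an existing method, and it assumes `original_prefix` has no trailing slash): the node would keep only the traversed suffix in `prefix` and reassemble the complete path on demand.

```rust
// Hypothetical sketch: `prefix` stores only the components beyond `original_prefix`,
// and the full path is rebuilt when needed.
fn full_path(&self) -> String {
    if self.prefix.is_empty() {
        self.original_prefix.to_string()
    } else {
        format!("{}/{}", self.original_prefix, self.prefix)
    }
}
```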
```rust
let mut prefix = self.prefix.clone();
// this part prevents inserting double-slashes or no slashes between prefix and path
if !path.is_empty() {
    if let Some(x) = path.strip_prefix("/") {
        path = x;
    } else {
        return Err(Errno::Nosys);
    }
    if !prefix.is_empty() {
        prefix.push('/');
    }
}
prefix.push_str(path);
```
This is new, and should be specifically reviewed.
This reverts commit 5772705.
Co-authored-by: Martin Kröning <martin.kroening@eonerc.rwth-aachen.de>
Thanks for the PR! :)
PRs are very difficult to review when one commit changes so much at once. I amended two commits to make discussing the main change of this PR more reviewable. I also extracted a few changes into another PR (#2229) to merge them separately. Small commits not only help the reviewer comprehend the diff but also help Git show nicer diffs, since each diff has more anchor points that did not change.
I'll try to summarize the change so we can discuss it. Please add this summary (or a corrected version, if I got something wrong) to the PR description. This helps immensely with understanding the changes and their motivation and is a necessary foundation for discussing the overall approach. Any alternatives that were considered before opening the PR also belong in the PR description. Ultimately, PR authors should do a self-review before submitting a PR to ensure reviewability.
Extracting the core change, understanding it, and guessing the motivation took me a lot of time. For you, providing this info would have been much easier, since you know the ins and outs of the changes and your motivation. Please make sure to include these things in future PRs. Otherwise they will not get reviewed or merged.
Something like this should go into the PR description:
PR Description
Currently, `VfsNode` has many functions that do recursive traversal as well as their respective operation:

- `traverse_mkdir`
- `traverse_rmdir`
- `traverse_unlink`
- `traverse_readdir`
- `traverse_lstat`
- `traverse_stat`
- `traverse_mount`
- `traverse_open`
- `traverse_create_file`
This PR makes these functions more maintainable by separating VFS traversal from the actual operations, heavily deduplicating path traversal logic.
This is done by adding three new methods to `VfsNode`:

- `fn traverse_once(&self, component: &str) -> io::Result<Box<dyn VfsNode>>;`
- `fn traverse_multiple(&self, component: &str) -> io::Result<Box<dyn VfsNode>>;`
- `fn dup(&self) -> Box<dyn VfsNode>;`
`dup` clones the object like `Clone::clone` but returns `Box<dyn VfsNode>` instead of `Self` to remain dyn-compatible.
`dup` is required since subnodes may be a different `dyn VfsNode` that we return by value in the traversal functions.
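For illustration, a `dup` implementation would presumably be a thin wrapper around `Clone::clone` (a hypothetical sketch; `MyNode` stands in for any concrete, `Clone`-able node type):

```rust
// Hypothetical sketch of a dup implementation on some Clone-able node type.
impl VfsNode for MyNode {
    fn dup(&self) -> Box<dyn VfsNode> {
        // Same object that Clone::clone would produce, but boxed so the trait
        // method stays dyn-compatible (it cannot return Self).
        Box::new(self.clone())
    }
    // ... remaining trait methods
}
```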
This allows us, for each operation, to first traverse the VFS and then perform the operation separately, successfully decoupling the two. Correspondingly, this PR changes all operation methods in the following way:

```rust
fn traverse_operation(&self, components: &mut Vec<&str>) -> io::Result<ReturnValue>
```

becomes

```rust
fn operation(&self, component: &str) -> io::Result<ReturnValue>
```
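As a rough illustration of the decoupling (a hypothetical sketch, not code from this PR; the free function, the pre-split `parent_path`/`name` arguments, and the simplified `mkdir(name)` call are illustrative only):

```rust
// Hypothetical sketch: traversal and the actual operation become two separate steps.
fn mkdir_at(root: &dyn VfsNode, parent_path: &str, name: &str) -> io::Result<()> {
    // Step 1: walk the VFS down to the parent directory.
    let parent: Box<dyn VfsNode> = root.traverse_multiple(parent_path)?;
    // Step 2: perform the operation on the node we arrived at.
    parent.mkdir(name)
}
```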
PR Review
Is `dup` not the same as `Clone::clone` in all current implementations? Why invent a new name? If this really is necessary, using the dyn_clone crate would be preferable.
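For reference, the dyn_clone approach would look roughly like this (a sketch under the assumption that all node types are `Clone`; not code from this PR):

```rust
// Sketch: dyn-clone generates the boxed-clone machinery instead of a hand-rolled dup.
use dyn_clone::DynClone;

trait VfsNode: DynClone {
    // ... existing trait methods
}

// Generates `impl Clone for Box<dyn VfsNode>`.
dyn_clone::clone_trait_object!(VfsNode);

// Afterwards, any `Box<dyn VfsNode>` can simply be `.clone()`d.
```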
More generally, I would like to avoid the reallocations and not have a dup method or use dyn_clone. This means not returning something via ownership but working on references.
What about still doing traversal recursively like before but also decoupling traversal from the operations?
Ideally, this would look something like this:
```rust
fn traverse_with<F, R>(&self, path: &str, f: F) -> io::Result<R>
where
    F: FnOnce(&dyn VfsNode) -> R;
```

Unfortunately, this would not be dyn-safe. This could be viable in the future, though, if we chose to abandon `dyn VfsNode` in favor of an `enum VfsNode`, similar to what is currently proposed in #2096. I need to discuss that PR internally to see whether this would also be viable here.
Alternatively, we could create an enum VfsOp and enum VfsOpRet for VFS operations that we could use while not migrating away from dyn VfsNode. This does not feel very ergonomic to me, though.
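For concreteness, a hypothetical shape of that alternative (variant names and payloads are purely illustrative, not an actual proposal):

```rust
// Hypothetical sketch: operations are passed as data to a single dyn-compatible
// traversal method, so no closures are needed.
enum VfsOp {
    Mkdir,
    Rmdir,
    Unlink,
    // ... one variant per operation, carrying its arguments
}

enum VfsOpRet {
    Unit,
    // ... one variant per possible return value
}

trait VfsNode {
    // Only the new method is shown here.
    fn traverse_op(&self, components: &mut Vec<&str>, op: VfsOp) -> io::Result<VfsOpRet>;
}
```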
If `components: &mut Vec<&str>` is the main thing that you want to clean up, we can easily merge a PR that migrates this to `components: &[&str]` or `path: &str` instead.
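A minimal sketch of that smaller migration (signatures only, reusing the placeholder naming from above):

```rust
// Current form: traversal consumes a mutable component stack.
fn traverse_operation(&self, components: &mut Vec<&str>) -> io::Result<ReturnValue>;

// Possible low-effort targets (either one):
// fn traverse_operation(&self, components: &[&str]) -> io::Result<ReturnValue>;
// fn traverse_operation(&self, path: &str) -> io::Result<ReturnValue>;
```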
A central problem this PR tries to solve is that we need to duplicate references (and cloning …). Also, even if one could work purely on references (and the code would be almost unreadable, though localized, AFAIK, due to the use of self-referencing structs), it would hold onto locks longer than necessary, hampering performance.

I see little benefit in trying to pursue this "holding onto all the locks along the way" method, because it is really hard to implement correctly, hard to read and maintain, and would not particularly benefit performance either, AFAIK, given that locking is usually slower than memory allocation anyway.
```rust
Ok(Box::new(Self {
    original_prefix: Arc::clone(&self.original_prefix),
    prefix,
    attr: todo!(),
    // ...
```
Why not copy the original `attr` down?
Otherwise, we never really use this `attr` anyway, so why keep it around?
This commit is not proposing that we do it like this. I just reduced your changes so that I can understand what you are doing and why. You are welcome to modify as necessary. Just keep your individual commits minimal, please.
It's not clear to me why you want to avoid the recursion. I have proposed multiple alternatives for separating traversal from operations that still involve recursion. And if we'd want to solve this by returning cheaply clonable nodes, why not use …?
That is a good point (…); I can do that. I want to primarily avoid two things: …

I also want to note that this does introduce an allocation for …
All right. I will put up two PRs in preparation, first migrating from …. We can discuss the benefits of the structural rework that you have in mind for this one afterward. A clear motivation should be in the PR description, and a benchmark would further underline your performance claims.
I realized that by "just" factoring out all path traversal handling, I don't even need to introduce a `std::path::Path` equivalent.
The only disadvantage of this, AFAIK, is the re-allocations during traversal into the `fuse` and `uhyve` filesystems.