
Conversation

@fogti (Contributor) commented Jan 10, 2026

I realized that by "just" factoring out all path traversal handling, I don't even need to introduce an std::path::Path equivalent.

As far as I know, the only disadvantage of this approach is the re-allocations during traversal into the fuse and uhyve filesystems.

@mkroening mkroening self-assigned this Jan 10, 2026
@fogti fogti force-pushed the path branch 3 times, most recently from 19bd9ce to aef532d Compare January 10, 2026 15:13
@github-actions (bot) left a comment

Benchmark Results
| Benchmark | Current: b1dec3a | Previous: af77eb8 | Performance Ratio |
|---|---|---|---|
| startup_benchmark Build Time | 100.26 s | 96.76 s | 1.04 |
| startup_benchmark File Size | 0.86 MB | 0.82 MB | 1.06 |
| Startup Time - 1 core | 0.93 s (±0.03 s) | 0.93 s (±0.03 s) | 1.00 |
| Startup Time - 2 cores | 0.95 s (±0.03 s) | 0.93 s (±0.03 s) | 1.02 |
| Startup Time - 4 cores | 0.96 s (±0.03 s) | 0.95 s (±0.04 s) | 1.02 |
| multithreaded_benchmark Build Time | 97.67 s | 94.96 s | 1.03 |
| multithreaded_benchmark File Size | 0.94 MB | 0.96 MB | 0.98 |
| Multithreaded Pi Efficiency - 2 Threads | 88.56 % (±6.97 %) | 88.34 % (±8.83 %) | 1.00 |
| Multithreaded Pi Efficiency - 4 Threads | 43.85 % (±3.68 %) | 43.96 % (±2.87 %) | 1.00 |
| Multithreaded Pi Efficiency - 8 Threads | 25.13 % (±1.88 %) | 25.67 % (±1.88 %) | 0.98 |
| micro_benchmarks Build Time | 107.12 s | 107.03 s | 1.00 |
| micro_benchmarks File Size | 0.95 MB | 0.97 MB | 0.98 |
| Scheduling time - 1 thread | 67.68 ticks (±2.22 ticks) | 68.40 ticks (±2.80 ticks) | 0.99 |
| Scheduling time - 2 threads | 37.00 ticks (±4.58 ticks) | 36.75 ticks (±3.77 ticks) | 1.01 |
| Micro - Time for syscall (getpid) | 2.86 ticks (±0.23 ticks) | 3.72 ticks (±0.30 ticks) | 0.77 |
| Memcpy speed - (built_in) block size 4096 | 66828.35 MByte/s (±47877.34 MByte/s) | 65936.06 MByte/s (±46962.77 MByte/s) | 1.01 |
| Memcpy speed - (built_in) block size 1048576 | 29708.25 MByte/s (±24486.41 MByte/s) | 29711.39 MByte/s (±24468.80 MByte/s) | 1.00 |
| Memcpy speed - (built_in) block size 16777216 | 28205.93 MByte/s (±23478.11 MByte/s) | 28572.31 MByte/s (±23801.85 MByte/s) | 0.99 |
| Memset speed - (built_in) block size 4096 | 67725.05 MByte/s (±48504.76 MByte/s) | 66224.66 MByte/s (±47148.82 MByte/s) | 1.02 |
| Memset speed - (built_in) block size 1048576 | 30499.28 MByte/s (±24944.92 MByte/s) | 30455.87 MByte/s (±24906.53 MByte/s) | 1.00 |
| Memset speed - (built_in) block size 16777216 | 28995.09 MByte/s (±23937.42 MByte/s) | 29305.64 MByte/s (±24204.32 MByte/s) | 0.99 |
| Memcpy speed - (rust) block size 4096 | 61019.62 MByte/s (±44977.96 MByte/s) | 59811.41 MByte/s (±43763.94 MByte/s) | 1.02 |
| Memcpy speed - (rust) block size 1048576 | 29566.78 MByte/s (±24444.76 MByte/s) | 29629.02 MByte/s (±24504.10 MByte/s) | 1.00 |
| Memcpy speed - (rust) block size 16777216 | 28204.62 MByte/s (±23468.87 MByte/s) | 28401.51 MByte/s (±23669.15 MByte/s) | 0.99 |
| Memset speed - (rust) block size 4096 | 62083.81 MByte/s (±45708.12 MByte/s) | 60851.37 MByte/s (±44540.62 MByte/s) | 1.02 |
| Memset speed - (rust) block size 1048576 | 30351.75 MByte/s (±24896.14 MByte/s) | 30426.48 MByte/s (±24959.19 MByte/s) | 1.00 |
| Memset speed - (rust) block size 16777216 | 28986.80 MByte/s (±23926.95 MByte/s) | 29175.93 MByte/s (±24111.52 MByte/s) | 0.99 |
| alloc_benchmarks Build Time | 103.86 s | 102.72 s | 1.01 |
| alloc_benchmarks File Size | 0.94 MB | 0.89 MB | 1.05 |
| Allocations - Allocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Deallocation success | 100.00 % | 100.00 % | 1 |
| Allocations - Pre-fail Allocations | 100.00 % | 100.00 % | 1 |
| Allocations - Average Allocation time | 10955.66 Ticks (±172.18 Ticks) | 8547.70 Ticks (±113.80 Ticks) | 1.28 |
| Allocations - Average Allocation time (no fail) | 10955.66 Ticks (±172.18 Ticks) | 8547.70 Ticks (±113.80 Ticks) | 1.28 |
| Allocations - Average Deallocation time | 1097.60 Ticks (±709.84 Ticks) | 1117.79 Ticks (±513.75 Ticks) | 0.98 |
| mutex_benchmark Build Time | 104.58 s | 111.82 s | 0.94 |
| mutex_benchmark File Size | 0.95 MB | 0.97 MB | 0.98 |
| Mutex Stress Test Average Time per Iteration - 1 Threads | 12.72 ns (±0.72 ns) | 13.38 ns (±0.85 ns) | 0.95 |
| Mutex Stress Test Average Time per Iteration - 2 Threads | 15.50 ns (±0.67 ns) | 15.68 ns (±0.79 ns) | 0.99 |

This comment was automatically generated by workflow using github-action-benchmark.

@fogti fogti force-pushed the path branch 2 times, most recently from ddbba05 to 6c0b8d6 Compare January 14, 2026 11:46
```rust
prefix: Option<String>,
attr: FileAttr,
original_prefix: Arc<str>,
prefix: String,
```
@fogti (Contributor, Author) commented:

It might be a good idea to instead store in prefix only the part that is actually "beyond" original_prefix.

@fogti fogti force-pushed the path branch 3 times, most recently from ba1d3f4 to 34a1489 Compare January 27, 2026 14:30
Comment on lines +1167 to +1179
```rust
let mut prefix = self.prefix.clone();
// this part prevents inserting double-slashes or no slashes between prefix and path
if !path.is_empty() {
    if let Some(x) = path.strip_prefix("/") {
        path = x;
    } else {
        return Err(Errno::Nosys);
    }
    if !prefix.is_empty() {
        prefix.push('/');
    }
}
prefix.push_str(path);
```
@fogti (Contributor, Author) commented:

This is new, and should be specifically reviewed.
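To make the reviewed behavior concrete, here is a hypothetical standalone sketch of the same joining logic, with the kernel's `Errno::Nosys` modeled as a plain error string so it runs outside the kernel:

```rust
// Hypothetical free-function version of the prefix/path joining above.
// Returns Err for paths that do not start with '/', mirroring the
// Errno::Nosys branch in the original snippet.
fn join_prefix(prefix: &str, mut path: &str) -> Result<String, &'static str> {
    let mut out = prefix.to_string();
    // this part prevents inserting double slashes or no slashes between prefix and path
    if !path.is_empty() {
        match path.strip_prefix('/') {
            Some(rest) => path = rest,
            None => return Err("Nosys"),
        }
        if !out.is_empty() {
            out.push('/');
        }
    }
    out.push_str(path);
    Ok(out)
}

fn main() {
    assert_eq!(join_prefix("a/b", "/c").unwrap(), "a/b/c");
    assert_eq!(join_prefix("", "/c").unwrap(), "c");
    assert_eq!(join_prefix("a", "").unwrap(), "a");
    assert!(join_prefix("a", "c").is_err());
}
```

The edge cases worth reviewing are the empty-prefix and empty-path branches, which avoid producing a leading or trailing slash.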

mkroening and others added 2 commits February 8, 2026 10:09
Co-authored-by: Martin Kröning <martin.kroening@eonerc.rwth-aachen.de>
@mkroening (Member) left a comment

Thanks for the PR! :)

PRs are very difficult to review when one commit changes so much at once. I amended two commits to make the main change of this PR easier to discuss and review. I also extracted a few changes into another PR (#2229) to merge them separately. Small commits not only help the reviewer comprehend the diff but also help Git show nicer diffs, since each diff has more anchor points that did not change.

I'll try to summarize the change to discuss it. Please add this summary or a corrected version if I got something wrong to the PR description. This helps immensely with understanding the changes and their motivation and is a necessary foundation for discussing the overall approach. Any alternatives that have been considered before opening the PR are also important to put into the PR description. Ultimately, PR authors should do a self-review before submitting a PR to ensure reviewability.

Extracting the core change, understanding it, and guessing the motivation took me a lot of time. For you, providing this info would have been much easier, since you know the ins and outs of the changes and your motivation. Please make sure to include these things in future PRs. Otherwise they will not get reviewed or merged.

Something like this should go into the PR description:

PR Description

Currently, VfsNode has many functions that do recursive traversal as well as their respective operation:

  • traverse_mkdir
  • traverse_rmdir
  • traverse_unlink
  • traverse_readdir
  • traverse_lstat
  • traverse_stat
  • traverse_mount
  • traverse_open
  • traverse_create_file

This PR makes these functions more maintainable by separating VFS traversal from the actual operations, heavily deduplicating path traversal logic.

This is done by adding three new methods to VfsNode:

  • `fn traverse_once(&self, component: &str) -> io::Result<Box<dyn VfsNode>>;`
  • `fn traverse_multiple(&self, component: &str) -> io::Result<Box<dyn VfsNode>>;`
  • `fn dup(&self) -> Box<dyn VfsNode>;`

dup clones the object like Clone::clone but returns Box<dyn VfsNode> instead of Self to remain dyn-compatible.

dup is required since subnodes may be a different dyn VfsNode type, and we return them by value from the traversal functions.
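For illustration, the dyn-compatible `dup` pattern could look roughly like this (a minimal sketch with hypothetical node types, not the PR's actual implementation):

```rust
use std::sync::Arc;

// Returning Box<dyn VfsNode> instead of Self keeps the trait dyn-compatible,
// whereas a `fn clone(&self) -> Self` requirement would not be.
trait VfsNode {
    fn name(&self) -> &str;
    fn dup(&self) -> Box<dyn VfsNode>;
}

// Hypothetical node type for demonstration.
#[derive(Clone)]
struct MemDir {
    name: Arc<str>,
}

impl VfsNode for MemDir {
    fn name(&self) -> &str {
        &self.name
    }
    // dup is just Clone::clone behind a box; cloning the Arc is cheap.
    fn dup(&self) -> Box<dyn VfsNode> {
        Box::new(self.clone())
    }
}

fn main() {
    let node: Box<dyn VfsNode> = Box::new(MemDir { name: Arc::from("tmp") });
    let copy = node.dup();
    assert_eq!(copy.name(), "tmp");
}
```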

This allows us, for each operation, to first traverse the VFS and then perform the operation separately, decoupling the two. Correspondingly, this PR changes all operation methods in the following way:

```rust
fn traverse_operation(&self, components: &mut Vec<&str>) -> io::Result<ReturnValue>
```

becomes

```rust
fn operation(&self, component: &str) -> io::Result<ReturnValue>
```

PR Review

Is dup not the same as Clone::clone in all current implementations? Why invent a new name? If this really is necessary, using the dyn_clone crate would be preferable.

More generally, I would like to avoid the reallocations and not have a dup method or use dyn_clone. This means not returning something via ownership but working on references.

What about still doing traversal recursively like before but also decoupling traversal from the operations?

Ideally, this would look something like this:

```rust
fn traverse_with<F, R>(&self, path: &str, f: F) -> io::Result<R>
where
    F: FnOnce(&dyn VfsNode) -> R;
```

Unfortunately, this would not be dyn-safe. It could become viable in the future, though, if we chose to abandon dyn VfsNode in favor of an enum VfsNode, similar to what is currently proposed in #2096. I need to discuss that PR internally to see whether this would also be viable here.

Alternatively, we could create an enum VfsOp and enum VfsOpRet for VFS operations that we could use while not migrating away from dyn VfsNode. This does not feel very ergonomic to me, though.

If components: &mut Vec<&str> is the main thing that you want to clean up, we can easily merge a PR that migrates this to components: &[&str] or path: &str instead.
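As a rough illustration of the "standalone generic function" direction (a hypothetical in-memory sketch, not Hermit's actual VFS types): making `traverse_with` a free function rather than a trait method sidesteps the dyn-compatibility problem entirely, since only the trait itself must stay object-safe.

```rust
use std::collections::BTreeMap;

// Hypothetical minimal node trait; `get` looks up one child by name.
trait VfsNode {
    fn get(&self, component: &str) -> Option<&dyn VfsNode>;
    fn kind(&self) -> &'static str;
}

struct Dir(BTreeMap<String, Box<dyn VfsNode>>);
struct File;

impl VfsNode for Dir {
    fn get(&self, component: &str) -> Option<&dyn VfsNode> {
        self.0.get(component).map(|b| &**b)
    }
    fn kind(&self) -> &'static str { "dir" }
}

impl VfsNode for File {
    fn get(&self, _: &str) -> Option<&dyn VfsNode> { None }
    fn kind(&self) -> &'static str { "file" }
}

// Free generic function: walks `path` component by component, then hands
// the final node to the operation closure, decoupling traversal from the
// operation without touching the trait's dyn-compatibility.
fn traverse_with<R>(
    root: &dyn VfsNode,
    path: &str,
    f: impl FnOnce(&dyn VfsNode) -> R,
) -> Option<R> {
    let mut node = root;
    for component in path.split('/').filter(|c| !c.is_empty()) {
        node = node.get(component)?;
    }
    Some(f(node))
}

fn main() {
    let mut etc = BTreeMap::new();
    etc.insert("hosts".to_string(), Box::new(File) as Box<dyn VfsNode>);
    let mut root = BTreeMap::new();
    root.insert("etc".to_string(), Box::new(Dir(etc)) as Box<dyn VfsNode>);
    let root = Dir(root);

    assert_eq!(traverse_with(&root, "/etc/hosts", |n| n.kind()), Some("file"));
    assert_eq!(traverse_with(&root, "/etc", |n| n.kind()), Some("dir"));
    assert_eq!(traverse_with(&root, "/missing", |n| n.kind()), None);
}
```

This sketch ignores the locking that the real VFS needs; it only shows the decoupled control flow.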

@fogti (Contributor, Author) commented Feb 8, 2026

A central problem this PR tries to solve is that we need to duplicate references (cloning Arcs is relatively cheap) to avoid recursion during traversal: holding all the nested locks necessary to keep references alive is really difficult to do otherwise (I might not have tried hard enough so far; I'll take a look).

Even if it were possible, the code would, as far as I can tell, be almost unreadable (though localized) due to the self-referencing structs it would require, and it would hold locks longer than necessary, hampering performance.

I see little benefit in pursuing this "hold all the locks along the way" approach: it is really hard to implement correctly, hard to read and maintain, and as far as I can tell would not particularly benefit performance either, given that locking is usually slower than memory allocation anyway.

```rust
Ok(Box::new(Self {
    original_prefix: Arc::clone(&self.original_prefix),
    prefix,
    attr: todo!(),
```
@fogti (Contributor, Author) commented:

Why not copy the original attr down?

Otherwise, we never really use this attr anyways, so why keep it around?

@mkroening (Member) commented Feb 8, 2026:

This commit is not proposing we do it like this. I just reduced your changes so I can understand what and why you are doing them. You are welcome to modify as necessary. Just keep your individual commits minimal, please.

@mkroening (Member) commented:

> A central problem this PR tries to solve is that we need to duplicate references (and cloning Arcs is relatively cheap) to avoid the recursion during traversal, because holding onto all the nested locks necessary to maintain references is really difficult to do otherwise (I might not have tried hard enough so far, I'll take a look).

It's not clear to me why you want to avoid the recursion. I have proposed multiple alternatives for separating traversal from operations that still involve recursion.

And if we'd want to solve this with returning cheaply clonable nodes, why not use Arc<dyn VfsNode> instead of Box<dyn VfsNode> to show that this is always cheap?
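The point about `Arc<dyn VfsNode>` can be seen in a small sketch (hypothetical node type for demonstration): cloning an `Arc` only bumps a reference count, so the cheapness is visible in the type itself, unlike a `dup` that boxes a fresh clone.

```rust
use std::sync::Arc;

// Hypothetical minimal trait and node for demonstration.
trait VfsNode {
    fn name(&self) -> &str;
}

struct Dir {
    name: String,
}

impl VfsNode for Dir {
    fn name(&self) -> &str {
        &self.name
    }
}

fn main() {
    let node: Arc<dyn VfsNode> = Arc::new(Dir { name: "tmp".into() });
    // Arc::clone copies no node data, only increments the refcount.
    let copy = Arc::clone(&node);
    assert_eq!(copy.name(), "tmp");
    assert_eq!(Arc::strong_count(&node), 2);
}
```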

@fogti (Contributor, Author) commented Feb 8, 2026

That is a good point (Arc instead of Box). Doing so would be even more involved, because the Arcs in the hierarchy currently sit at the wrong places for that; in contrast to the current state of this PR, it would require substantially restructuring the involved directory structures.

I can do that.

I primarily want to avoid two things:

  1. Holding a lock guard per traversed directory along the path, all simultaneously. Avoiding this is imo more important than avoiding the recursion itself, particularly because factoring out block_on to the top-level filesystem entry points has alleviated concerns about potential scheduling churn, etc.
  2. The recursion itself. Avoiding it is beneficial because it is easy to do, imo makes the code more readable, and allows omitting an allocation for the recursive async closures, which would need boxing to be awaitable.

I also want to note that although this introduces an allocation for dup-ing (cloning) references to directory and file nodes, via 2. it replaces a stack of nested allocations with short-lived allocations for the dup-ed references. Not even tail-call optimization could help with the former, given how convoluted some of the traversal functions were and that they need to refer to the stack-bound lock guards, which have non-trivial destructors.

@mkroening (Member) commented:

All right. I will put up two PRs in preparation, first migrating from &mut Vec<&str> to &str and then decoupling traversal from operations via a standalone generic function. These changes should be fairly uncontroversial, and we should take small steps towards this one.

We can discuss the benefits of the structural rework that you have in mind for this one afterward. A clear motivation should be in the PR description, and a benchmark would further underline your performance claims.
