
Conversation

@coolreader18
Collaborator

Description of Changes

This got a little bigger than I had hoped, but I think it's still pretty manageable. This PR partially reverts to the state before #3263: cores are re-balanced with a `watch::Receiver<CoreId>` that each database thread listens on for updates in order to repin itself, and multiple OS threads (each matched to a database) can be pinned to one core. As I understand it, that second part is something Phoebe was trying to avoid, but given that there's no way to asyncify a JS module, it's kind of necessary.

JS is single-threaded and uses cooperative rather than preemptive multitasking (callbacks/async, not green threads). That means that if a JS function contains an infinite loop, no other event handlers can run until that loop exits. Coupled with the fact that we can't `Send` a v8 isolate across threads, this makes it more sensible to keep the module host on one thread and repin that thread as needed. An alternative, as was brought up, would be to deconstruct the module and reconstruct it on a different thread when needed, since load-balancing won't happen often anyway.
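A minimal sketch of the repinning pattern described above. All names here are stand-ins: the real code uses tokio's `watch::Receiver<CoreId>` and `core_affinity::set_for_current`, not an mpsc channel and a recording stub.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for the real CoreId type from core_affinity.
type CoreId = usize;

// Spawns a "database thread" that receives CoreId updates on a channel and
// "repins" itself for each one, recording the pins so the behavior is
// observable in a test.
fn simulate_repinning(updates: Vec<CoreId>) -> Vec<CoreId> {
    let (tx, rx) = mpsc::channel::<CoreId>();
    let worker = thread::spawn(move || {
        let mut pins = Vec::new();
        // Database thread: repin whenever a new CoreId arrives.
        for core in rx {
            // Real code: core_affinity::set_for_current(core);
            pins.push(core);
        }
        pins
    });
    for core in updates {
        tx.send(core).unwrap();
    }
    drop(tx); // close the channel so the worker loop exits
    worker.join().unwrap()
}

fn main() {
    assert_eq!(simulate_repinning(vec![2, 5]), vec![2, 5]);
}
```

The key property is that the database thread itself reacts to rebalance updates, so the v8 isolate never has to cross a thread boundary.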

Expected complexity level and risk

3 - this reworks the thread pool that databases run on, so it could introduce deadlocks or other concurrency bugs. That seems unlikely, though, since it separates each database onto its own thread, which decreases the likelihood of them interacting poorly with each other.

Testing

Not sure if there's anything specific I should do, since this doesn't change behavior.

@coolreader18
Collaborator Author

Can confirm that this improves performance; just this patchset brings v8 from 78.5k TPS to 89k TPS.

@Centril
Contributor

Just some requests for more docs.

}
}

pub async fn run_on_thread<F, R>(&mut self, f: F) -> R
Contributor

Some documentation would be good here.

// We care about optimizing for `CallReducer` as it happens frequently,
// so we don't want to box anything in it.
enum JsWorkerRequest {
/// See [`JsInstance::run_on_thread`]
Contributor

Suggested change
/// See [`JsInstance::run_on_thread`]
/// See [`JsInstance::run_on_thread`].

Comment on lines +542 to +543
    load_balance_guard: Arc<LoadBalanceOnDropGuard>,
    mut core_pinner: CorePinner,
Contributor

These parameters could be incorporated into the function docs.

Comment on lines +208 to +248
#[derive(Default)]
pub struct AllocatedJobCore {
    pub guard: LoadBalanceOnDropGuard,
    pub pinner: CorePinner,
}

impl AllocatedJobCore {
    pub fn spawn_async_executor(self) -> SingleCoreExecutor {
        SingleCoreExecutor::spawn(self)
    }
}

#[derive(Default, Clone)]
pub struct CorePinner {
    move_core_rx: Option<watch::Receiver<CoreId>>,
}

impl CorePinner {
    pub fn pin_now(&mut self) {
        if let Some(move_core_rx) = &mut self.move_core_rx {
            let core_id = *move_core_rx.borrow_and_update();
            core_affinity::set_for_current(core_id);
        }
    }
    pub fn pin_if_changed(&mut self) {
        if let Some(move_core_rx) = &mut self.move_core_rx {
            if let Ok(true) = move_core_rx.has_changed() {
                let core_id = *move_core_rx.borrow_and_update();
                core_affinity::set_for_current(core_id);
            }
        }
    }
    pub async fn run(self) {
        if let Some(mut move_core_rx) = self.move_core_rx {
            while move_core_rx.changed().await.is_ok() {
                let core_id = *move_core_rx.borrow_and_update();
                core_affinity::set_for_current(core_id);
            }
        }
    }
}
Contributor

Docs on these please :)

impl PinnedCoresExecutorManager {
    /// Get a [`runtime::Handle`] for running database operations on,
    /// Get a core for running database operations on,
    /// and store state in `self` necessary to move that database to a new runtime
Contributor

This doc comment needs to be changed; databases no longer move between runtimes.

}

/// A module; used as a bound on `InstanceManager`.
trait GenericModule {
Contributor

Why do we need a trait here? There's only two kinds of module, not an unbounded set of them.

Comment on lines +746 to +748
struct ModuleInstanceManager<M: GenericModule> {
    instances: Mutex<VecDeque<M::Instance>>,
    module: M,
Contributor

Why are we rewriting this to use trait polymorphism rather than tagged enums?

Collaborator Author

It more accurately models the data it holds. We're never going to have a `ModuleInstanceManager` where the module is Wasm but the instances are JS, or where the elements of `instances` are heterogeneous. Now we don't have to write `let Instance::Js(inst) = inst else { unreachable!() };`, and as a nice side benefit the `VecDeque` is more compact (again because of the type-system-enforced homogeneity).

Contributor

I don't object to this specifically. My main concern here is that we not regress to the previous version of instance managers which used dyn Trait runtime polymorphism.

In general, I urge you to keep an eye on when using more complex types to more accurately model data provides a tangible benefit, versus when it adds needless complexity. In this case, I do not think that `let inst = inst.into_js().unwrap();` would be likely to lead to confusion or bugs, nor do I believe that reducing the `VecDeque`'s size is likely to tangibly affect our performance.

Contributor

In this particular case, I agree with @coolreader18. Enforcing the common interface is useful and keeps things in sync.
Type classes are not just about polymorphism, but also about the restrictions they impose and the "laws" that come with them.
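A compilable sketch contrasting the two designs under discussion. The type names here are illustrative stand-ins, not the actual SpacetimeDB types.

```rust
// Concrete instance type for one module kind.
struct JsInstance(u32);

// Trait style: the module type fixes its instance type statically, so a
// manager for a JS module can only ever hold JsInstances, and there is no
// per-use-site `let Instance::Js(inst) = inst else { unreachable!() }`.
trait GenericModule {
    type Instance;
    fn make_instance(&self) -> Self::Instance;
}

struct JsModule;
impl GenericModule for JsModule {
    type Instance = JsInstance;
    fn make_instance(&self) -> JsInstance {
        JsInstance(0)
    }
}

// Homogeneous by construction: InstanceManager<JsModule> stores only
// JsInstance values, with no per-element discriminant tag.
struct InstanceManager<M: GenericModule> {
    instances: Vec<M::Instance>,
    module: M,
}

fn demo() -> u32 {
    let mut mgr = InstanceManager { instances: Vec::new(), module: JsModule };
    let inst = mgr.module.make_instance();
    mgr.instances.push(inst);
    // Direct access to the concrete instance type, no unwrap needed.
    mgr.instances[0].0
}

fn main() {
    assert_eq!(demo(), 0);
}
```

The tagged-enum alternative would store `enum AnyInstance { Wasm(..), Js(..) }` in the queue and match on the variant at each call site; the trait version trades that runtime check for a generic parameter.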

Comment on lines +243 to +244
let core_id = *move_core_rx.borrow_and_update();
core_affinity::set_for_current(core_id);
Contributor

Also, could you factor this out into a function so it isn't duplicated three times? Please mark it `#[inline]` as well.
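One possible shape for the requested helper, sketched with std-only stand-ins (the real code would take tokio's `watch::Receiver<CoreId>` and call `core_affinity::set_for_current`; the `Vec` below records pins so the effect is testable).

```rust
type CoreId = usize;

// Minimal stand-in for watch::Receiver<CoreId>; like the real receiver,
// borrow_and_update clears the change flag when it reads the value.
struct Receiver {
    latest: CoreId,
    changed: bool,
}

impl Receiver {
    fn borrow_and_update(&mut self) -> CoreId {
        self.changed = false;
        self.latest
    }
}

// The requested helper: the two repeated lines pulled into one #[inline]
// function that pin_now, pin_if_changed, and run could all call.
#[inline]
fn repin(rx: &mut Receiver, pinned: &mut Vec<CoreId>) {
    let core_id = rx.borrow_and_update();
    // Real code: core_affinity::set_for_current(core_id);
    pinned.push(core_id);
}

fn main() {
    let mut rx = Receiver { latest: 3, changed: true };
    let mut pinned = Vec::new();
    repin(&mut rx, &mut pinned);
    assert_eq!(pinned, vec![3]);
    assert!(!rx.changed); // reading the value acknowledged the update
}
```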

            core_affinity::set_for_current(core_id);
        }
    }
    pub fn pin_if_changed(&mut self) {
Contributor

Using this turns out to be critical for performance. The difference is about 6k TPS.
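A sketch of why `pin_if_changed` matters in the hot path: an unconditional `pin_now` would re-issue the affinity syscall on every request, while `pin_if_changed` only checks a cheap change flag. The types below are std-only stand-ins with a counter in place of the real syscall.

```rust
type CoreId = usize;

// Stand-in for watch::Receiver<CoreId>; tracks whether the value changed
// since it was last observed.
struct Receiver {
    latest: CoreId,
    changed: bool,
}

impl Receiver {
    fn has_changed(&self) -> bool {
        self.changed
    }
    fn borrow_and_update(&mut self) -> CoreId {
        self.changed = false;
        self.latest
    }
}

// Hot loop handling `requests` reducer calls; returns how many times the
// (stubbed) affinity syscall would have been issued. With pin_if_changed
// semantics that's once per core move, not once per request.
fn serve(rx: &mut Receiver, requests: usize) -> usize {
    let mut affinity_calls = 0;
    for _ in 0..requests {
        if rx.has_changed() {
            let _core = rx.borrow_and_update();
            affinity_calls += 1; // real code: core_affinity::set_for_current
        }
        // ... handle one reducer call ...
    }
    affinity_calls
}

fn main() {
    let mut rx = Receiver { latest: 2, changed: true };
    // 1000 requests, but the core only moved once before the loop started.
    assert_eq!(serve(&mut rx, 1000), 1);
}
```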
