Run plot comms on the R thread by lionel- · Pull Request #1100 · posit-dev/ark

lionel- · 2026-03-10T18:05:53Z

Progress towards #689
Branched from #1099
Addresses posit-dev/positron#12825 (planning to close)

I did not foresee all the simplifications falling out of this one!

Plots no longer live in their own threads, they live on the R thread. That frees up memory since each background thread takes about 2mb of stack space (depending on platform), so 20 open plots would take 40mb of memory.
DeviceContext moves from a standalone thread_local! into a regular field of Console. All access goes through Console::get().device_context(). The CommHandlerContext provides access to Console via ctx.console(), so plot handlers reach the device context through a controlled context chain.
The old idle-time polling loop (process_rpc_requests with crossbeam::Select) is removed. Plot RPCs are now dispatched synchronously through the shell handler on the R thread. This improves plot latency when pre-renderings require adjustments because we no longer timeout-poll for plot updates (related to Performance for plotting positron#5184).
The GraphicsDeviceNotification async channel is removed. Since the UI comm now runs on the R thread (from Run UI comm on the R thread #1099), prerender settings are updated synchronously via Console::get().device_context().set_prerender_settings(). This removes quite a bit of plumbing.
Fixes a dev.hold() regression I introduced with pre-renderings (Send pre-renderings of new plots to the frontend #775). We were previously sending pre-renderings unconditionally, now they are held until dev.flush(). This is tested by new integration tests.

DavisVaughan

I have paused on review until we can get these regressions resolved

DavisVaughan · 2026-03-16T20:13:06Z

        dap: Arc<Mutex<Dap>>,
        session_mode: SessionMode,
    ) -> Self {
+        let device_context = DeviceContext::new(iopub_tx.clone());


It makes sense to me to remove iopub_tx from DeviceContext now and only pull iopub_tx directly from Console

Over in graphics_device.rs, you can just Console::get().iopub_tx() anytime you need it now!

It feels like this better scopes responsibilities to me

I feel the opposite. It hides that the device context directly communicates with the frontend via IOPub.

Also ideally we should avoid Console::get() calls as these bypass Rust's ownership model. But then the issue is that the device belongs to console and not the other way around. Cloning the IOPub channel is the right ownership structure from that point of view.

It hides that the device context directly communicates with the frontend via IOPub

Hmm I see what you mean

Like the DeviceContext struct should have an "include what you use" kind of policy in the struct fields, and by not having iopub_tx in there, we mask what it uses

DavisVaughan · 2026-03-16T20:15:46Z

+        Console::get()
+            .device_context()
+            .set_prerender_settings(params.settings);


Is it possible that we could now make DeviceContext have non-Cell internals?

I am disturbed by being able to call get() to then perform a mutable update, rather than being forced to go through get_mut()

The Cell is actually more reassuring, especially regarding the possibility of reentrancy (flushing calls process_changes which can recurse into graphics device callbacks).

In general get() is not a signal that the state is not mutable, it's only a signal that the state is shared, and can still be mutable e.g. through Cells or other mechanisms.

It's all a bit hand wavy considering that our callbacks might produce multiple active get_mut() at the same time, but adding to the unsoundness would be... unsound ;)

DavisVaughan · 2026-03-16T20:30:22Z

-        // Don't try to render a plot if we're currently drawing.
-        if self.is_drawing.get() {
-            log::trace!("Refusing to render due to `is_drawing`");
-            return;
-        }
-
-        // Don't try to render a plot if someone is asking us not to, i.e. `dev.hold()`
-        if !self.should_render.get() {
-            log::trace!("Refusing to render due to `should_render`");
-            return;
-        }


I am pretty sure these are important

Since we now process the requests when R is idle, is_drawing is structurally false, we're never mid-draw.

Regarding should_render (matching dev.hold()), it's true that holds can span across prompts, I can see that happening when developing a plot function interactively. This would require:

Creating a plot that Positron knows about (opened comm)

Calling dev.hold() in the console

Resizing the plot in the frontend UI

So that's a very narrow case. We show the uncommitted visual changes in that case, which seems like a reasonable compromise. The alternative would be to refuse to render, producing a distorted plot in the frontend after resize.

lionel- · 2026-04-03T12:23:38Z

I also fixed a race between comm_open and subsequent comm_msg, the tests revealed it was possible to get the latter before the former. That's because the Shell thread is in charge of sending comm_open on IOPub, so we had a triangular configuration where the R thread and the Shell thread were racing to send their messages on IOPub.

To fix this, I made comm opening blocking. There is a new CommEvent::Barrier variant to support this. On the Shell side, the shell thread now uses a select loop that drains comm events while a request is in flight.

A nice consequence of this is that the frontend now receives new plots as soon as they are created, rather than at the end of a loop:

for (i in 1:3) {
  plot(i)
  Sys.sleep(1)
}

I also noticed that posit-dev/positron#6736 is broken again.
I've opened posit-dev/positron#12825 for this.

And of course this is all now under test coverage.

DavisVaughan

Two main comments to discuss, other things are minor:

Not sure about passing console through handle_msg and friends
Not sure about CommEvent::Barrier vs passing done through CommEvent::Opened

See below for more on both

My test file seems to be working fine now

DavisVaughan · 2026-04-03T16:57:40Z

+            let ready = sel.ready();
+
+            while let Ok(event) = self.comm_event_rx.try_recv() {
+                self.process_comm_event(event);
+            }
+
+            if ready == resp_idx {
+                break response_rx.recv().unwrap();
+            }


So even if sel.ready() returns the execute response as the first "ready" thing, you want to go off and process the comm_events first?

I think I was expecting to see

if ready == resp_idx { break response_rx.recv().unwrap(); }

right after let ready =

DavisVaughan · 2026-04-03T17:21:51Z

+        // Block until Shell has processed the Opened event, ensuring the
+        // `comm_open` message is on IOPub before we return. Any updates


Just a gut check, but is this really what it ensures? In CommEvent::Opened we do an iopub_tx.send() call, but that doesn't necessarily mean that IOPub has received it by the end of CommEvent::Opened and the following CommEvent::Barrier event is processed.

I'm not sure if that matters.

DavisVaughan · 2026-04-03T17:24:03Z

        dap: Arc<Mutex<Dap>>,
        session_mode: SessionMode,
    ) -> Self {
+        let device_context = DeviceContext::new(iopub_tx.clone());


It hides that the device context directly communicates with the frontend via IOPub

Hmm I see what you mean

Like the DeviceContext struct should have an "include what you use" kind of policy in the struct fields, and by not having iopub_tx in there, we mask what it uses

DavisVaughan · 2026-04-03T17:25:22Z

+    #[cfg(feature = "testing")]
+    pub fn test_init() {


Can you put this up with start() and call it test_start()?

DavisVaughan · 2026-04-03T17:40:43Z

    }

-    fn handle_msg(&mut self, msg: CommMsg, ctx: &CommHandlerContext) {
+    fn handle_msg(&mut self, msg: CommMsg, ctx: &CommHandlerContext, _console: &Console) {


I'm not sure I agree with passing console through all these handlers

It seems like the only current use that I can see is in impl CommHandler for PlotComm for handle_msg() and handle_close().

But those could just do Console::get().

It doesn't seem like it helps clarify ownership very much, because so many other things in graphics_device.rs also just do Console::get() already. And it seems likely that in handle_rpc() (called by handle_msg()) we could very easily just do another Console::get() rather than passing through the &Console we got from handle_msg(). It seems complex to me to know when to use which pattern.

It also seems like this _console: &Console argument is the reason for all of the new test_init() infra, and overall this just seems like it adds quite a bit of complexity for (to me) not a lot of obvious benefit 😢

As an example, should on_did_execute_request() gain a console: &Console argument and thread it through process_changes -> process_new_plot -> should_use_dynamic_plots to the Console::get call there?

I don't think so, I feel like it makes sense that we just invoke Console::get() from there as required like we do everywhere else.

But now introducing this new pattern of handle_msg() and friends having a console passed through confuses me and makes me question how we should do things in all the other cases, and I'm not sure that's worth it

DavisVaughan · 2026-04-03T18:03:04Z

 /// but we can still render this intermediate state and show it to the user until
 /// they add more plots or advance to another new page.
 #[tracing::instrument(level = "trace", skip_all)]
 pub(crate) fn on_did_execute_request() {


I think on_did_execute_request() should perhaps be a &self method

It is only called in handle_active_request(), but we have Console there as self, hence we also have DeviceContext, so we can just call self.device_context().on_did_execute_request() from there?

But I am also a bit weirded out by the ownership implications of all this (both the current state of things and my proposal)

self.device_context().on_did_execute_request() would call DeviceContext::process_changes(), but that goes through process_new_plot() and should_use_dynamic_plots() and you end up accessing the "parent" owner of DeviceContext from there via Console::get(), which feels very mind boggling!

DavisVaughan · 2026-04-03T18:11:22Z

-        // Save our new socket.
-        // Refcell Safety: Short borrows in the file.
-        self.sockets.borrow_mut().insert(id.clone(), socket);
+        match Console::get_mut().comm_open_backend(PLOT_COMM_NAME, Box::new(plot_comm)) {


I think this is another place where it feeeels weird that DeviceContext is a field of Console, yet refers to Console as well

DavisVaughan · 2026-04-03T18:16:40Z

 }

-pub(crate) struct Console {
+pub struct Console {


DavisVaughan · 2026-04-03T18:26:55Z

@@ -699,3 +699,481 @@ fn test_plot_default_size_without_metadata() {
    frontend.recv_iopub_idle();


Random note that a cargo build --release is throwing this for me right now

warning: unused import: `libr::SEXP` --> crates/ark/src/r_task.rs:19:5 | 19 | use libr::SEXP; | ^^^^^^^^^^

I think this type is only used in debug builds?

DavisVaughan · 2026-04-03T18:29:51Z

    frontend.recv_shell_execute_reply();
 }
+
+/// Test that `dev.hold()` suppresses intermediate plot output.


It looks like plot(lm(disp ~ drat, data = mtcars)) is working again, but I dont see a backend test for it?

lionel- · 2026-04-03T22:03:52Z

Ok let's discuss Console access by comms in our next sync

Better reflects the borrowing model

lionel- · 2026-04-17T12:27:35Z

Edit: Moved to #1161

Updates following our sync (progress towards #1145).

Console::get() pushed to the FFI boundary

The graphics device callbacks are plain C function pointers - there's no way to pass Rust state through them. Console::get() is unavoidable there. But everything past that boundary now receives &Console explicitly. The C callbacks are thin shims: capture Console::get() once, pass it inward.

DeviceContext no longer reaches back into Console

Previously DeviceContext (a field of Console) called Console::get() internally for should_use_dynamic_plots, capture_execution_context, and comm_open_backend. Now:

session_mode is stored on DeviceContext directly (it's static, set once at startup)
capture_execution_context returns a default when no context was pushed, removing the Console fallback entirely
comm_open_backend takes &self (via RefCell on comms), so DeviceContext calls it through the &Console parameter it already has

PlotComm owns what it needs

PlotComm is the only CommHandler that used the console: &Console parameter (for device_context()). Instead of threading Console through the entire CommHandler trait, PlotComm holds an Rc<DeviceContext>. The console parameter is gone from the trait entirely. Other handlers that need Console access (RDataExplorer opening sub-explorers) call Console::get() themselves - they're at a boundary where that's appropriate.

Comm dispatch is &self throughout

All methods in console_comm.rs take &self. Handler dispatch uses a take/remove pattern: the comm is temporarily removed from its RefCell storage before calling the handler, so no borrow guard is held during the call. This prevents panics if the handler reenters the RefCell (e.g. RDataExplorer calling comm_open_backend). The pattern is encapsulated in helpers (with_ui_comm_mut, with_comm_mut, etc.).

Shell deadlock fix

comm_open_backend blocks on a barrier in CommEvent::Opened to guarantee IOPub ordering. Previously, if a handler called comm_open_backend during comm_msg dispatch, Shell was blocked waiting for the handler to finish and couldn't process the barrier. Now handle_comm_msg, handle_comm_close, and handle_comm_open return a completion receiver instead of blocking. Shell select-loops on it, draining comm events while the handler runs - same pattern that handle_execute_request already used. handle_comm_open is also sync now (was async behind block_on).

Changelog

Explicit &Console on DeviceContext methods**: process_changes, hooks, and their callees receive console: &Console as a parameter instead of calling Console::get() internally. C callbacks and #[harp::register] functions are the only place Console::get() appears.
on_execute_request / on_did_execute_request moved to Console methods in a new console_graphics.rs file, replacing free functions that called Console::get().
DeviceContext wrapped in Rc: Console holds Rc<DeviceContext>. PlotComm gets its own Rc<DeviceContext> at construction, giving it direct access without needing &Console.
console: &Console removed from CommHandler trait: No handler receives &Console anymore. PlotComm uses its Rc<DeviceContext>. Other handlers that need Console access call Console::get() themselves.
session_mode stored on DeviceContext: Passed at construction time instead of queried from Console on every should_use_dynamic_plots call.
capture_execution_context simplified: The Console fallback (console.get_execution_context()) removed. Returns ExecutionContext::default() when no context was pushed. get_execution_context method on Console removed as dead code.
comms field wrapped in RefCell: Enables comm_open_backend to take &self instead of &mut self, so DeviceContext can call it through its &Console parameter.
UI comm stored in its own RefCell field: Separated from the comms HashMap so ui_comm() can borrow independently.
comm_id added to ConsoleComm: The UI comm (stored outside the HashMap) carries its own ID for message routing. Eliminated the UiCommEntry wrapper type.
All console_comm.rs dispatch methods take &self: Use take/remove pattern via helpers (with_ui_comm_mut, with_comm_mut, take_ui_comm, take_comm, etc.) to avoid holding RefCell borrows during handler dispatch.
Shell deadlock fix: handle_comm_msg, handle_comm_close, and handle_comm_open on the ShellHandler trait return a completion Receiver<()> instead of blocking. Amalthea's Shell select-loops on it via drain_comm_events_until to process CommEvents while the handler runs.
handle_comm_open made sync: Was async (always called via block_on). Now sync with completion receiver, consistent with handle_comm_msg / handle_comm_close.
CommEvent::Barrier folded into CommEvent::Opened: The barrier Sender<()> is now an optional third field on Opened, eliminating a separate event variant.
drain_comm_events_until extracted as generic helper: Used by handle_execute_request, handle_comm_msg_request, handle_comm_close_request, and handle_comm_open_request via handle_comm_notification.

lionel- force-pushed the task/sync-plot-comm branch 2 times, most recently from 34baa70 to 845b4a7 Compare March 11, 2026 07:08

lionel- requested a review from DavisVaughan March 11, 2026 13:47

DavisVaughan requested changes Mar 16, 2026

View reviewed changes

lionel- force-pushed the task/sync-ui-comm branch 3 times, most recently from bd54059 to be02717 Compare March 20, 2026 06:19

Base automatically changed from task/sync-ui-comm to main March 20, 2026 06:50

lionel- force-pushed the task/sync-plot-comm branch 2 times, most recently from 7386726 to be3d85c Compare March 31, 2026 13:24

lionel- mentioned this pull request Mar 31, 2026

Add inline data explorers to Ark #1124

Merged

lionel- force-pushed the task/sync-plot-comm branch from be3d85c to 74abf66 Compare April 3, 2026 09:00

lionel- requested a review from DavisVaughan April 3, 2026 12:13

DavisVaughan approved these changes Apr 3, 2026

View reviewed changes

lionel- added 14 commits April 17, 2026 14:09

Run plot comms synchronously on the R thread

12aeef6

Fix dev.hold() issue introduced by pre-renderings

03926f0

Move device context to Console

92f3ed2

Set Prerender settings synchronously from the UI handler

a679edd

Fix message interpolation

959bcb0

Update stale comment

8a7eecf

Address code review

6308311

Pass Console reference to handlers directly

7573327

Better reflects the borrowing model

Add comment about the need for Cell

fa4df0b

Remove unneeded warmup

38bd585

Always record plot even if withheld

2194a31

Ignore UI comm busy events by default in tests

0c42552

Fix synchronisation between comm opening and plot updates

ce50616

Fix semantic conflicts from rebase

9d61505

Fix workspace inheritance

06b0cc8

lionel- force-pushed the task/sync-plot-comm branch from 49a6715 to 9a0725f Compare April 17, 2026 12:16

lionel- force-pushed the task/sync-plot-comm branch 2 times, most recently from a247319 to 06b0cc8 Compare April 22, 2026 07:14

Another rebase fix

4138231

lionel- mentioned this pull request Apr 22, 2026

Review Console reference management #1161

Open

		// Block until Shell has processed the Opened event, ensuring the
		// `comm_open` message is on IOPub before we return. Any updates

		@@ -699,3 +699,481 @@ fn test_plot_default_size_without_metadata() {
		frontend.recv_iopub_idle();

Conversation

lionel- commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavisVaughan left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lionel- commented Apr 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DavisVaughan left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lionel- commented Apr 3, 2026

Uh oh!

lionel- commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lionel- commented Mar 10, 2026 •

edited

Loading

DavisVaughan left a comment •

edited

Loading

lionel- commented Apr 3, 2026 •

edited

Loading

lionel- commented Apr 17, 2026 •

edited

Loading