Skip to content

vello_hybrid: Optimize full viewport rectangles that act as a background#1622

Open
LaurenzV wants to merge 2 commits into
mainfrom
laurenz/set_bg
Open

vello_hybrid: Optimize full viewport rectangles that act as a background#1622
LaurenzV wants to merge 2 commits into
mainfrom
laurenz/set_bg

Conversation

@LaurenzV
Copy link
Copy Markdown
Collaborator

@LaurenzV LaurenzV commented May 5, 2026

In a lot of cases, it is common to draw a full viewport-sized (or lager) rectangle which acts as a background. For example, when rendering a PDF you usually always render them against a white background. This PR adds an optimization such that if this case is detected, instead of actually drawing a fully-sized rectangle we instead piggyback the clear operation that is applied at the start of each render operation to clear the canvas to the given color. While it might always provide a notable benefit since drawing rectangles is relatively cheap, as we have noticed, on low tier devices any saved fragment shader invocations can potentially be very helpful!

@LaurenzV LaurenzV requested a review from grebmeg May 5, 2026 13:00
@LaurenzV
Copy link
Copy Markdown
Collaborator Author

LaurenzV commented May 5, 2026

I guess for the blending code path this once again isn't as straight-foward. 😣 Putting this into draft for now.

@LaurenzV LaurenzV marked this pull request as draft May 5, 2026 13:26
@LaurenzV LaurenzV changed the title vello_hybrid: Add a set_background method vello_hybrid: Optimize full viewport rectangles that act as a background May 7, 2026
@laurenz-canva laurenz-canva force-pushed the laurenz/set_bg branch 4 times, most recently from fdd2429 to 223ac6a Compare May 7, 2026 15:00
@LaurenzV LaurenzV marked this pull request as ready for review May 7, 2026 15:02
@LaurenzV
Copy link
Copy Markdown
Collaborator Author

LaurenzV commented May 7, 2026

I've repurposed this now to make the detection automatic instead of requiring the user to call a set_background method.

|| self.wide.has_layers()
|| self.wide.has_coarse_batches()
|| self.clip_context.get().is_some()
|| !is_axis_aligned(&self.render_state.transform)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically we don't need this condition. It could be any transformation as long as the transformed rect contains all four corners of the viewport. This could be done by either

  • applying the inverse transformation to four corners of viewport then calculate whether they are all within this rect, or
  • check the transformed rect as a polygon contains all four corners.

(This precise check isn't necessary for our use case though. In our use case this optimization would be able to apply even if it says the transformation must be identity.)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that's true, but not sure it's worth the trouble. 😄 I already tested it on our side and as soon as default blending scene constraints are enabled, the optimization kicked off on our side.

&& transformed_rect.x1 >= f64::from(self.width)
&& transformed_rect.y1 >= f64::from(self.height)
{
self.background = Some(PremulColor::from_alpha_color(*color));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with how Vello works, but I wonder if we are putting it as a background here, wouldn't we need to also clear existing paint below somehow (or prevent new paint below it to be added)? Otherwise the z-order could be wrong. I couldn't find any such code in this PR. Or is it the fast_strips_buffer check ensures no other command exists yet?

(If fast_strips_buffer contains all existing paints below, it might be useful we just clear it here as oppose to prevent this optimization from happening?)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah we might be able to add such an optimization, I'd have to think about it a bit more though, so would prefer leaving this for a follow-up. But yes, in this case checking fast_strips_buffer is empty and self.wide.has_coarse_batches() is enough to determine that nothing has been drawn so far, so for now this should be safe.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’d be interesting to do a spike in the future and see what that kind of optimisation would look like, and to have a look at the numbers. Overall though, we already have similar optimisations in place in both areas: in coarse, we discard all commands before the fill that cover the entire wide tile, while in the fast path we rely on front-to-back depth buffer rendering to discard draws behind existing geometry. That said, as I mentioned, it’d still be interesting to see whether the background setup between calls ends up clearing previously pushed commands.

return;
}

if self.try_set_background_from_rect(rect) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on why you abandoned the idea of having a separate method? I see several benefits in having an explicit API: zero detection logic (the 9-condition guard in try_set_background_from_rect disappears entirely), which is important because we want fill_rect to stay as fast as possible. It also works under any constraint setting, without requiring a default_blending_only gate, and avoids fragility around conditions like “must be the first draw”, “must cover the viewport exactly”, or “must be axis-aligned”. Most importantly, the user can express intent explicitly. The renderer can then simply always read scene.background and use it as the clear color.

What we’re effectively trying to do here is detect whether the current rect fills the entire viewport area. I think we should keep this logic simple and let the consumer decide that explicitly instead of relying on heuristics.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, there are some duplicated checks between try_set_background_from_rect and try_fast_rect.

Copy link
Copy Markdown
Collaborator Author

@LaurenzV LaurenzV May 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please elaborate on why you abandoned the idea of having a separate method?

This was mainly motivated by @upsuper's comment, I was a bit skeptical first but I think I agree now that it's better to not add additional API surface for this and just have this happen automatically instead.

which is important because we want fill_rect to stay as fast as possible

I very highly doubt that this will have any impact on performance. The method is inlined and in the vast majority of the cases we will already short-circuit on the first condition !self.fast_strips_buffer.commands.is_empty() (I purposefully put this check first since it's the most likely one to be true), so this should not matter at all, I think.

It also works under any constraint setting, without requiring a default_blending_only gate

Unfortunately it doesn't, I tried making it work with that but it caused issue with tests that blend into the root (because then we can't apply the background to the user surface directly), and I wasn't able to find a solution that didn't require changing a lot of code.

Most importantly, the user can express intent explicitly.

Yeah, I agree that from this perspective having the set_background API is nicer.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think set_background is better then let's stick to that. As mentioned, I'm not quite sure if it can be made to work nicely with blending though. I would have to try it. If it doesn't work, we will likely have to shelve this change until we manage to get rid of the scene constraints thing.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I’d be in favour of landing this in some form rather than shelving it. I also don’t think these two ideas contradict each other: having set_background and automatic background detection.

From a user’s perspective, I like that skia has a clear method, which is similar in many ways to ours. That method makes it obvious how to set a background colour and that previous commands are discarded. This is actually why I lean more towards having a similar clear method, because it communicates the intent more clearly, including the fact that it resets prior commands.

Moreover, if the consumer has, as I think they should for performance reasons, some data structure representing the scene as AABB hierarchies, then they should already know the background and be able to use that method explicitly. If they do not, we could still rely on background detection, which likely would not degrade performance significantly. However, those 9 conditions do make the code more fragile.

Because of that, I’d recommend splitting this work into two parts:

  1. (this PR) Introduce a clear method, which for now we only allow as the first command.
  2. (in the future) Add background detection separately, and later support clear in arbitrary positions with proper propagation of the background into the wide-tile composition path.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how to deal with the fact that it doesn't work with no scene constraints? Should we just fallback to rendering the background as a viewport-sized rectangle then?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that could work — our current background optimization for tiles with opaque images should still behave correctly by reseting the tiles' commands to only background color, though probably not for colors with opacity? I’m more thinking about propagating the color when using clear to wide tiles — we could essentially just set the background color on each wide tile and clear the commands?

&& transformed_rect.x1 >= f64::from(self.width)
&& transformed_rect.y1 >= f64::from(self.height)
{
self.background = Some(PremulColor::from_alpha_color(*color));
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it’d be interesting to do a spike in the future and see what that kind of optimisation would look like, and to have a look at the numbers. Overall though, we already have similar optimisations in place in both areas: in coarse, we discard all commands before the fill that cover the entire wide tile, while in the fast path we rely on front-to-back depth buffer rendering to discard draws behind existing geometry. That said, as I mentioned, it’d still be interesting to see whether the background setup between calls ends up clearing previously pushed commands.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants