vello_hybrid: Avoid unnecessarily allocating Uint32Arrays#1498
Conversation
| // See also: https://wikis.khronos.org/opengl/Synchronization | ||
| // >> There are several OpenGL functions that can pull data directly from client-side memory, | ||
| // >> or push data directly into client-side memory. Functions like `glTexSubImage2D`, | ||
| // >> `glReadPixels`, `glBufferSubData` and so forth. | ||
| // | ||
| // >> Because OpenGL is defined to be synchronous, when any of these functions have | ||
| // >> returned, they must have finished with the client memory. When `glReadPixels` returns, | ||
| // >> the pixel data is in your client memory (unless you are reading into a buffer object). | ||
| // >> When `glBufferSubData` returns, you can immediately modify or delete whatever memory | ||
| // >> pointer you gave it, as OpenGL has already read as much as it wants. |
There was a problem hiding this comment.
I presume this applies to WebGL too, then, right?
|
Would you like to have this soon or shall I wait for Alex's review as well? |
| fn upload_data_to_rgba32_texture( | ||
| gl: &WebGl2RenderingContext, | ||
| data: &[u32], | ||
| texture_width: u32, | ||
| texture_height: u32, | ||
| ) { | ||
| // Safety: This calling `Uint32Array::view` is unsafe because it provides a view into | ||
| // WASM linear memory, and any additional allocations might invalidate that view. | ||
| // In our case, this is not an issue because we only use this view once for uploading | ||
| // data to the GPU below, and no allocations happen between that. | ||
| // The `tex_image_2d` method is synchronous in the sense that once it returns, it is guaranteed | ||
| // that all necessary data has already been read, so any allocations that happen | ||
| // after this block don't affect this anymore. | ||
| // | ||
| // See also: https://wikis.khronos.org/opengl/Synchronization | ||
| // >> There are several OpenGL functions that can pull data directly from client-side memory, | ||
| // >> or push data directly into client-side memory. Functions like `glTexSubImage2D`, | ||
| // >> `glReadPixels`, `glBufferSubData` and so forth. | ||
| // | ||
| // >> Because OpenGL is defined to be synchronous, when any of these functions have | ||
| // >> returned, they must have finished with the client memory. When `glReadPixels` returns, | ||
| // >> the pixel data is in your client memory (unless you are reading into a buffer object). | ||
| // >> When `glBufferSubData` returns, you can immediately modify or delete whatever memory | ||
| // >> pointer you gave it, as OpenGL has already read as much as it wants. | ||
| let packed_array = unsafe { js_sys::Uint32Array::view(data) }; | ||
|
|
||
| gl.tex_image_2d_with_i32_and_i32_and_i32_and_format_and_type_and_opt_array_buffer_view( | ||
| WebGl2RenderingContext::TEXTURE_2D, | ||
| 0, | ||
| WebGl2RenderingContext::RGBA32UI as i32, | ||
| texture_width as i32, | ||
| texture_height as i32, | ||
| 0, | ||
| WebGl2RenderingContext::RGBA_INTEGER, | ||
| WebGl2RenderingContext::UNSIGNED_INT, | ||
| Some(&packed_array), | ||
| ) | ||
| .unwrap(); | ||
| } |
There was a problem hiding this comment.
One thought: I think maybe you can avoid the unsafe entirely by using the _opt_u8_array overload instead. As I understand it when web-sys receives a &[u8] parameter, wasm-bindgen passes it as a zero-copy Uint8Array view into wasm linear memory, same perf as the manual Uint32Array::view(). WebGL interprets the raw bytes according to the type parameter (UNSIGNED_INT), not the JS typed array type, so a Uint8Array view works correctly for RGBA32UI textures.
This would simplify the helper to:
fn upload_data_to_rgba32_texture(
gl: &WebGl2RenderingContext,
data: &[u8],
texture_width: u32,
texture_height: u32,
) {
gl.tex_image_2d_with_i32_and_i32_and_i32_and_format_and_type_and_opt_u8_array(
WebGl2RenderingContext::TEXTURE_2D,
0,
WebGl2RenderingContext::RGBA32UI as i32,
texture_width as i32,
texture_height as i32,
0,
WebGl2RenderingContext::RGBA_INTEGER,
WebGl2RenderingContext::UNSIGNED_INT,
Some(data),
)
.unwrap();
}There was a problem hiding this comment.
Oh! Lemme try this out!
There was a problem hiding this comment.
Unfortunately, that doesn't seem to work correctly due to the fact that we are using RGBA32 textures.
There was a problem hiding this comment.
Got it, thanks for checking! I think the next option might work, but I’m not sure that making two calls is better than the current approach.
fn upload_data_to_rgba32_texture(
gl: &WebGl2RenderingContext,
data: &[u8],
texture_width: u32,
texture_height: u32,
pixel_unpack_buffer: &WebGlBuffer,
) {
gl.bind_buffer(
WebGl2RenderingContext::PIXEL_UNPACK_BUFFER,
Some(pixel_unpack_buffer),
);
gl.buffer_data_with_u8_array(
WebGl2RenderingContext::PIXEL_UNPACK_BUFFER,
data,
WebGl2RenderingContext::STREAM_DRAW,
);
gl.tex_image_2d_with_i32_and_i32_and_i32_and_format_and_type_and_f64(
WebGl2RenderingContext::TEXTURE_2D,
0,
WebGl2RenderingContext::RGBA32UI as i32,
texture_width as i32,
texture_height as i32,
0,
WebGl2RenderingContext::RGBA_INTEGER,
WebGl2RenderingContext::UNSIGNED_INT,
0.0,
)
.unwrap();
gl.bind_buffer(WebGl2RenderingContext::PIXEL_UNPACK_BUFFER, None);
}There was a problem hiding this comment.
Hmm, but wouldn't this still invoke another copy to the unpack buffer first before copying it to the GPU, whch is what we've been trying to avoid?
Since you both approved I think I'll merge it in its current state, but let me know if you would like to see adjustments.
| fn upload_data_to_rgba32_texture( | ||
| gl: &WebGl2RenderingContext, | ||
| data: &[u32], | ||
| texture_width: u32, | ||
| texture_height: u32, | ||
| ) { | ||
| // Safety: This calling `Uint32Array::view` is unsafe because it provides a view into | ||
| // WASM linear memory, and any additional allocations might invalidate that view. | ||
| // In our case, this is not an issue because we only use this view once for uploading | ||
| // data to the GPU below, and no allocations happen between that. | ||
| // The `tex_image_2d` method is synchronous in the sense that once it returns, it is guaranteed | ||
| // that all necessary data has already been read, so any allocations that happen | ||
| // after this block don't affect this anymore. | ||
| // | ||
| // See also: https://wikis.khronos.org/opengl/Synchronization | ||
| // >> There are several OpenGL functions that can pull data directly from client-side memory, | ||
| // >> or push data directly into client-side memory. Functions like `glTexSubImage2D`, | ||
| // >> `glReadPixels`, `glBufferSubData` and so forth. | ||
| // | ||
| // >> Because OpenGL is defined to be synchronous, when any of these functions have | ||
| // >> returned, they must have finished with the client memory. When `glReadPixels` returns, | ||
| // >> the pixel data is in your client memory (unless you are reading into a buffer object). | ||
| // >> When `glBufferSubData` returns, you can immediately modify or delete whatever memory | ||
| // >> pointer you gave it, as OpenGL has already read as much as it wants. | ||
| let packed_array = unsafe { js_sys::Uint32Array::view(data) }; | ||
|
|
||
| gl.tex_image_2d_with_i32_and_i32_and_i32_and_format_and_type_and_opt_array_buffer_view( | ||
| WebGl2RenderingContext::TEXTURE_2D, | ||
| 0, | ||
| WebGl2RenderingContext::RGBA32UI as i32, | ||
| texture_width as i32, | ||
| texture_height as i32, | ||
| 0, | ||
| WebGl2RenderingContext::RGBA_INTEGER, | ||
| WebGl2RenderingContext::UNSIGNED_INT, | ||
| Some(&packed_array), | ||
| ) | ||
| .unwrap(); | ||
| } |
There was a problem hiding this comment.
Got it, thanks for checking! I think the next option might work, but I’m not sure that making two calls is better than the current approach.
fn upload_data_to_rgba32_texture(
gl: &WebGl2RenderingContext,
data: &[u8],
texture_width: u32,
texture_height: u32,
pixel_unpack_buffer: &WebGlBuffer,
) {
gl.bind_buffer(
WebGl2RenderingContext::PIXEL_UNPACK_BUFFER,
Some(pixel_unpack_buffer),
);
gl.buffer_data_with_u8_array(
WebGl2RenderingContext::PIXEL_UNPACK_BUFFER,
data,
WebGl2RenderingContext::STREAM_DRAW,
);
gl.tex_image_2d_with_i32_and_i32_and_i32_and_format_and_type_and_f64(
WebGl2RenderingContext::TEXTURE_2D,
0,
WebGl2RenderingContext::RGBA32UI as i32,
texture_width as i32,
texture_height as i32,
0,
WebGl2RenderingContext::RGBA_INTEGER,
WebGl2RenderingContext::UNSIGNED_INT,
0.0,
)
.unwrap();
gl.bind_buffer(WebGl2RenderingContext::PIXEL_UNPACK_BUFFER, None);
}
For the GhostScript tiger scene:
Before:

texImage2Das well asUInt32Array::copy_from_sliceare visible in the profile.After: Only

texImage2Dis visibile in the profile.