perf(points): shrink VSOut + per-instance buffer #177
Merged
Three independent VSOut cleanups in the same file pair:

- sizePx (location 13): drop. The fragment used it only to compute the
  procedural-disk crossfade alpha multiplier. All inputs are
  per-instance constants, so the smoothstep moves to the vertex stage
  and folds into out.intensity. Fragment loses a smoothstep + saturate.
- isFallback (location 7): drop. Used by realOnlyMode discard and by
  the magenta highlight tint. realOnlyMode now culls at the vertex
  stage (same trick as Malmquist mode 1), and the magenta multiplier
  bakes into out.tint. Fragment loses a per-pixel discard branch and a
  select. Side benefit: realOnly-gated galaxies are now also
  non-pickable, which fixes a pre-existing inconsistency where they
  were invisible but the pick fragment still wrote their identity.
- paCs + paSn (locations 6, 15): pack into one vec2<f32> paRotation at
  location 6. Same wire bytes, frees location 15.

Net: VSOut 9 locations -> 6, 64 B -> 56 B. Fragment loses ~5 ALU ops
per pixel; vertex picks up cheap per-instance work.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
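A CPU-side sketch of why the sizePx hoist is safe: when every input to the smoothstep is constant across an instance, evaluating it once per vertex gives the same multiplier that each fragment used to compute. The fade window constants, the fade direction, and both function names are hypothetical; the commit message does not state the real thresholds.

```typescript
// WGSL-style smoothstep, written out for a CPU-side illustration.
function smoothstep(edge0: number, edge1: number, x: number): number {
  const t = Math.min(1, Math.max(0, (x - edge0) / (edge1 - edge0)));
  return t * t * (3 - 2 * t);
}

// Hypothetical crossfade window in pixels (assumption, not from the PR).
const FADE_START_PX = 8;
const FADE_END_PX = 16;

// Before: each fragment recomputed the multiplier from the sizePx varying.
function fragmentAlphaBefore(intensity: number, sizePx: number): number {
  const fade = 1 - smoothstep(FADE_START_PX, FADE_END_PX, sizePx);
  return intensity * fade;
}

// After: the vertex stage folds the multiplier into intensity once per
// instance, so sizePx no longer needs to cross the stage boundary.
function vertexIntensityAfter(intensity: number, sizePx: number): number {
  return intensity * (1 - smoothstep(FADE_START_PX, FADE_END_PX, sizePx));
}
```

Because sizePx does not vary across the quad, interpolating the folded intensity yields the same per-pixel value as the old per-fragment computation.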
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! | skymap | 85ba2eb | Commit Preview URL, Branch Preview URL | May 20 2026, 03:06 AM |
kPerZ is the K-correction coefficient: a single linear factor per
survey (SDSS=3.0, GLADE=1.0, 2MRS=0.0, Famous=0.0, Synthetic=3.0) baked
into every row of the per-instance vertex buffer. Per-row storage paid
2.5M copies of the same handful of constants.

Move kPerZ into SourceUniforms (the existing @group(2) per-survey
uniform that already carried sourceCode + 12 B padding). Free pad slot
at offset 4 absorbs the f32; no buffer-size or alignment churn. The
vertex shader reads source.kPerZ instead of p.kPerZ.

Verified consumer graph: kPerZ as a value is only consumed by the
points pipeline. pickColourIndex's secondary caller
(proceduralDiskSubsystem) already discards the kPerZ field; only the
bake site used it, and that write now goes away.

Sentinel-colour rows (colorIndex >= 100) previously wrote kPerZ = 0;
the shader's select gates the K-correction off via the sentinel check,
so the per-row value never mattered. After the move, all rows use the
survey constant; sentinel rows still get the 1.05 substitution.

Net:

- Vertex buffer: 12 slots -> 11, 48 B -> 44 B per instance. At 2.5M
  galaxies that's ~10 MB saved on the GPU.
- One fewer per-vertex attribute fetch.
- Slot indices shift down by one for axisRatio (6 -> 5),
  positionAngleDeg (7 -> 6), diameterKpc (8 -> 7), vMaxWeight (9 -> 8),
  schechterRatio (10 -> 9), angularDensityWeight (11 -> 10).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
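A minimal sketch of the layout and the arithmetic described above, assuming SourceUniforms is a 16-byte block with a u32 sourceCode at offset 0 and the f32 kPerZ in the former pad slot at offset 4. The function names and the sourceCode value used in the usage line are hypothetical.

```typescript
// u32 sourceCode at offset 0, then 12 B that used to be pure padding.
const SOURCE_UNIFORMS_BYTES = 16;

// Per-survey K-correction coefficients as listed in the commit message.
const K_PER_Z: Record<string, number> = {
  SDSS: 3.0,
  GLADE: 1.0,
  "2MRS": 0.0,
  Famous: 0.0,
  Synthetic: 3.0,
};

// Hypothetical helper: writes the uniform block without changing its size.
function packSourceUniforms(sourceCode: number, kPerZ: number): ArrayBuffer {
  const buf = new ArrayBuffer(SOURCE_UNIFORMS_BYTES);
  const view = new DataView(buf);
  view.setUint32(0, sourceCode, true); // little-endian
  view.setFloat32(4, kPerZ, true);     // reuses the pad slot at offset 4
  return buf;
}

// Each per-instance slot is one 4-byte value.
function perInstanceBytes(slots: number): number {
  return slots * 4;
}

const INSTANCES = 2_500_000;
const savedBytes = (perInstanceBytes(12) - perInstanceBytes(11)) * INSTANCES;

// Usage: sourceCode 0 is a placeholder, not the project's real encoding.
const sdssUniforms = packSourceUniforms(0, K_PER_Z["SDSS"]);
```

Dropping one 4-byte slot across 2.5M instances gives 10,000,000 bytes, which matches the ~10 MB figure in the commit message.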
Two more VSOut tightenings:

- Pre-compute the elliptical-mask coefficient (safeAB) at the vertex
  stage and pack into the unused alpha channel of tint. Fragment reads
  in.tint.w directly with zero per-pixel axis-ratio work, which saves a
  select + max + sign-check per pixel. The axisRatio location goes away
  entirely.
- Renumber paRotation from location 6 to location 4 so the VSOut
  locations are contiguous (0..4 with no gaps).

Net: VSOut 6 locations -> 5, 56 B -> 52 B. Fragment shader keeps
shrinking.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop the standalone intensity varying by pre-multiplying it into the
rgb channels of the per-instance colour vec4 at the vertex stage.
Fragment reads in.shaded.rgb directly with no per-pixel mul.

Renamed tint -> shaded since the field no longer carries a 'tint'
(modifier) but a fully-lit RGB premultiplied with intensity, plus the
safeAB ellipse-mask coefficient packed into .w (unchanged by this
commit).

The invisibility cull now reads the local intensity scalar; behaviour
unchanged.

Net: VSOut 5 locations -> 4, 52 B -> 48 B.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
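The premultiplication can be illustrated on the CPU as follows. This is only a sketch of the packing: the Vec4 type and the shade function are hypothetical names, and the real work happens in the vertex shader.

```typescript
type Vec3 = [number, number, number];
type Vec4 = [number, number, number, number];

// Folds intensity into rgb once per instance and carries safeAB in .w,
// so the fragment stage can read the lit colour without a multiply.
function shade(rgb: Vec3, intensity: number, safeAB: number): Vec4 {
  return [rgb[0] * intensity, rgb[1] * intensity, rgb[2] * intensity, safeAB];
}

// Usage: one vec4 now replaces a vec4 tint plus an f32 intensity varying.
const shaded = shade([1.0, 0.5, 0.25], 2.0, 0.7);
```

Removing the separate f32 varying accounts for the 4-byte drop from 52 B to 48 B.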
The same galaxy was rendering different hues in the points pass vs the
procedural-disk impostor pass, with two divergences:

1. Points applied K-correction in the vertex shader; procDisks didn't
   apply it at all. At non-trivial redshift, hues drifted apart.
2. The unknown-band fallback was 1.05 in the points shader and 1.0 in
   the procDisk subsystem. Different ramp positions = different hue.

Move K-correction into pickColourIndex() so both consumers get the same
rest-frame value with the shared UNKNOWN_COLOUR_RAMP_POSITION fallback.
The shader drops its K-correction block (HUBBLE_DISTANCE_MPC + zRedshift
+ sentinel check + select) entirely.

Function signature collapses from { colourIndex, kPerZ } | null to
number. Neither caller distinguished null from "got data" (they both
substituted the same fallback), so the nullable was paying for an
option nobody exercised. Both call sites now read identically:

    const colourIndex = pickColourIndex(source, magU..magZ, dMpc);

Side effects:

- SourceUniforms.kPerZ slot reverts to padding (no longer read by GPU).
- pointRenderer.ts drops the per-survey kPerZ write.
- NO_COLOUR_SENTINEL constant goes away (1.05 is baked directly).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
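A rough sketch of the collapsed signature, under several loudly stated assumptions: the colour index is taken as g - r, the K-correction is a linear subtraction of kPerZ * z, redshift is approximated as dMpc / HUBBLE_DISTANCE_MPC, and the Hubble distance uses H0 = 70 km/s/Mpc. None of those details are confirmed by the commit message; only the kPerZ table, the 1.05 fallback, and the number return type come from it. The function name is deliberately different from the real pickColourIndex.

```typescript
type Survey = "SDSS" | "GLADE" | "2MRS" | "Famous" | "Synthetic";

// Coefficients quoted in the earlier kPerZ commit.
const K_PER_Z: Record<Survey, number> = {
  SDSS: 3.0,
  GLADE: 1.0,
  "2MRS": 0.0,
  Famous: 0.0,
  Synthetic: 3.0,
};

// Shared fallback for rows with no usable bands (value from the commit).
const UNKNOWN_COLOUR_RAMP_POSITION = 1.05;

// Assumption: c / H0 with H0 = 70 km/s/Mpc; the project value may differ.
const HUBBLE_DISTANCE_MPC = 299792.458 / 70;

// Hypothetical stand-in for pickColourIndex: always returns a number.
function pickColourIndexSketch(
  source: Survey,
  magG: number,
  magR: number,
  dMpc: number,
): number {
  if (!Number.isFinite(magG) || !Number.isFinite(magR)) {
    return UNKNOWN_COLOUR_RAMP_POSITION; // same fallback for every caller
  }
  const observed = magG - magR;          // assumed band pair
  const z = dMpc / HUBBLE_DISTANCE_MPC;  // low-redshift approximation
  return observed - K_PER_Z[source] * z; // assumed linear K-correction
}
```

Returning a plain number means both the points bake and the procedural-disk subsystem receive the same rest-frame value and the same fallback, which is the point of the change.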
The 4× thumbnail-footprint padding + 30-kpc synthetic-fallback floor +
kpc->Mpc unit conversion was inlined at three sites:

- buildPointInterleavedBuffer (points bake, was kpc; shader converted)
- proceduralDiskSubsystem (full-extent in Mpc)
- texturedImpostorSubsystem (full-extent in Mpc)

A change to any of those constants had to land in all three in
lockstep. Centralise into src/utils/galaxySize.ts as
paddedRadiusMpc(diameterKpc). The two subsystems multiply by 2 at the
call site for their full-quad-extent convention (vertex shader halves
at corner expansion); the points bake uses the helper output directly
as half-extent.

While in the neighbourhood, switch the points pipeline to Mpc units to
match every other shader:

- Vertex buffer slot 7 was raw diameterKpc; shader applied '* 2 / 1000'
  to convert. Now pre-baked as padded radius in Mpc.
- PerVertex field renamed diameterKpc -> radiusMpc.
- Shader drops the safeDiameterKpc select + GALAXY_RADIUS_MPC compute
  and reads p.radiusMpc directly.

Raw cloud.diameterKpc (the catalog's source-of-truth in kpc) is
unchanged; only the GPU interleaved buffer's slot semantics shifted.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
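One way the centralised helper might look, assuming the 30-kpc floor is a substitute for missing or non-positive diameters (the commit message does not say whether it is a substitute or a lower clamp). The constant names are hypothetical; the 4× padding, the 30 kpc value, and the kpc-to-Mpc conversion are taken from the description above.

```typescript
const THUMBNAIL_PADDING = 4;      // 4x thumbnail-footprint padding
const FALLBACK_DIAMETER_KPC = 30; // synthetic-fallback floor
const KPC_PER_MPC = 1000;

// Sketch of paddedRadiusMpc: half the diameter, padded, converted to Mpc.
// Equivalent to the old shader's 'diameterKpc * 2 / 1000'.
function paddedRadiusMpc(diameterKpc: number): number {
  const safeDiameterKpc =
    Number.isFinite(diameterKpc) && diameterKpc > 0
      ? diameterKpc
      : FALLBACK_DIAMETER_KPC; // assumption: fallback replaces bad values
  return ((safeDiameterKpc / 2) * THUMBNAIL_PADDING) / KPC_PER_MPC;
}

// Points bake: helper output is used directly as the half-extent.
const pointsHalfExtentMpc = paddedRadiusMpc(50);

// Disk subsystems: doubled at the call site for the full-quad extent.
const diskFullExtentMpc = 2 * paddedRadiusMpc(50);
```

Keeping the factor of 2 at the call sites leaves the helper with a single meaning (padded radius), so the two conventions cannot drift inside it.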
The textured-galaxy-thumbnail pipeline had inconsistent names along its
chain: shaders/disks/ (folder), texturedDiskRenderer.ts (GPU consumer),
texturedImpostorSubsystem.ts (engine driver). 'Impostor' is legitimate
graphics jargon for a billboard-as-3D-approximation, but the three-way
naming mismatch obscured the relationship between the layers.

Rename:

- shaders/disks/ -> shaders/texturedDisks/
  Parallels the existing shaders/proceduralDisks/ sibling.
- texturedImpostorSubsystem -> texturedDiskSubsystem
  Aligns with texturedDiskRenderer.ts and texturedDisks/ shaders.

All identifiers (PascalCase + camelCase + plural field name) renamed
across 32 files. WESL imports updated. Stale 'disks.wesl'
cross-references in lib/* shader comments cleaned up.

No behaviour change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sweep of the leftovers the sed didn't catch (compound terms like
'textured-impostor' separator-style, plus historical narrative
comments referring to the pre-rename layout). Generic uses of
'impostor' as a graphics term (e.g. proceduralDisks docblock
describing what texturedDisks IS) are left intact — those are
legitimate jargon, not subsystem references.
While in the neighbourhood, trim a handful of historical comments
('post-split', 'Task 11/12', 'legacy textured-impostors slot',
'2026-05-18 quad-removal') per the project's comment-style
convention against history notes in code.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Two passes of points-pipeline cleanup, riding on the selection-ring extraction that just landed.
VSOut shrinks (varying bandwidth)
VSOut: 9 locations → 6, 64 B → 56 B. Fragment loses ~5 ALU ops per pixel.
Per-instance buffer shrink
PerVertex: 12 slots → 11, 48 B → 44 B per instance. ~10 MB saved on the GPU at the large tier.
Side benefit
realOnly-gated galaxies are now also non-pickable. Previously they were invisible but the pick fragment still wrote their identity — a pre-existing inconsistency.
Test plan
🤖 Generated with Claude Code