Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,71 @@ Optimize for performance.
- Touch only what you must. Clean up only your own mess.
- Define success criteria. Loop until verified.

## TV Performance Model (READ BEFORE OPTIMIZING)

Target devices are embedded TV SoCs running Chrome 38+. Their cost model is
different from desktop — optimize against this model, not intuition:

1. **GL calls cost CPU, not GPU.** Every WebGL call is serialized into
Chrome's GPU-process command buffer (~0.5–2µs each on TV CPUs). A screen
of 50 single-op cards × ~12 calls each ≈ 0.5–1ms/frame of pure driver
CPU. Reducing _calls_ (uniforms, buffer uploads) usually beats reducing
_draw calls_ — measured: merging draw calls alone did not change FPS.
2. **`bufferData` is a driver-side CPU copy.** A per-frame upload of a few
hundred KB (e.g. the SDF text buffer) is a guaranteed CPU tax. Skip
uploads whose bytes are provably unchanged.
3. **Fill rate is the GPU wall.** Per-fragment cost = shader ALU + varying
interpolation + blending bandwidth. Varyings that are constant across a
quad are pure waste — compute them on the CPU and upload as uniforms.
Uniform-based branches are effectively free; quad-constant data through
interpolators is not.
4. **To find the bottleneck:** render at half-resolution backbuffer — if FPS
jumps, you're fill-rate bound (attack shaders/overdraw); if not, trace
CPU time in `drawFrame` vs the GPU process (attack GL call count and
JS update work).

### Caching layers and their invariants (do not break)

- **Shader programs** are cached per shader key (`shCache`) — one compile
ever. Shader _nodes_ are NOT cached: each `createShader()` call allocates
(~10 defineProperty closures). Reuse shader nodes across remounts.
- **Uniform values are a function of `resolvedProps` + node `w`/`h` and
NOTHING else.** This is the shader value-key cache contract
(`CoreShaderNode.createValueKey`). Never make a shader's `update()` read
position, time, or any other state — caching at three layers assumes this.
- **Uniform collections are immutable after fill and shared by reference**
across shader nodes with equal value keys. Reference equality implies
value equality — `WebGlShaderProgram.bindRenderOp` skips re-uploads on
this basis. Never mutate a collection after `update()` fills it.
- **GL uniform values are per-program state** that persists across
`useProgram` switches. `bindRenderOp` is the only writer and keeps shadow
copies — if you add another uniform writer, update the shadows or you
will create stale-skip bugs.
- **`UpdateType.RecalcUniforms` fires only on dimension changes** (w/h
setters, Autosizer, text layout) and shader assignment — never on pure
translation. If you add a new `props.w`/`props.h` writer, raise the flag
there too.
- **The SDF buffer upload is skipped when content is unchanged**
(`sdfBufferChanged` + size match). The cache-hit path
(`addSdfCachedQuads`) must keep writing byte-identical data at identical
offsets; any new SDF write path or reorder source must set
`sdfBufferChanged = true`. Conservative direction: when in doubt, force
the upload — a redundant upload is correct, a wrong skip is a glitch.
- **SDF vertex caches are world-space**: they hit only while a text node's
transform is static. Layout caches hit regardless of position.

### Rules for new optimizations

- Derive dirty signals **inside the renderer from which code path ran**
when possible (cheap, no invalidation matrix) instead of scene-graph
hooks (every writer must be found, and a missed one is a heisenbug).
- Every skip must fail conservative: uncertainty → do the work.
- Per-quad-constant values belong in uniforms; per-quad-varying values
belong in vertex attributes; never ship constants through varyings.
- Know the scroll path: rows translate under a static clipping viewport.
Translation must stay on fast paths — anything added per-`Global`-update
runs for every node of every scroll frame.

### Architecture Principles

- **Class-based design** - Use TypeScript classes for structure and type safety
Expand Down
4 changes: 3 additions & 1 deletion examples/common/Character.ts
Original file line number Diff line number Diff line change
Expand Up @@ -72,12 +72,14 @@ export class Character {
const nextFrame = () => {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
this.node.texture = this.rightFrames[curI]!;
this.node.textureOptions.flipX = flipX;
curI++;
if (curI > iEnd) {
curI = iStart;
}
};
// textureOptions must be replaced wholesale, never mutated in place —
// nodes created without options share a frozen default object.
this.node.textureOptions = { flipX };
nextFrame();
this.curIntervalAnimation = setInterval(nextFrame, intervalMs);
}
Expand Down
5 changes: 4 additions & 1 deletion examples/tests/texture-cleanup-critical.ts
Original file line number Diff line number Diff line change
Expand Up @@ -51,13 +51,16 @@ See docs/ManualRegressionTests.md for more information.
fontSize: 40,
});

// textureOptions must be replaced wholesale, never mutated in place —
// nodes created without options share a frozen default object.
screen.textureOptions = { preload: true };

// Create a new random texture every 10ms
setInterval(() => {
screen.texture = renderer.createTexture('NoiseTexture', {
w: 500,
h: 500,
cacheId: Math.floor(Math.random() * 100000),
});
screen.textureOptions.preload = true;
}, 100);
}
24 changes: 23 additions & 1 deletion visual-regression/src/index.ts
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ let snapshotsTested = 0;
let snapshotsPassed = 0;
let snapshotsFailed = 0;
let snapshotsSkipped = 0;
const pageErrors: string[] = [];

/**
* The runtime environment (local, ci, etc.)
Expand Down Expand Up @@ -387,6 +388,18 @@ async function runTest(
);
}

// Fail the run on uncaught page errors. Pixels can still match while an
// example throws every frame (e.g. a frozen textureOptions mutation), so
// a green snapshot alone is not enough. Dedupe because rAF-driven errors
// repeat identically each frame.
page.on('pageerror', (err) => {
const message = `worker[${shardIndex}] (${renderMode}): ${err.message}`;
if (pageErrors.indexOf(message) === -1) {
pageErrors.push(message);
console.log(chalk.red.bold(`PAGE ERROR! ${message}`));
}
});

await page.exposeFunction('snapshot', makeSnapshotHandler(page));

const donePromise = new Promise<void>((resolve) => {
Expand Down Expand Up @@ -462,7 +475,16 @@ async function runTest(
console.log(chalk.gray(` ${snapshotsTested} snapshots tested`));
}

if (pageErrors.length > 0) {
console.log(
chalk.red(` ${pageErrors.length} uncaught page error(s) detected:`),
);
for (let i = 0; i < pageErrors.length; i++) {
console.log(chalk.red(` ${pageErrors[i]}`));
}
}

console.log(chalk.reset(''));

return snapshotsFailed > 0 ? 1 : 0;
return snapshotsFailed > 0 || pageErrors.length > 0 ? 1 : 0;
}
Loading