docs(lambda): add docs/deploy/aws-lambda.mdx deployment guide #914

Merged
jrusso1020 merged 2 commits into main from docs-lambda-deploy
May 17, 2026

@jrusso1020 jrusso1020 commented May 17, 2026

What

Adds docs/deploy/aws-lambda.mdx — the end-to-end deployment guide for the new AWS Lambda surface. Registered under a new "Deploy" group in the Mintlify nav (docs/docs.json).

Why

Per DISTRIBUTED-RENDERING-PLAN.md § 11 Phase 6b PR 6.7: adopters landing on the docs site need a single page that takes them from "I have AWS credentials" to "I have a rendered video in S3" without having to read the SAM template or the SDK source. The page collects everything the implementation PRs in this stack added.

How

Covers:

  • Architecture diagram (Plan → Map(N) → Assemble + the single Lambda function dispatching by Action, pulled from DISTRIBUTED-RENDERING-PLAN.md § 15.2).
  • Prerequisites table (AWS credentials, SAM CLI, bun, repo checkout).
  • Three deployment paths: hyperframes lambda CLI (recommended), direct sam deploy against examples/aws-lambda/template.yaml, and the HyperframesRenderStack CDK construct.
  • IAM bootstrap section pointing at hyperframes lambda policies user|role|validate.
  • Cost shape — Lambda GB-seconds + SFN transitions → the displayCost the progress verb prints. Notes that it's best-effort and S3 transfer is excluded.
  • Troubleshooting with the typed error names operators actually hit (PLAN_HASH_MISMATCH, BROWSER_GPU_NOT_SOFTWARE, the iam:CreateRole denial, stuck RUNNING, the Retain bucket semantics).
  • What's NOT in v1 callout so adopters don't burn time looking for webhooks / compositions verb / HDR support.

No code changes.

Stacks on #909, #910, #912, and #913.

🤖 Generated with Claude Code

@miguel-heygen miguel-heygen left a comment

CI green (regression, player-perf, preview-regression all pass; Graphite pending is expected for a stacked PR).

Content looks accurate. A few specific things I verified:

Architecture diagram — the dispatch model (Plan/RenderChunk/Assemble → one Lambda handler → S3) matches the handler.mjs structure described elsewhere in the stack.

Three deployment paths — CLI → SAM → CDK is a clean progression; the CDK construct exposing .bucket, .renderFunction, .stateMachine is consistent with the described CloudFormation outputs (RenderBucketName, RenderStateMachineArn, RenderFunctionArn).

IAM section: the policies user / policies role / policies validate subcommands are well-documented with the Resource: "*" narrowing note; the CI gate pattern for validate is a good call.

Cost accounting — the $0.0214 example and the pointer to costAccounting.ts for auditability is correct in principle. One minor nit: the cost line says "S3 transfer is not included" but doesn't mention S3 GET/PUT request costs either — worth a one-liner so adopters don't expect the number to be complete.
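
To make the cost shape concrete, here is a minimal TypeScript sketch of how such a figure could be assembled. It is not the contents of costAccounting.ts: the function name, the input shape, and both unit prices are assumptions for illustration (the rates are commonly published x86 Lambda and Standard Workflow prices and vary by region and architecture), and the usage numbers are made up so that the output lands on the figure quoted above. S3 transfer and S3 GET/PUT request charges are deliberately left out, which is exactly why the number should be read as best-effort.

```typescript
// Hypothetical unit prices (assumptions; check current AWS pricing for your region).
const LAMBDA_USD_PER_GB_SECOND = 0.0000166667;
const SFN_USD_PER_TRANSITION = 0.000025;

interface RenderUsage {
  lambdaGbSeconds: number; // billed memory (GB) x billed duration (s), summed over invocations
  sfnTransitions: number; // Step Functions state transitions for the execution
}

// Best-effort estimate: excludes S3 transfer and S3 GET/PUT request charges.
function formatDisplayCost(usage: RenderUsage): string {
  const lambda = usage.lambdaGbSeconds * LAMBDA_USD_PER_GB_SECOND;
  const sfn = usage.sfnTransitions * SFN_USD_PER_TRANSITION;
  const usd = (n: number): string => `$${n.toFixed(4)}`;
  return `${usd(lambda + sfn)} (Lambda ${usd(lambda)} + SFN ${usd(sfn)})`;
}

// Illustrative inputs only, not measurements from a real render.
console.log(formatDisplayCost({ lambdaGbSeconds: 1260, sfnTransitions: 16 }));
// prints "$0.0214 (Lambda $0.0210 + SFN $0.0004)"
```

Keeping the two components separate in the formatted string makes it easier to see which side of the bill a change in chunk count or memory size actually moves.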

Troubleshooting section: PLAN_HASH_MISMATCH, BROWSER_GPU_NOT_SOFTWARE, stuck-at-RUNNING, and S3 Retain bucket are all realistic failure modes with actionable guidance. FONT_FETCH_FAILED / FFMPEG_VERSION_MISMATCH are mentioned in the stuck-render entry but not given their own entries, which is fine for v1 docs.

"What's NOT in v1" section — useful, explicitly limits scope. The reference to "PR 6.10 on the plan" for compositions discovery is slightly internal; readers won't know what that means. Consider replacing with "in a future release" or linking to a tracking issue.

No broken links spotted. [CLI reference](/packages/cli#hyperframes-lambda) assumes that anchor exists on the CLI page — make sure the CLI PR in the stack adds it.

Approved.

@vanceingalls vanceingalls left a comment

One-line summary: docs page is well-structured and mostly accurate, but the BROWSER_GPU_NOT_SOFTWARE troubleshooting entry points users at a non-existent data-gpu-mode composition attribute — that's a blocker on a docs PR.

Additive review: @miguel-heygen already covered the S3 request-cost line, the internal "PR 6.10" reference, and the [CLI reference] anchor concern. I won't repeat those. The findings below are gaps I didn't see in Miguel's review.

Strengths

  • docs/deploy/aws-lambda.mdx:138-152 — the IAM bootstrap section is genuinely strong: it walks through policies user|role|validate, notes the Resource: "*" narrowing path, and explicitly recommends policies validate as a CI pre-deploy step. Matches the source intent in packages/cli/src/commands/lambda/policies.ts:1-22.
  • docs/deploy/aws-lambda.mdx:155-167 — cost example output ($0.0214 (Lambda $0.0210 + SFN $0.0004)) matches the actual progress output formatter in packages/cli/src/commands/lambda/progress.ts:46-48. Concrete and verifiable.
  • The "What's NOT in v1 surface" section at the bottom is the right shape — adopters waste hours looking for missing webhooks/HDR without a callout like this.

Findings

blocker: docs/deploy/aws-lambda.mdx:173 (Troubleshooting: BROWSER_GPU_NOT_SOFTWARE). The doc tells users:

The compiled composition reads data-gpu-mode="hardware" (or similar). [...] Change the composition's data-gpu-mode or omit it (the default is software).

I grepped the entire repo at the PR head: there is no data-gpu-mode attribute handling anywhere in packages/engine, packages/producer, or packages/aws-lambda. The only hits are this doc line and an unrelated gpuModes array in packages/cli/src/commands/render.ts:422 (local dev-render output, not distributed). The actual error source is packages/engine/src/utils/assertSwiftShader.ts:107-122: it reads chrome://gpu after launch and throws if the GL backend isn't SwiftShader. Its own thrown message says:

"Ensure Chrome was launched with --use-gl=swiftshader --use-angle=swiftshader and that the SwiftShader libraries are present in the runtime image."

i.e. the failure is a Lambda runtime-image / launch-flags problem, NOT a composition attribute. An adopter who hits this error and follows the doc's advice will edit a non-existent attribute on their composition and the error will persist. Worse than no advice. Replace this entry with the actual root cause (Chrome launch flags / SwiftShader libs in the handler ZIP) and the actual remediation (rebuild the ZIP with bun run --cwd packages/aws-lambda build:zip and re-deploy, since lambda deploy rebuilds the ZIP that bundles @sparticuz/chromium).
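
To illustrate why no composition attribute can influence this error, here is a rough TypeScript sketch of the kind of check described above. It is not the code in assertSwiftShader.ts: the function name, the error class shape, and the idea of passing the renderer string in as a parameter (rather than scraping chrome://gpu) are assumptions. Only the error code and the remediation message come from the review.

```typescript
// Hypothetical typed error; the code name comes from the review, the class shape is assumed.
class BrowserGpuNotSoftwareError extends Error {
  readonly code = "BROWSER_GPU_NOT_SOFTWARE";

  constructor(renderer: string) {
    super(
      `GL renderer "${renderer}" is not SwiftShader. ` +
        "Ensure Chrome was launched with --use-gl=swiftshader --use-angle=swiftshader " +
        "and that the SwiftShader libraries are present in the runtime image.",
    );
    this.name = "BrowserGpuNotSoftwareError";
  }
}

// `glRenderer` stands in for whatever string is read from chrome://gpu after launch.
// The check only looks at the browser's reported GL backend, never at the composition.
function assertSoftwareGl(glRenderer: string): void {
  if (!/swiftshader/i.test(glRenderer)) {
    throw new BrowserGpuNotSoftwareError(glRenderer);
  }
}
```

Because the input is a property of the launched browser, the only levers are the launch flags and the libraries bundled into the handler ZIP, which is why a rebuild and redeploy is the appropriate remediation.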

important — coverage gap: hyperframes lambda sites create is not mentioned anywhere in the doc. The CLI's own HELP at packages/cli/src/commands/lambda.ts:18-21 and its examples array call it out as a first-class workflow ("Pre-upload a project so multiple renders share the upload"), and the render subcommand explicitly supports a --site-id flag that consumes its output (packages/cli/src/commands/lambda/render.ts:51-60). For a page titled "Three deployment paths" that's supposed to take adopters from credentials to rendered MP4, omitting the sites workflow leaves users on Path 1 re-tarring + re-uploading the same project on every render — exactly the cost shape the page elsewhere tries to avoid. Add a sites create subsection (Path 1.5 or a "Re-using uploads" callout under Path 1).

important — SAM-path concurrency default mismatch. The doc's framing under Path 1 (docs/deploy/aws-lambda.mdx:62-67) explains why --concurrency=8 is a conservative default that bounds runaway spend, and the Path 2 SAM example happens to pass ReservedConcurrency=8. But the SAM template's own default is -1 (unreserved) — see examples/aws-lambda/template.yaml:36-42. A reader who simplifies the Path 2 example by dropping --parameter-overrides is silently switched from "conservative 8-cap" to "account-default unreserved." Worth one extra line in the Path 2 section: "Drop ReservedConcurrency from --parameter-overrides at your own risk — the template's own default is -1 (unreserved)." Same warning shape as the Path 1 paragraph.
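
The silent switch can be pictured with a small TypeScript sketch. It models, under assumptions, how a template parameter resolves when an override is or is not supplied; the function names and the map-of-strings input are hypothetical, and only the parameter name ReservedConcurrency and the default of -1 come from the finding above.

```typescript
// Per the review, the SAM template's own default is -1, meaning unreserved.
const TEMPLATE_DEFAULT_RESERVED_CONCURRENCY = -1;

// Assumed resolution model: an explicit override wins, otherwise the template default applies.
function resolveReservedConcurrency(overrides: Record<string, string>): number {
  const raw = overrides["ReservedConcurrency"];
  return raw === undefined ? TEMPLATE_DEFAULT_RESERVED_CONCURRENCY : Number(raw);
}

function describeConcurrency(value: number): string {
  return value === -1 ? "unreserved (account-level limit applies)" : `capped at ${value}`;
}

// With the override from the Path 2 example:
console.log(describeConcurrency(resolveReservedConcurrency({ ReservedConcurrency: "8" })));
// With --parameter-overrides dropped entirely:
console.log(describeConcurrency(resolveReservedConcurrency({})));
```

Nothing in the second call fails or warns, which is the point of the finding: the deploy succeeds either way and only the spend ceiling changes.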

nit: docs/deploy/aws-lambda.mdx:30 ("HyperFrames repo checkout"). Says lambda deploy builds the ZIP from source, and adopters who deploy outside a checkout can set HYPERFRAMES_REPO_ROOT. Verified accurate (packages/cli/src/commands/lambda/repoRoot.ts:15-30). But the env var is undocumented anywhere outside this single table row. It is worth a one-liner in the env-var reference (if one gets added later), or at least a fuller example here showing the directory structure it expects ($HYPERFRAMES_REPO_ROOT/packages/aws-lambda/package.json must exist).

nit: docs/deploy/aws-lambda.mdx:177 (stuck-at-RUNNING entry) lists FONT_FETCH_FAILED and FFMPEG_VERSION_MISMATCH as examples of typed errors the SFN console surfaces. Verified those names exist in packages/aws-lambda/src/cdk/HyperframesRenderStack.ts:193-207 (alarm dimensions). Miguel suggested giving each its own troubleshooting entry; I'll second that as a low-priority follow-up since these are the most common production failure modes after PLAN_HASH_MISMATCH.

nit: docs/deploy/aws-lambda.mdx:55-58 deploy example doesn't pass --profile, but the CLI documents it (packages/cli/src/commands/lambda.ts:74). For users on multi-account setups, a one-liner mentioning the flag (or the AWS_PROFILE env var fallback that deploy.ts:42 reads) would head off a class of "wrong-account deploy" pitfalls.

Verdict

Verdict: REQUEST CHANGES
Reasoning: the BROWSER_GPU_NOT_SOFTWARE entry actively misleads — it tells adopters to edit a composition attribute that doesn't exist, instead of the real runtime-image fix. That's a blocker on a docs page where the troubleshooting section is the load-bearing reason users land there. Everything else is fixable or punt-able. Fix the GPU entry, optionally add a sites create subsection, and this is good to ship.

Review by Vai

@jrusso1020 jrusso1020 force-pushed the feat-lambda-local-harness branch from c0895ef to 6faac80 Compare May 17, 2026 00:31
@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from 9c7e205 to 8d6ffe5 Compare May 17, 2026 00:31
@jrusso1020 jrusso1020 force-pushed the feat-lambda-local-harness branch from 6faac80 to 15289f3 Compare May 17, 2026 00:51
@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from 8d6ffe5 to 0ded0c0 Compare May 17, 2026 00:51
miguel-heygen previously approved these changes May 17, 2026

@miguel-heygen miguel-heygen left a comment

Blocker from the previous review is addressed:

Troubleshooting entry referenced non-existent data-gpu-mode attribute — Fixed. The bogus data-gpu-mode attribute reference is gone. The troubleshooting section now correctly documents the BROWSER_GPU_NOT_SOFTWARE error with the real fix: rebuild the handler ZIP and redeploy. The explanation correctly identifies that the issue is at the runtime-image / launch-flags layer (SwiftShader via --use-gl=swiftshader --use-angle=swiftshader), not at the composition layer, and that lambda deploy always rebuilds the ZIP so a redeploy resolves it.

vanceingalls previously approved these changes May 17, 2026

@vanceingalls vanceingalls left a comment

Re-review of 0ded0c07 against my prior REQUEST CHANGES at 4304554554.

Resolution status

  • Blocker — BROWSER_GPU_NOT_SOFTWARE pointed at non-existent data-gpu-mode: resolved. Grepped HEAD (docs/, packages/, examples/) — zero hits for data-gpu-mode. New entry at docs/deploy/aws-lambda.mdx:189-198 now correctly attributes the failure to the runtime-image / launch-flags layer and tells adopters to rebuild via bun run --cwd packages/aws-lambda build:zip (verified script exists in packages/aws-lambda/package.json) and redeploy. The Chrome flag pair cited (--use-gl=swiftshader --use-angle=swiftshader) matches what assertSwiftShader.ts:121 says is required. Advice now leads to the actual fix.
  • Important — missing sites create workflow: resolved. New "Pre-staging a project with sites create" subsection at docs/deploy/aws-lambda.mdx:76-88 documents the workflow, the --site-id consumer, and the content-addressing semantics. The SHA-256 + HeadObject short-circuit claim is grounded in packages/aws-lambda/src/sdk/deploySite.ts:114-126.
  • Important — SAM ReservedConcurrency default -1 mismatch: resolved. Warning callout at docs/deploy/aws-lambda.mdx:113-115 correctly states the SAM template's own default is -1 (unreserved) and warns about silently dropping the override. Matches examples/aws-lambda/template.yaml:40-42.
  • Nits (HYPERFRAMES_REPO_ROOT depth, --profile / AWS_PROFILE): not addressed. These were optional and remain optional — author's call.
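
The content-addressing behavior credited to sites create in the list above (SHA-256 plus a HeadObject short-circuit) can be sketched in TypeScript as follows. This is an assumption-laden illustration, not deploySite.ts: the ObjectStore interface, the key layout, the function name, and the use of the digest as the site id are all hypothetical, and the store is synchronous and in-memory so the example runs without AWS credentials (real S3 calls would be asynchronous).

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal store; against S3, `has` would map to HeadObject and `put` to PutObject.
interface ObjectStore {
  has(key: string): boolean;
  put(key: string, body: Uint8Array): void;
}

interface SiteUpload {
  siteId: string; // assumed here to be the hex SHA-256 of the archive
  uploaded: boolean; // false when the existence check short-circuited the upload
}

function deploySiteSketch(store: ObjectStore, archive: Uint8Array): SiteUpload {
  const siteId = createHash("sha256").update(archive).digest("hex");
  const key = `sites/${siteId}.tar.gz`; // key layout is an assumption
  if (store.has(key)) {
    return { siteId, uploaded: false }; // identical bytes already staged: skip the upload
  }
  store.put(key, archive);
  return { siteId, uploaded: true };
}

// In-memory stand-in so the sketch is self-contained.
class MemoryStore implements ObjectStore {
  private readonly objects = new Map<string, Uint8Array>();
  has(key: string): boolean {
    return this.objects.has(key);
  }
  put(key: string, body: Uint8Array): void {
    this.objects.set(key, body);
  }
}
```

Under this model, staging the same project twice yields the same id and performs only one upload, which is the property that lets many renders share a single staged project through --site-id.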

Scope check

Diff between 149555f...0ded0c0 touches one file (docs/deploy/aws-lambda.mdx). No scope creep.

CI

mergeStateStatus=UNSTABLE is failing optional checks only — check_runs shows no failure conclusions on the head SHA. Per Rule 5, this is mention-not-block.

Verdict

Verdict: APPROVE
Reasoning: the blocker is fixed at the root (advice now points to the real runtime-image / Chrome-flags fix instead of a phantom composition attribute), both important items are addressed with technically accurate framing, and nothing else regressed. Nits are author's call.

Review by Vai

@jrusso1020 jrusso1020 changed the base branch from feat-lambda-local-harness to main May 17, 2026 07:02
@jrusso1020 jrusso1020 dismissed stale reviews from vanceingalls and miguel-heygen May 17, 2026 07:02

The base branch was changed.

@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from 0ded0c0 to a6f848e Compare May 17, 2026 07:06

@miguel-heygen miguel-heygen left a comment

Re-approve after rebase. Diff verified unchanged — no data-gpu-mode references.

@vanceingalls vanceingalls left a comment

Re-approve after rebase onto main. Force-push dismissed my prior --approve (require_last_push_approval: true) — content unchanged, same commits replayed on the new base. All findings from the prior review's resolution still apply.

Re-review by Vai (post-rebase re-stamp)

@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from a6f848e to 408dd28 Compare May 17, 2026 07:31

@miguel-heygen miguel-heygen left a comment

Re-approve on 408dd28. Same content.

@vanceingalls vanceingalls left a comment

Re-approve after rebase + #910 smoke fix.

#910 adds the CLI smoke fix on top: @hyperframes/aws-lambda moved to devDependencies, dispatcher dynamic-imports @hyperframes/aws-lambda/sdk (lambda.ts:150) with a friendly ERR_MODULE_NOT_FOUND → npm install handler at :152-158. npm pack / npm install now works because there's no workspace:* protocol in published dependencies. Clean fix.

#912/#913/#914/#915 are pure rebases on top — same commits replayed on the new base, content unchanged vs. the last approved round. Findings from the prior review's resolution still apply.

Re-review by Vai (post-smoke-fix re-stamp)

@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from 408dd28 to 24ae310 Compare May 17, 2026 17:08

@vanceingalls vanceingalls left a comment

Re-approve after force-push that cherry-picked each PR's unique commits onto post-#910 main. Content identical to the prior approved round (per James's note). Prior findings and verifications still hold.

Re-review by Vai (final post-#910-merge re-stamp)

@miguel-heygen miguel-heygen left a comment

Re-approve on 24ae310. Content-identical.

End-to-end deploy guide for the AWS Lambda surface. Covers:

  - Architecture diagram (Step Functions Plan → Map(N) → Assemble +
    the single Lambda function dispatching by Action; pulled from
    the distributed rendering plan §15.2).
  - Prerequisites table (AWS creds, SAM CLI, bun, repo checkout).
  - Three deployment paths: hyperframes lambda CLI (recommended),
    direct sam deploy against examples/aws-lambda/template.yaml,
    and HyperframesRenderStack CDK construct.
  - IAM bootstrap via hyperframes lambda policies user/role/validate.
  - Cost shape — how Lambda GB-seconds + SFN transitions roll up
    into the displayCost the progress verb prints.
  - Troubleshooting block with the typed error names operators
    actually hit (PLAN_HASH_MISMATCH, BROWSER_GPU_NOT_SOFTWARE,
    iam:CreateRole denial, stuck RUNNING, S3 Retain semantics).
  - "What's NOT in v1" callout so adopters don't burn time looking
    for webhooks / compositions verb / HDR support.

Registered under a new "Deploy" group in docs.json's Documentation
tab, sitting after Packages so the conceptual flow is "what you
can build" → "how to ship it."

No code changes.

One blocker + two important items from Vai's review:

  - The BROWSER_GPU_NOT_SOFTWARE troubleshooting entry pointed
    adopters at a non-existent `data-gpu-mode` composition attribute.
    Replaced with the actual root cause (Chrome launch flags +
    @sparticuz/chromium libs in the handler ZIP) and the actual
    remediation: rebuild + redeploy via `lambda deploy` (which
    always rebuilds the ZIP). The composition-attribute story
    would have sent users editing the wrong file entirely.

  - Added a `sites create` subsection under Path 1 so adopters
    running tight inner loops know how to reuse a project upload
    across many renders instead of re-tarring + re-uploading on
    each call. The CLI surface was first-class but the doc had
    been silent.

  - Added a Warning callout under Path 2 explaining that the SAM
    template's own ReservedConcurrency default is `-1` (unreserved)
    — a reader simplifying the Path 2 example by dropping the
    --parameter-overrides flag would silently switch to unreserved
    concurrency and pay the runaway-Map cost. The warning mirrors
    the cost-shape callout earlier in the page.
@jrusso1020 jrusso1020 force-pushed the docs-lambda-deploy branch from 24ae310 to 4723c7d Compare May 17, 2026 18:05
@jrusso1020 jrusso1020 merged commit a1a8c77 into main May 17, 2026
35 checks passed
@jrusso1020 jrusso1020 deleted the docs-lambda-deploy branch May 17, 2026 18:08
3 participants