
[STG-1689] fix: improve a11y snapshot efficiency#2050

Open
shrey150 wants to merge 3 commits into main from stg-1689-a11y-snapshot


@shrey150 shrey150 commented Apr 25, 2026

Summary

  • Add an interactive accessibility snapshot mode that keeps actionable controls plus useful landmarks while dropping static prose-heavy nodes.
  • Use interactive snapshots by default for the agent ariaTree tool, with mode: "full" still available for content-heavy reads.
  • Add browse CLI flags for snapshot --interactive, --depth, and --selector.
  • Filter returned snapshot maps to refs that still appear in the rendered tree.
  • Include native disclosure controls (<summary> / DisclosureTriangle) after smoke testing found Stagehand was missing them.
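The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `A11yNode` shape, the role sets, and the helper names `pruneInteractive`, `collectRefs`, and `filterRefMap` are all hypothetical, and "useful landmark" is assumed here to mean a landmark that still contains at least one actionable descendant.

```typescript
// Hypothetical node shape; the real Stagehand snapshot node type may differ.
interface A11yNode {
  ref: string; // e.g. "0-20"
  role: string; // accessibility role, e.g. "button" or "StaticText"
  name?: string;
  children?: A11yNode[];
}

// Assumed role sets, chosen for illustration only.
const INTERACTIVE_ROLES = new Set<string>([
  "button",
  "link",
  "textbox",
  "checkbox",
  "combobox",
  "DisclosureTriangle", // native <summary> disclosure controls
]);
const LANDMARK_ROLES = new Set<string>(["main", "navigation", "form", "search"]);

// Keep actionable nodes and landmarks that still contain something actionable.
// Any other node is dropped and its surviving descendants are hoisted.
function pruneInteractive(node: A11yNode): A11yNode[] {
  const kids: A11yNode[] = [];
  for (const child of node.children ?? []) {
    kids.push(...pruneInteractive(child));
  }
  const isInteractive = INTERACTIVE_ROLES.has(node.role);
  const isUsefulLandmark = LANDMARK_ROLES.has(node.role) && kids.length > 0;
  if (isInteractive || isUsefulLandmark) {
    return [{ ...node, children: kids }];
  }
  return kids;
}

// Collect every ref that still appears in the pruned tree.
function collectRefs(nodes: A11yNode[], out = new Set<string>()): Set<string> {
  for (const n of nodes) {
    out.add(n.ref);
    collectRefs(n.children ?? [], out);
  }
  return out;
}

// Drop ref-map entries whose ref no longer appears in the rendered tree.
function filterRefMap<T>(map: Record<string, T>, kept: Set<string>): Record<string, T> {
  const out: Record<string, T> = {};
  for (const ref of Object.keys(map)) {
    if (kept.has(ref)) out[ref] = map[ref];
  }
  return out;
}

// Small sample tree: one landmark with prose and a button, one prose-only landmark.
const sampleRoot: A11yNode = {
  ref: "0-1",
  role: "RootWebArea",
  children: [
    {
      ref: "0-2",
      role: "main",
      children: [
        { ref: "0-3", role: "StaticText", name: "Long report prose" },
        { ref: "0-4", role: "button", name: "Save settings" },
      ],
    },
    {
      ref: "0-5",
      role: "navigation",
      children: [{ ref: "0-6", role: "StaticText", name: "Footer text" }],
    },
  ],
};

const prunedSample = pruneInteractive(sampleRoot);
const filteredSampleMap = filterRefMap(
  { "0-1": "/html", "0-2": "/html/main", "0-3": "/html/main/p", "0-4": "/html/main/button" },
  collectRefs(prunedSample),
);
```

In this sample only the `main` landmark and its button survive, so the filtered map keeps just the refs `0-2` and `0-4`.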

Linear: https://linear.app/browserbase/issue/STG-1689/improve-a11y-tree-representation-for-browse-cli

Smoke comparison

Synthetic page with 80 repeated static report cards:

| Tool / mode | Chars | Est. tokens | Lines |
| --- | ---: | ---: | ---: |
| Stagehand full | 46,087 | 11,522 | 748 |
| Stagehand interactive | 7,155 | 1,789 | 180 |
| agent-browser interactive | 6,197 | 1,550 | 171 |
| browser-use state | 5,515 | 1,379 | 119 |
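The PR does not state how the token estimates were derived. The four rows above are all consistent with a simple heuristic of about four characters per token, rounded up, so the sketch below assumes that rule; `estimateTokens` is a hypothetical helper, not part of Stagehand.

```typescript
// Assumed heuristic: roughly 4 characters per token, rounded up.
// This is inferred from the reported numbers, not taken from the PR's code.
function estimateTokens(chars: number): number {
  return Math.ceil(chars / 4);
}

// Character counts reported in the smoke comparison.
const reportedChars: Record<string, number> = {
  "Stagehand full": 46087,
  "Stagehand interactive": 7155,
  "agent-browser interactive": 6197,
  "browser-use state": 5515,
};

const estimatedTokens: Record<string, number> = {};
for (const mode of Object.keys(reportedChars)) {
  estimatedTokens[mode] = estimateTokens(reportedChars[mode]);
}
```

Under this assumption the estimates match the reported column: 11,522, 1,789, 1,550, and 1,379 tokens respectively.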

Broader matrix after adding disclosure support:

| Page | Stagehand interactive | agent-browser interactive | browser-use state | Accuracy notes |
| --- | ---: | ---: | ---: | --- |
| Controls fixture | 150 tokens | 122 tokens | 348 tokens | Stagehand found 14/14 expected controls; agent-browser found 13/14 (missed `<summary>`); browser-use found 14/14 but included static prose |
| Long cards fixture | 1,223 tokens | 1,112 tokens | 948 tokens | Stagehand and agent-browser found 6/6 expected controls, including offscreen report 60; browser-use found 4/6 due to viewport/windowing and included static prose |
| Article fixture | 32 tokens | 16 tokens | 980 tokens | Stagehand and agent-browser kept only Primary source + Subscribe; browser-use preserved article prose |
| Hacker News | 1,836 tokens | 2,010 tokens | 3,247 tokens | Stagehand is slightly smaller than agent-browser and much smaller than browser-use state |
| Wikipedia home | 1,150 tokens | 986 tokens | 690 tokens | browser-use is smaller but exposes fewer refs (44 vs. Stagehand 111 / agent-browser 110) |

compact is an output-size flag, not a semantic filter. It preserves internal ref maps for follow-up actions: snapshot -c -i on the controls fixture produced a clickable Save settings ref and click @0-20 updated page state to saved.

Compact stdout reductions from the same fixture set:

| Page/mode | JSON stdout | Compact stdout | Reduction |
| --- | ---: | ---: | ---: |
| Long full | 70,404 chars | 26,946 chars | 2.6x |
| Long interactive | 18,305 chars | 4,890 chars | 3.7x |
| Controls interactive | 1,636 chars | 597 chars | 2.7x |
| Article interactive | 383 chars | 128 chars | 3.0x |
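For anyone re-checking these figures, the reduction column is the JSON size divided by the compact size, to one decimal place. A minimal sketch (`reductionFactor` is a hypothetical helper name):

```typescript
// Ratio of JSON stdout size to compact stdout size, formatted like the table.
function reductionFactor(jsonChars: number, compactChars: number): string {
  return (jsonChars / compactChars).toFixed(1) + "x";
}

const reductions = {
  longFull: reductionFactor(70404, 26946),
  longInteractive: reductionFactor(18305, 4890),
  controlsInteractive: reductionFactor(1636, 597),
  articleInteractive: reductionFactor(383, 128),
};
```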

Validation

  • pnpm --filter @browserbasehq/stagehand test:core -- packages/core/dist/esm/tests/unit/snapshot-a11y-tree-utils.test.js packages/core/dist/esm/tests/unit/page-snapshot.test.js packages/core/dist/esm/tests/unit/snapshot-tree-format-utils.test.js
  • pnpm --filter @browserbasehq/stagehand lint
  • pnpm exec prettier --check packages/cli && pnpm --filter @browserbasehq/browse-cli eslint && pnpm --filter @browserbasehq/browse-cli typecheck
  • pnpm --filter @browserbasehq/browse-cli test
  • pnpm --filter @browserbasehq/stagehand build && pnpm --filter @browserbasehq/browse-cli build
  • Live CLI matrix comparing Stagehand, agent-browser, and browser-use across controlled fixtures, Example.com, Hacker News, and Wikipedia

changeset-bot Bot commented Apr 25, 2026

🦋 Changeset detected

Latest commit: 946abb3

The changes in this PR will be included in the next version bump.


@cubic-dev-ai cubic-dev-ai Bot left a comment


2 issues found across 12 files

Confidence score: 3/5

  • There is moderate merge risk because packages/cli/src/index.ts has a concrete user-facing policy gap: CLI validation throws a generic new Error() instead of a typed/sanitized error, which can lead to inconsistent handling or unsanitized messaging.
  • packages/core/lib/v3/agent/tools/ariaTree.ts has a cleanup issue where the setTimeout used in Promise.race is not cleared when work finishes early, which can leak timers and create avoidable resource overhead over time.
  • These issues look fixable without major redesign, but they are specific enough (severity 6/10 and 5/10 with high confidence) to justify addressing before merge for safer behavior.
  • Pay close attention to packages/cli/src/index.ts and packages/core/lib/v3/agent/tools/ariaTree.ts: error sanitization compliance and timeout cleanup are the key risk areas.
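One possible shape for the CLI validation fix is sketched below. The repository's actual typed error hierarchy and sanitization policy are not shown in this thread, so `CliValidationError` and `parseDepth` are hypothetical names used only for illustration. The `--depth` flag is from this PR; the rule that it must be a non-negative integer is an assumption.

```typescript
// Hypothetical typed error; the class required by the repo's policy may differ.
class CliValidationError extends Error {
  readonly flag: string;

  constructor(flag: string, message: string) {
    super(message);
    this.name = "CliValidationError";
    this.flag = flag;
    // Keep instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, CliValidationError.prototype);
  }
}

// Assumed validation rule: --depth must be a non-negative integer.
function parseDepth(raw: string): number {
  if (!/^\d+$/.test(raw)) {
    // The message is fixed text and does not echo the raw user input.
    throw new CliValidationError("--depth", "--depth must be a non-negative integer");
  }
  return Number(raw);
}
```

Callers can then branch on `instanceof CliValidationError` to print a controlled message instead of a raw stack trace.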
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/cli/src/index.ts">

<violation number="1" location="packages/cli/src/index.ts:2703">
P2: Custom agent: **Exception and error message sanitization**

User-facing CLI validation throws generic `new Error()` instead of a typed/sanitized error class required by policy.</violation>
</file>

<file name="packages/core/lib/v3/agent/tools/ariaTree.ts">

<violation number="1" location="packages/core/lib/v3/agent/tools/ariaTree.ts:36">
P2: The `setTimeout` in the `Promise.race` is never cleared when the snapshot resolves first, leaking timers. Capture the timer ID and clear it after the race settles.</violation>
</file>
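The timer fix suggested in the second violation could look roughly like this. It is a generic sketch rather than the code in ariaTree.ts: `withTimeout` and its parameters are hypothetical, and the real tool may shape its timeout error differently.

```typescript
// Race a promise against a timeout, always clearing the timer once the race settles.
async function withTimeout<T>(work: Promise<T>, ms: number, message: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(message)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Runs whether the work or the timeout won, so no timer is left pending.
    clearTimeout(timer);
  }
}
```

Putting `clearTimeout` in `finally` covers the case the review flags (the snapshot resolves first) as well as the case where the work itself rejects.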

