feat(screenshot): add CLI options to cap screenshot size at the source#1823
antoinekm wants to merge 1 commit into
90b3282 to f09e1ea
@antoinekm could you please rebase the PR?
'If set to true takes a screenshot of the full page instead of the currently visible viewport. Incompatible with uid.',
type ScreenshotFormat = 'png' | 'jpeg' | 'webp';

function isScreenshotFormat(value: unknown): value is ScreenshotFormat {
Why do we need isScreenshotFormat? The MCP schema already validates the format.
Good catch, removed. The yargs choices: ['jpeg', 'png', 'webp'] as const already narrows args.screenshotFormat to 'jpeg' | 'png' | 'webp' | undefined, so the runtime guard was dead code. Now simply const defaultFormat: ScreenshotFormat = screenshotFormat ?? 'png';.
}

function isPositiveFiniteNumber(value: number | undefined): value is number {
  return value !== undefined && Number.isFinite(value) && value > 0;
Let's do the number validation when parsing CLI args with yargs. Here we should expect an integer or undefined, so we do not need to check again once the CLI args have been parsed.
Done. Moved the validation into coerce on each numeric CLI option in chrome-devtools-mcp-cli-options.ts:
- screenshotQuality: integer in [0, 100]
- screenshotMaxWidth / screenshotMaxHeight: positive integer
Invalid values now throw a clear error at parse time (e.g. Invalid screenshotMaxWidth -5. Expected a positive integer.), and the tool consumes the args directly with no re-validation.
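
A minimal sketch of what this kind of parse-time validation could look like. The helper names `coercePositiveInteger` and `coerceQuality` are hypothetical and not taken from the PR; only the positive-integer error message mirrors the example quoted above, and the wording of the quality error is an assumption. Functions of this shape could be passed as the `coerce` callback of a yargs option.

```typescript
// Hypothetical helpers; the names are illustrative and not taken from the PR.
// Each factory returns a function that could serve as a yargs `coerce` callback.

function coercePositiveInteger(optionName: string) {
  return (value: unknown): number | undefined => {
    if (value === undefined) {
      return undefined; // flag not set: keep the option unset
    }
    const parsed = Number(value);
    if (!Number.isInteger(parsed) || parsed <= 0) {
      throw new Error(
        `Invalid ${optionName} ${String(value)}. Expected a positive integer.`,
      );
    }
    return parsed;
  };
}

function coerceQuality(optionName: string) {
  return (value: unknown): number | undefined => {
    if (value === undefined) {
      return undefined;
    }
    const parsed = Number(value);
    if (!Number.isInteger(parsed) || parsed < 0 || parsed > 100) {
      // Assumed wording; the PR only quotes the positive-integer message.
      throw new Error(
        `Invalid ${optionName} ${String(value)}. Expected an integer in [0, 100].`,
      );
    }
    return parsed;
  };
}
```

With validation done once at parse time, the tool can treat each numeric option as `number | undefined` without re-checking it.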
Adds opt-in CLI flags so operators can cap the size of screenshots
returned by `take_screenshot` before they are embedded in the MCP
response. Addresses two related symptoms reported when MCP clients
display screenshots inline:
1. The hosted LLM API rejects images exceeding its per-image dimension
limits (e.g. Anthropic's 8000x8000 px / 2000x2000 px when >20
images are in the same request).
2. After many captures the cumulative base64 payload pushes the
request over the per-call body size limit.
Both can be mitigated at the source by reducing format/quality and
downscaling the capture.
New CLI flags (all opt-in, no behavior change when unset):
- --screenshot-format <jpeg|png|webp>: override the default format
used by take_screenshot when the caller does not specify one.
- --screenshot-quality <0-100>: override the default JPEG/WebP
quality when the caller does not specify one. Ignored for PNG.
- --screenshot-max-width <px>: downscale screenshots wider than this
before they are returned.
- --screenshot-max-height <px>: downscale screenshots taller than
this before they are returned. Combines with --screenshot-max-width;
the smaller scale wins so both bounds are respected while preserving
aspect ratio.
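
The "smaller scale wins" rule for the two size flags can be sketched as follows. This is an illustration under assumptions, not the PR's code: `computeDownscale`, `Size`, and `MaxBounds` are hypothetical names.

```typescript
// Hypothetical names; a sketch of the "smaller scale wins" rule.
interface Size {
  width: number;
  height: number;
}

interface MaxBounds {
  maxWidth?: number;
  maxHeight?: number;
}

// Returns the factor applied to both axes. A result of 1 means "do not resize".
function computeDownscale(source: Size, bounds: MaxBounds): number {
  const widthScale =
    bounds.maxWidth !== undefined ? bounds.maxWidth / source.width : 1;
  const heightScale =
    bounds.maxHeight !== undefined ? bounds.maxHeight / source.height : 1;
  // Taking the minimum respects both bounds; capping at 1 avoids upscaling.
  return Math.min(1, widthScale, heightScale);
}

// A 1600x1200 source with max-width 800 and max-height 300:
// widthScale = 0.5, heightScale = 0.25, so the result is 0.25 (400x300).
const scale = computeDownscale(
  {width: 1600, height: 1200},
  {maxWidth: 800, maxHeight: 300},
);
```

Because a single factor is applied to both axes, the aspect ratio is preserved, and a source already within the bounds is returned unchanged.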
Resizing leverages Puppeteer's clip.scale (CDP Page.captureScreenshot)
so no new dependencies are introduced. Source dimensions are computed
per capture mode:
- viewport: page.viewport()
- full page: document.documentElement.scrollWidth/scrollHeight via
page.evaluate()
- element (uid): elementHandle.boundingBox()
For element and full-page captures with a downscale clip, the call is
routed through page.screenshot({clip}) so the scale parameter applies.
captureBeyondViewport is left to Puppeteer's default (true when a clip
is set), which preserves correct behavior for elements below the fold
and for full-page captures.
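
The routing described above can be sketched as a pure function that decides which options to hand to `page.screenshot()`. The interfaces below are local stand-ins that mirror the shape of Puppeteer's screenshot options so the block needs no imports; `buildScreenshotOptions` is a hypothetical name, and the sketch assumes `clip` and `fullPage` are not passed together.

```typescript
// Local stand-ins mirroring the Puppeteer option shapes used here (assumption:
// the real code would use Puppeteer's own types instead).
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface ClipWithScale extends Box {
  scale?: number;
}

interface ScreenshotOptionsSketch {
  type: 'png' | 'jpeg' | 'webp';
  quality?: number;
  fullPage?: boolean;
  clip?: ClipWithScale;
}

// `source` is the capture area in CSS pixels: the viewport, the full document
// (scrollWidth x scrollHeight at the origin), or an element's bounding box.
function buildScreenshotOptions(
  type: 'png' | 'jpeg' | 'webp',
  quality: number | undefined,
  source: Box,
  scale: number,
  fullPage: boolean,
): ScreenshotOptionsSketch {
  const options: ScreenshotOptionsSketch = {type};
  if (type !== 'png' && quality !== undefined) {
    options.quality = quality; // quality is ignored for PNG
  }
  if (scale < 1) {
    // Downscale path: use a clip carrying the scale and omit fullPage.
    options.clip = {...source, scale};
  } else if (fullPage) {
    options.fullPage = true;
  }
  return options;
}
```

When no downscale is needed the options carry no clip, which matches the goal of leaving the output untouched when the flags are unset.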
Design notes:
- Aligned with the "Reference over Value" principle in
docs/design-principles.md: the existing 2 MB threshold still routes
oversized screenshots to a temporary file. This change only reduces
the size of the inline base64 fallback path, which the principles
document calls out as an acceptable exception when MCP clients
display images natively.
- Fully opt-in: when no flags are set, take_screenshot returns the
exact same bytes as before. No breaking change.
- The MCP server hardcodes no LLM-specific size limits — operators
pick the values that match their client/model combination. This
keeps the maintenance surface minimal as model limits evolve and
is intended as a complement to, not a replacement for, fixes in
the MCP client itself.
- Compares against CSS pixels (page.viewport()), not raw bitmap
pixels, so HiDPI emulation behaves predictably from the user's
perspective.
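
To illustrate the CSS-pixel note above, here is a rough arithmetic sketch. It assumes the output bitmap is approximately the CSS size multiplied by the clip scale and by `deviceScaleFactor`; that relationship and the name `estimateBitmapSize` are assumptions made for illustration, not something the PR states.

```typescript
// Hypothetical helper: estimates bitmap dimensions under the assumption that
// bitmap size ≈ CSS size * clip scale * deviceScaleFactor.
function estimateBitmapSize(
  cssWidth: number,
  cssHeight: number,
  clipScale: number,
  deviceScaleFactor: number,
): {width: number; height: number} {
  return {
    width: Math.round(cssWidth * clipScale * deviceScaleFactor),
    height: Math.round(cssHeight * clipScale * deviceScaleFactor),
  };
}

// A 1280x720 CSS-pixel viewport capped with --screenshot-max-width=1000
// gives a clip scale of 1000 / 1280 = 0.78125.
const clipScale = 1000 / 1280;

// At deviceScaleFactor 1 the estimate is 1000x563; at 2 it is 2000x1125,
// i.e. wider than the 1000 px cap when measured in bitmap pixels.
const standard = estimateBitmapSize(1280, 720, clipScale, 1);
const hiDpi = estimateBitmapSize(1280, 720, clipScale, 2);
```

Operators who emulate HiDPI displays may therefore want to choose bounds with the device scale factor in mind.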
Tests added (6 new):
- honors screenshotFormat default from CLI args
- keeps "png" as default format when no CLI override is set
- downscales viewport screenshot when screenshotMaxWidth is set
- downscales using the smaller scale when both max-width and
max-height are set
- does not resize when source is smaller than the max bounds
- downscales full page screenshot when screenshotMaxWidth is set
Refs ChromeDevTools#879
f09e1ea to 8c76d72
Rebased on
Ok
On Mon, Jun 1, 2026 at 05:13, Antoine Kingue <
***@***.***> wrote:
… ***@***.**** commented on this pull request.
------------------------------
In src/tools/screenshot.ts
<#1823 (comment)>
:
> - .describe(
- 'The uid of an element on the page from the page content snapshot. If omitted, takes a page screenshot.',
- ),
- fullPage: zod
- .boolean()
- .optional()
- .describe(
- 'If set to true takes a screenshot of the full page instead of the currently visible viewport. Incompatible with uid.',
+type ScreenshotFormat = 'png' | 'jpeg' | 'webp';
+
+function isScreenshotFormat(value: unknown): value is ScreenshotFormat {
+ return value === 'png' || value === 'jpeg' || value === 'webp';
+}
+
+function isPositiveFiniteNumber(value: number | undefined): value is number {
+ return value !== undefined && Number.isFinite(value) && value > 0;
Done. Moved the validation into coerce on each numeric CLI option in
chrome-devtools-mcp-cli-options.ts:
- screenshotQuality: integer in [0, 100]
- screenshotMaxWidth / screenshotMaxHeight: positive integer
Invalid values now throw a clear error at parse time (e.g. Invalid
screenshotMaxWidth -5. Expected a positive integer.), and the tool
consumes the args directly with no re-validation.
Summary

Adds opt-in CLI flags so operators can cap the size of screenshots returned by `take_screenshot` before they are embedded in the MCP response. Refs #879.

The flags address two related symptoms reported when MCP clients display screenshots inline:

1. The hosted LLM API rejects images exceeding its per-image dimension limits (e.g. Anthropic's 8000x8000 px, or 2000x2000 px when more than 20 images are in the same request).
2. After many captures, the cumulative base64 payload pushes the request over the per-call body size limit.

Both can be mitigated at the source by reducing format/quality and downscaling the capture.

New flags (all opt-in)

- `--screenshot-format <jpeg|png|webp>`: override the default format used by `take_screenshot` when the caller does not specify one
- `--screenshot-quality <0-100>`: override the default JPEG/WebP quality. Ignored for PNG
- `--screenshot-max-width <px>`: downscale screenshots wider than this before they are returned
- `--screenshot-max-height <px>`: downscale screenshots taller than this. Combines with `--screenshot-max-width`; the smaller scale wins so both bounds are respected while preserving aspect ratio

For the exact error in #879, the recipe is `--screenshot-max-width=8000 --screenshot-max-height=8000` (or a smaller value such as `2000` if many images may end up in the same request, depending on the operator's chosen API).

Implementation

- Resizing leverages Puppeteer's `clip.scale` (CDP `Page.captureScreenshot`), so no new dependencies.
- Source dimensions are computed per capture mode:
  - viewport: `page.viewport()`
  - full page: `document.documentElement.scrollWidth`/`scrollHeight` via `page.evaluate()`
  - element (`uid`): `elementHandle.boundingBox()`
- For element and full-page captures with a downscale clip, the call is routed through `page.screenshot({clip})` so the scale parameter applies. `captureBeyondViewport` is left to Puppeteer's default (`true` when a clip is set), preserving correct behavior for elements below the fold and full-page captures.

Backwards compatibility

Fully opt-in: when no flags are set, `take_screenshot` returns the exact same bytes as before. No behavioral change for existing users.

Design alignment

- Aligned with the "Reference over Value" principle in `docs/design-principles.md`: the existing 2 MB threshold still routes oversized screenshots to a temporary file. This change only reduces the size of the inline base64 fallback path, which the principles document calls out as an acceptable exception when MCP clients display images natively.

Addressing concerns raised in #879

The flags are pure parameters; nothing about the upstream LLM is encoded in the server. When a vendor raises (or lowers) a limit, no code change is needed here, only the operator's CLI args change.

`filePath` is great when the call site knows it's about to take a huge screenshot, but as you noted earlier in the thread, an oversized image already in the request history keeps causing failures even on subsequent calls.

`page_resize` works but mutates the page being debugged. The resize in this PR happens between Puppeteer and the MCP response, so the inspected page is untouched and the failure mode is prevented at the source.

Agreed, this PR is intended as a complement, not a substitute. A client-side fix (e.g. compaction evicts/downsamples old images) handles the cumulative case for any MCP. A server-side cap handles the per-call dimension limit for users who hit it before compaction can kick in. The two address overlapping but distinct failure modes.

Happy to drop or rework any of this if the maintainers prefer a different shape, for example making the threshold automatic from a single `--max-image-bytes` knob, or rejecting the PR entirely in favor of waiting for a client-side fix. Just wanted to put a concrete option on the table.

Tests

Added 6 new tests:

- honors screenshotFormat default from CLI args
- keeps "png" as default format when no CLI override is set
- downscales viewport screenshot when screenshotMaxWidth is set
- downscales using the smaller scale when both max-width and max-height are set
- does not resize when source is smaller than the max bounds
- downscales full page screenshot when screenshotMaxWidth is set

All 627 tests in the suite pass. `npm run typecheck` and `npm run check-format` are clean.

Notes for reviewers

- `--screenshot-max-width`/`--screenshot-max-height` are CSS pixels (`page.viewport()`), not raw bitmap pixels. With `deviceScaleFactor > 1` (HiDPI emulation) the actual bitmap may still be larger. Happy to clarify this in the option description if preferred.
- Downscaled element captures go through `page.screenshot({clip})` instead of `element.screenshot()`. Same-frame elements are correct (boundingBox returns main-frame coords). I have not exercised this path against cross-origin iframe elements; let me know if you'd like a fallback there.

Refs #879