Stop Chrome downloading files a crawl links to by tamnd · Pull Request #33 · tamnd/kage

tamnd · 2026-06-16T02:41:57Z

An extensionless link was queued as a page, so the page worker navigated to it in headless Chrome. When the link served a downloadable file such as a zip or a CSV, Chrome saved the file to the user's Downloads folder, a surprising side effect of running a clone. This is issue #32.

Two layers of fix:

1. Deny Chrome-initiated downloads browser-wide. kage fetches every asset through its own downloader and never needs the browser to write a file, so a Chrome download is only ever an accident.
2. Watch the main document's response and, when it is not HTML, return a typed ErrNotHTML. The page worker catches it and reroutes the URL to the asset downloader, where the existing size and media policy decides whether to localise it or leave it on the live web. This second layer gives defense in depth: it covers the case even if the deny call is unsupported on some Chrome build.

Verified manually against the two URLs from the issue. The zip and the CSV both land under the mirror's reserved tree as assets, and nothing is written to ~/Downloads.

Tests added: a unit table for the HTML content-type check, a browser integration test that asserts a zip and a CSV come back as ErrNotHTML while an HTML page still renders, and a clone integration test that a linked non-HTML target is fetched as an asset rather than saved as a page.

Refs #32

An extensionless link was queued as a page, so the page worker navigated to it in headless Chrome. When such a link served a downloadable file such as a zip or a CSV, Chrome saved the file to the user's Downloads folder, a surprising side effect of a clone (issue #32). Deny Chrome-initiated downloads browser-wide, since kage fetches every asset through its own downloader and never needs the browser to write a file. Then watch the main document's response, and when it is not HTML, return a typed ErrNotHTML so the page worker reroutes the URL to the asset downloader, where the existing size and media policy decides whether to localise it or leave it on the live web. Verified against the two URLs from the issue, a zip and a CSV: both land under the mirror's reserved tree and nothing is written to Downloads.

tamnd merged commit 5cbb7f8 into main Jun 16, 2026
9 checks passed

tamnd deleted the fix-download-deny branch June 16, 2026 03:13

tamnd mentioned this pull request Jun 16, 2026

Polution of user's "Downloads" folder #32

Open


tamnd merged 1 commit into main from fix-download-deny