Fix /news failing when Chromium binary is not installed #30
Adding .gitkeep for PR creation (default mode). This file will be removed when the task is complete. Issue: Payel-git-ol#29
chromium.executablePath() returns the expected binary path without verifying the file exists, so detectPlaywrightFetcher chose the Playwright fetcher even when 'npx playwright install' was never run. Every source then failed at launch with "browserType.launch: Executable doesn't exist at ...".

- Verify the Chromium binary exists on disk before selecting Playwright; otherwise use the HTTP fallback fetcher.
- Add ResilientNewsPageFetcher that switches to the HTTP fallback at runtime if launching the browser still fails, so a single missing browser no longer fails every source.
- Add tests covering the launch-failure fallback and non-launch error propagation; clarify README wording.

Fixes Payel-git-ol#29
Working session summary
The work is complete. Here's a summary of what I did.
Summary
Issue #29: running
Root cause
Fix (
🤖 Solution Draft Log
This log file contains the complete execution trace of the AI solution draft process.
💰 Cost: $1.677268
📊 Context and tokens usage:
Total: (2.9K new + 55.8K cache writes + 2.0M cache reads) input tokens, 13.6K output tokens, $1.677268 cost
🤖 Models used:
📎 Log file uploaded as Gist (1205KB)
The working session has now ended; feel free to review and add any feedback on the solution draft.
✅ Ready to merge
This pull request is now ready to be merged:
Monitored by hive-mind with --auto-restart-until-mergeable flag
This reverts commit b3803ca.
Summary
Fixes #29: `/news` reported every source as `failed: browserType.launch: Executable doesn't exist at ...` and prompted to run `npx playwright install`, collecting 0 items from all 51 sources.

Root cause
`detectPlaywrightFetcher()` decided whether to use the headless-browser fetcher or the plain-HTTP fallback from the value of `chromium.executablePath()` alone.

`chromium.executablePath()` returns the expected location of the Chromium binary; it does not verify that the file is actually present. On a machine where `npx playwright install` was never run (the user's Windows setup), it returns a non-empty path to a missing file, so detection wrongly picked the Playwright fetcher, and every source then failed at `chromium.launch()` time. The HTTP fallback that already existed in the codebase was never reached.

Fix
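A minimal sketch of the existence check behind this fix, not the code from this PR. The names `detectFetcherKind`, `FetcherKind`, and `DetectionDeps` are hypothetical, and the two dependencies are injected so the logic can run without Playwright installed; in the real code they would presumably be `chromium.executablePath()` and `existsSync` from `node:fs`.

```typescript
// Hypothetical names throughout; only the shape of the check mirrors the PR text.
type FetcherKind = "playwright" | "http";

interface DetectionDeps {
  // Stand-in for chromium.executablePath(): returns the *expected* binary path.
  executablePath: () => string;
  // Stand-in for existsSync from node:fs.
  exists: (path: string) => boolean;
}

function detectFetcherKind(deps: DetectionDeps): FetcherKind {
  let path: string;
  try {
    path = deps.executablePath();
  } catch {
    // Defensive assumption: if no path can be resolved, treat the browser as unavailable.
    return "http";
  }
  // A non-empty path only says where the binary should be, so confirm the
  // file is actually on disk before choosing the browser-based fetcher.
  return path !== "" && deps.exists(path) ? "playwright" : "http";
}
```

With this shape, a machine where the path resolves but the file is missing selects the HTTP fetcher up front instead of failing at launch.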
- Verify the Chromium binary exists on disk (`existsSync(chromium.executablePath())`) before selecting Playwright; otherwise use `HttpNewsPageFetcher`.
- `ResilientNewsPageFetcher` wraps the Playwright fetcher and switches to the HTTP fallback the first time a browser-launch error occurs, so a partially-installed browser no longer fails every source.

Tests
- `ResilientNewsPageFetcher` falls back to HTTP fetching on a launch error and does not retry the broken browser on subsequent calls.
- `ResilientNewsPageFetcher` rethrows non-launch errors (e.g. `net::ERR_CONNECTION_REFUSED`) instead of masking them.

All 128 tests pass; `npm run typecheck` and `npm run build` are clean.

How to reproduce / verify
On a machine without the Chromium binary installed, run `/news`. Before this change every source failed with `Executable doesn't exist`; after it, the crawler transparently uses the HTTP fetcher and collects items from static pages.
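The runtime fallback described in this PR can be sketched roughly as follows. This is an illustration, not the actual `ResilientNewsPageFetcher`: the `PageFetcher` interface, the class name `ResilientFetcherSketch`, and the way `isBrowserLaunchError` recognises a launch failure by message text are all assumptions.

```typescript
// Hypothetical minimal fetcher interface; the real one may differ.
interface PageFetcher {
  fetch(url: string): Promise<string>;
}

// Assumption: a launch failure is recognised by the error message text
// quoted in the issue ("browserType.launch: Executable doesn't exist at ...").
function isBrowserLaunchError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  return message.includes("browserType.launch") || message.includes("Executable doesn't exist");
}

class ResilientFetcherSketch implements PageFetcher {
  private readonly primary: PageFetcher;
  private readonly fallback: PageFetcher;
  private useFallback = false;

  constructor(primary: PageFetcher, fallback: PageFetcher) {
    this.primary = primary;
    this.fallback = fallback;
  }

  async fetch(url: string): Promise<string> {
    // Once the browser has failed to launch, never try it again.
    if (this.useFallback) return this.fallback.fetch(url);
    try {
      return await this.primary.fetch(url);
    } catch (error) {
      // Errors unrelated to launching (e.g. a refused connection) are rethrown, not masked.
      if (!isBrowserLaunchError(error)) throw error;
      this.useFallback = true;
      return this.fallback.fetch(url);
    }
  }
}
```

Remembering the failure in a flag means a missing browser costs one failed launch in total, rather than one per source.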