Conversation
Uses an on-device VLM (FastVLM-0.5B via WebGPU) to read book spines from the camera and match them against the want-to-read list. Works offline after the initial model download.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
onnxruntime-node (143 MB) and @img/sharp (16 MB) are transitive deps of @huggingface/transformers that are only needed for Node.js inference, not browser WebGPU. A post-build cleanup removes them from the Nitro function output to stay under Vercel's 250 MB limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
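A cleanup step of this kind can be sketched as a small Node script. The script name, output directory, and function are illustrative assumptions, not the repo's actual code:

```typescript
// prune-server-output.ts — hypothetical post-build step.
// Removes Node-only transitive deps (onnxruntime-node, @img/sharp) from the
// serverless bundle; browser WebGPU inference never imports them.
import { existsSync, rmSync } from "node:fs";
import { join } from "node:path";

// Assumed Nitro output location; adjust to your build config.
const serverNodeModules = join(".output", "server", "node_modules");

export function pruneServerOutput(baseDir: string = serverNodeModules): string[] {
  const doomed = ["onnxruntime-node", join("@img", "sharp")];
  const removed: string[] = [];
  for (const dep of doomed) {
    const target = join(baseDir, dep);
    if (existsSync(target)) {
      // Recursive delete of the whole package directory.
      rmSync(target, { recursive: true, force: true });
      removed.push(target);
    }
  }
  return removed;
}
```

Wired into package.json as a `postbuild` script, it runs after every `npm run build` and is a no-op when the directories are absent.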
- Use ref callback to attach media stream when video element mounts, fixing AbortError when camera starts after model download
- Add per-file download progress (filename + percentage) to model loading UI via HuggingFace progress_callback

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FastVLM-0.5B crashed browser tabs (OOM at ~605 MB). Switch to tesseract.js, a lightweight WASM-based OCR engine (~6 MB total) that runs in any browser. The serverless function drops from 250 MB+ to 13.5 MB.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Init Tesseract worker on page mount instead of during scan
- Split scanBookshelf into captureFrame + recognizeFrame so the frame is captured before React re-render swaps the video element
- Share single <video> element across camera-active and scanning states to prevent unmount/remount losing video dimensions
- Add no-force-push constraint to CLAUDE.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tesseract struggles with rotated/angled book spine text. Switch to PP-OCRv4 via @gutenye/ocr-browser, which uses a DB text detection model that handles text at any angle. Returns structured results with confidence scores instead of raw text.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite bookshelf scanner from single-shot to continuous scanning that accumulates de-duped results. Add character bigram similarity for better OCR typo tolerance. Fix iOS Chrome hang by disabling WASM multi-threading.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
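Character-bigram similarity of the kind described can be sketched as a Dice coefficient over bigram counts (function names are illustrative, not the PR's actual code). A single misread character only breaks the two bigrams touching it, so one OCR typo barely moves the score:

```typescript
// Count character bigrams of a normalized string, e.g. "hail" -> ha, ai, il.
function bigrams(s: string): Map<string, number> {
  const counts = new Map<string, number>();
  const t = s.toLowerCase().replace(/\s+/g, " ").trim();
  for (let i = 0; i < t.length - 1; i++) {
    const b = t.slice(i, i + 2);
    counts.set(b, (counts.get(b) ?? 0) + 1);
  }
  return counts;
}

// Dice coefficient: 2 * |shared bigrams| / (|bigrams(a)| + |bigrams(b)|).
export function bigramSimilarity(a: string, b: string): number {
  const ba = bigrams(a);
  const bb = bigrams(b);
  let overlap = 0;
  let total = 0;
  for (const [bg, n] of ba) {
    total += n;
    overlap += Math.min(n, bb.get(bg) ?? 0);
  }
  for (const n of bb.values()) total += n;
  if (total === 0) return a === b ? 1 : 0; // both shorter than 2 chars
  return (2 * overlap) / total;
}
```

Compared with token-level Jaccard, this tolerates in-word OCR substitutions ("Hali" vs "Hail"), which a whole-word match would score as zero overlap.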
The previous numThreads fix was in onnxOptions (session-level), but onnxruntime-web decides whether to spawn a web worker at import time. On iOS, the blob-URL worker can't fetch same-origin WASM files (CORS). Setting env.wasm.numThreads=1 globally prevents the worker entirely. Also cache .wasm files in the service worker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
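In onnxruntime-web terms, the global setting looks roughly like this (the exact placement in this app is an assumption; the key point is that it must run before any session exists):

```typescript
import * as ort from "onnxruntime-web";

// Must run before any InferenceSession is created: the WASM backend
// decides whether to spawn its blob-URL web worker based on this flag,
// so a per-session option is already too late.
ort.env.wasm.numThreads = 1;
```

With a single thread there is no worker, so the iOS blob-URL/CORS fetch path is never exercised.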
Wrap globalThis.fetch during Ocr.create() to intercept model downloads and pipe them through a progress-tracking ReadableStream. Data flows through once, with no extra memory copies. UI shows per-model download progress (e.g. "Downloading recognition model... 5.2/10.0 MB (52%)"). Also cache .wasm and .onnx files in the service worker for faster subsequent loads on production.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
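The pass-through progress stream can be sketched like this. The PR wraps globalThis.fetch; here it is shown as a standalone helper, and the callback shape is an assumption:

```typescript
// Observe a Response body as it streams by, reporting cumulative bytes.
// Chunks are re-enqueued as-is: observed, never copied.
type OnProgress = (loaded: number, total: number) => void;

export function withProgress(res: Response, onProgress: OnProgress): Response {
  if (!res.body) return res;
  const total = Number(res.headers.get("Content-Length") ?? 0);
  let loaded = 0;
  const tracked = res.body.pipeThrough(
    new TransformStream<Uint8Array, Uint8Array>({
      transform(chunk, controller) {
        loaded += chunk.byteLength;
        onProgress(loaded, total);
        controller.enqueue(chunk); // same buffer, no extra copy
      },
    }),
  );
  return new Response(tracked, { status: res.status, headers: res.headers });
}
```

Wrapping fetch is then a matter of saving the original, calling it, and returning `withProgress(response, callback)` for model URLs, with the original restored once Ocr.create() resolves.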
- Camera: 1280x720 instead of 1920x1080 (less video memory)
- Frame capture: 640px max instead of 1280px (plenty for OCR text)
- Blob URL instead of base64 data URL (eliminates 33% encoding overhead)
- JPEG at 0.8 quality instead of PNG (much smaller frame blobs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nail

Disable WASM memory arena and allocation pattern caching to lower peak memory during OCR inference. Drop capture resolution to 480px and camera to 640x480. Replace full-width video with compact thumbnail so results are visible while scanning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
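In onnxruntime-web session options, those two flags look roughly like this (the model path is a placeholder, and whether the library applies them here exactly as sketched is an assumption):

```typescript
import * as ort from "onnxruntime-web";

// Trade some throughput for lower peak memory: no arena growth,
// no cached allocation plans reused between inference runs.
const session = await ort.InferenceSession.create("/models/recognition.onnx", {
  enableCpuMemArena: false,
  enableMemPattern: false,
});
```

Both are standard ONNX Runtime session options; disabling them matters most on memory-constrained mobile browsers where the arena's high-water mark otherwise persists between scans.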
Summary

Adds a /books/scan tab that uses the device camera + an on-device Vision Language Model (FastVLM-0.5B via WebGPU) to read book spines and match them against the want-to-read list.

New files

- lib/vlm-scanner.ts — VLM model loading/inference with dynamic imports and WebGPU support check
- lib/book-matcher.ts — Jaccard-similarity fuzzy matching of extracted titles against want-to-read list
- src/app/books/scan.tsx — Scan route with camera lifecycle, hydration guard, and match display

Modified files

- src/app/books.tsx — 5th "Scan" tab
- public/sw.js — Preserve HuggingFace transformers caches on SW update
- package.json — @huggingface/transformers, @webgpu/types
- tsconfig.json, eslint.config.js — WebGPU types and browser globals

Test plan

- /books/scan — idle state with "Enable Camera" and "Pre-load Model" buttons
- npm run build succeeds, npm run lint passes

🤖 Generated with Claude Code