feat: Google Drive upload history#2305

Merged
aalemayhu merged 5 commits into main from feat/google-drive-upload-history on May 15, 2026

aalemayhu commented May 15, 2026

What

Adds a "From Google Drive" section to the Downloads page below "From Dropbox". Authenticated users see the files they've picked from Drive with file icon, formatted size, relative time since last conversion, an "Open in Drive ↗" link, and a × to remove the row from history. The Drive file itself is never touched.

A last_converted_at timestamptz column lands on google_drive_uploads via an additive migration, and the existing saveFiles upsert UPDATE branch now refreshes it on every re-conversion so the recency sort reflects the user's actual conversion history, not the first time they picked the file.
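
To illustrate why the UPDATE branch matters, here is a minimal in-memory sketch of the upsert and the recency sort. It is not the repository code: `UploadRow`, `upsertUpload`, and `byRecency` are hypothetical names, the row is reduced to three fields, and the real implementation runs against Postgres.

```typescript
// Hypothetical row shape; only the fields needed for the illustration.
interface UploadRow {
  id: string;
  owner: number;
  last_converted_at: Date | null; // null for rows that predate the migration
}

// Models the saveFiles upsert: insert a new row, or refresh the timestamp
// of an existing one (the UPDATE branch).
function upsertUpload(rows: UploadRow[], id: string, owner: number, now: Date): UploadRow[] {
  const exists = rows.some((row) => row.id === id && row.owner === owner);
  if (!exists) {
    return [...rows, { id, owner, last_converted_at: now }];
  }
  // Without this refresh, a re-converted file would keep its first-pick time.
  return rows.map((row) =>
    row.id === id && row.owner === owner ? { ...row, last_converted_at: now } : row,
  );
}

// Comparator equivalent to ORDER BY last_converted_at DESC NULLS LAST.
function byRecency(a: UploadRow, b: UploadRow): number {
  const aTime = a.last_converted_at;
  const bTime = b.last_converted_at;
  if (aTime === null && bTime === null) return 0;
  if (aTime === null) return 1;
  if (bTime === null) return -1;
  return bTime.getTime() - aTime.getTime();
}
```

In this model a file picked first and converted again later moves to the top of the list, while rows with no timestamp stay at the bottom.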

Why

The google_drive_uploads table has been write-only since August 2024 — Drive users have repeatedly re-picked the same file from the Drive Picker because nothing surfaced their history. This closes the loop for returning Drive users, the same way #2300 did for Dropbox. Target: +5pp return-upload rate among users with google_drive_uploads rows within 4 weeks. Phase-2 gate: ≥15% of section viewers click "Open in Drive" before we invest in one-click re-convert.

How

  • Migration: additive last_converted_at timestamptz default now() (nullable so historical rows keep NULL and sort last) + owner index.
  • GoogleDriveRepository.getByOwner: selects only the columns the UI needs, filters mimeType != 'application/vnd.google-apps.folder', orders by last_converted_at DESC NULLS LAST.
  • GoogleDriveRepository.deleteByIdAndOwner(id: string, owner: number): parameterized; controller pre-validates the string PK with /^[A-Za-z0-9_-]+$/ so anything outside the Drive ID alphabet is rejected at the boundary.
  • GoogleDriveRepository.saveFiles upsert UPDATE path now sets last_converted_at = now() — otherwise repeat converters would still see their first-upload timestamp.
  • GetGoogleDriveUploadsUseCase + DeleteGoogleDriveUploadUseCase: map rows to an explicit typed response (no raw DB rows through res.json).
  • UploadController.getGoogleDriveUploads + .deleteGoogleDriveUpload: owner always from res.locals; URL/query/body never trusted for owner.
  • Routes: GET /api/upload/google_drive/mine and DELETE /api/upload/google_drive/mine/:id behind RequireAuthentication.
  • Web: useGoogleDriveUploads hook, GoogleDriveHistorySection, GoogleDriveHistoryEntry. Section hidden when empty; error state matches VOICE.md. Icon src is sanitized to a small allowlist of Google CDNs (drive-thirdparty.googleusercontent.com, ssl.gstatic.com, lh3.googleusercontent.com) with onError swap to /icons/file-generic.svg. The "Open in Drive" href is sanitized to drive.google.com and docs.google.com https origins; any other URL becomes a disabled "Link unavailable" span. Size formatter handles Kanel's string type for sizeBytes.
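
The boundary checks in the delete path can be sketched as a pure function. This is an assumption-laden illustration rather than the controller itself: `validateDeleteRequest` and the `Validation` type are hypothetical, and the `owner` key on `res.locals` is assumed (the description only says the owner comes from `res.locals`). The regex and the 400/401 outcomes follow the description above.

```typescript
// Drive ID alphabet, as stated in the PR description.
const DRIVE_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

// Hypothetical result type: either a validated (id, owner) pair or an HTTP status.
type Validation =
  | { ok: true; id: string; owner: number }
  | { ok: false; status: 400 | 401 };

// params mirrors req.params and locals mirrors res.locals. The owner is never
// read from the URL, query, or body.
function validateDeleteRequest(
  params: { id?: unknown },
  locals: { owner?: unknown },
): Validation {
  const owner = locals.owner;
  if (typeof owner !== "number") {
    return { ok: false, status: 401 }; // no authenticated owner
  }
  const id = params.id;
  if (typeof id !== "string" || !DRIVE_ID_PATTERN.test(id)) {
    return { ok: false, status: 400 }; // missing id or outside the Drive ID alphabet
  }
  return { ok: true, id, owner };
}
```

Only a result with `ok: true` would be passed on to `deleteByIdAndOwner(id, owner)`, which stays parameterized regardless of this pre-check.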

Open question resolutions:

  • Q1 (url safety): Trusted only after origin allowlist + https: check at render time. No OAuth tokens have ever been stored in the url column historically; the allowlist is defense in depth.
  • Q2 (owner index): Added in the migration — missing on the original 2024 create migration.
  • Q3 (paywall): History available to all authenticated users — matches the Dropbox phase-1 decision so the Downloads page stays consistent.
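
For Q1, one way to express the render-time allowlist is to compare the parsed URL's origin against a fixed list, which covers the scheme and the host in a single check. This is a sketch under assumptions: the function and constant names are hypothetical, and the component may implement the check differently. The hosts and the fallback icon path are the ones listed in this description.

```typescript
const DRIVE_LINK_ORIGINS = ["https://drive.google.com", "https://docs.google.com"];
const ICON_ORIGINS = [
  "https://drive-thirdparty.googleusercontent.com",
  "https://ssl.gstatic.com",
  "https://lh3.googleusercontent.com",
];
const FALLBACK_ICON = "/icons/file-generic.svg";

// True only when raw parses as a URL whose origin is in the allowlist.
function hasAllowedOrigin(raw: string | null | undefined, origins: readonly string[]): boolean {
  if (!raw) return false;
  try {
    return origins.indexOf(new URL(raw).origin) !== -1;
  } catch {
    return false; // not a parseable absolute URL
  }
}

// Returns the URL when it is safe to render as a link, otherwise null
// (the caller would render the disabled "Link unavailable" span).
function safeDriveHref(raw: string | null | undefined): string | null {
  return raw && hasAllowedOrigin(raw, DRIVE_LINK_ORIGINS) ? raw : null;
}

// Returns the icon URL when it comes from an allowed CDN, otherwise the generic icon.
function safeIconSrc(raw: string | null | undefined): string {
  return raw && hasAllowedOrigin(raw, ICON_ORIGINS) ? raw : FALLBACK_ICON;
}
```

Comparing origins rather than substrings means lookalike hosts such as `drive.google.com.evil.example`, non-https schemes, and `javascript:` URLs all fall through to the fallback.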

Measuring success

GET /api/upload/google_drive/mine returning 200s in server logs for authenticated sessions. Secondary: return-upload rate among users with google_drive_uploads rows shows +5pp within 4 weeks; ≥15% of section viewers click "Open in Drive" in their first session (validation gate for phase 2 / re-convert).

Testing

  • Server (Jest, all green): GoogleDriveRepository.test.ts (6 — owner guards on get+delete, folder exclusion, last_converted_at ordering, parameterized id with crafted string), GetGoogleDriveUploadsUseCase.test.ts (3 — shape mapping that drops owner/description/embedUrl, passthrough, empty), DeleteGoogleDriveUploadUseCase.test.ts (3 — delegation, 404 on 0 deleted, success), GoogleDriveController.test.ts (8 — 401 guards, 400 on missing/invalid string id, 404 on missing row, 200 with clean id).
  • Web (Vitest, all green): useGoogleDriveUploads.test.tsx (4 — loading→empty, populated with hasMore, error path, deleteUpload by string key). DownloadsPage.test.tsx updated to mock the new hook.
  • /check: server tsc clean, web typecheck clean, 476 Vitest tests pass, Biome lint clean.
  • Full server suite: 1214 tests passing under src/.

Risks

  • last_converted_at is not yet reflected in Kanel-generated GoogleDriveUploads.ts. The repository defines its own GoogleDriveUploadRow type for the read path, and the UPDATE in saveFiles uses database.fn.now() directly — no Kanel-typed write field. pnpm kanel should be run on the dev DB after the migration applies to refresh the type.
  • Rollback: the migration has a clean down (drop index then column). The API routes are additive — removing them is safe.

Sonar

Sonar scanner not run locally (token not configured in this environment) — flagging for reviewers so a Sonar bounce isn't a surprise.

Goal alignment

Direct line to the 300K-user goal: re-conversions are the fastest way to grow weekly conversions per user, and the Drive Picker step is the dominant friction for returning users with embedded slides / Docs. The feature is intentionally read-first so we can validate the click-through gate before building the (harder) OAuth re-auth re-convert.


Trio synthesis (from original spec PR #2214)
  • PM: Read-only history list with "Open in Drive" link; "Convert again" deferred (OAuth token not stored); gate on ≥15% click-through before building re-convert.
  • Designer: Fourth section on /downloads, below "From Dropbox", using a table layout (File / Size / Added / Actions) matching FinishedJobs; hide when empty; sanitize url to Drive origins only.
  • Engineer: S+ effort; string PK needs regex validation; upsert semantics make created_at alone misleading — last_converted_at is the right recency signal; sizeBytes arrives as string from Kanel.
  • Conflict: PM suggested sorting by lastEditedUtc DESC; Engineer chose last_converted_at. Resolution: last_converted_at — Drive's edit time is the file owner's clock, not the user's conversion history.
  • Resolution: PR 1 (this PR) ships read + delete + last_converted_at migration. PR 2 (re-convert) deferred until phase-1 metrics clear the gate.
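
The Engineer's note that sizeBytes arrives as a string can be handled with a formatter along these lines. This is a hypothetical sketch, not the shipped formatter: the name `formatSize`, the 1024-based units, the one-decimal rounding, and the empty-string fallback for unusable input are all assumptions.

```typescript
// Formats a byte count that may arrive as a string (bigint columns are typed
// as string by Kanel) or as a number. Returns "" when the value is unusable.
function formatSize(sizeBytes: string | number | null | undefined): string {
  if (sizeBytes === null || sizeBytes === undefined || sizeBytes === "") return "";
  const bytes = typeof sizeBytes === "string" ? Number(sizeBytes) : sizeBytes;
  if (!isFinite(bytes) || bytes < 0) return "";

  const units = ["B", "KB", "MB", "GB", "TB"];
  let value = bytes;
  let unit = 0;
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  // Whole bytes need no decimals; larger units show one.
  const digits = unit === 0 ? 0 : 1;
  return `${value.toFixed(digits)} ${units[unit]}`;
}
```

Converting through `Number` loses precision above 2^53 bytes, which should not matter for a display string.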

aalemayhu and others added 5 commits May 15, 2026 23:58
The google_drive_uploads table has been write-only since August 2024.
This spec defines a two-phase delivery: PR 1 surfaces the history on
the Downloads page (read + delete + last_converted_at migration); PR 2
adds one-click re-convert once the OAuth re-auth story is worked out.
Adds GET /api/upload/google_drive/mine and DELETE /api/upload/google_drive/mine/:id
guarded by RequireAuthentication. Owner is read from res.locals; controller
validates the string PK against /^[A-Za-z0-9_-]+$/ before reaching the repository.

Migration adds a nullable last_converted_at timestamptz to google_drive_uploads
plus an owner index. The saveFiles upsert UPDATE branch now sets last_converted_at
= now() so the recency sort reflects re-conversions, not first upload.

GoogleDriveRepository.getByOwner excludes rows whose mimeType is
'application/vnd.google-apps.folder' and orders by last_converted_at DESC NULLS LAST.
The use case maps rows to an explicit typed response shape (no raw DB rows through
res.json).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Renders a table (File / Size / Added / Actions) below the Dropbox section.
Hidden when empty. Each row: file icon with onError fallback to a generic
SVG, filename, formatted size (handles Kanel's string bigint), relative
time from last_converted_at, "Open in Drive ↗" external link, × remove.

Icon and link sources are both sanitized — only Google CDN hosts render
inline; only drive.google.com and docs.google.com URLs become live links.
Anything else falls back to the generic icon or a disabled "Link unavailable"
span. List defaults to 10 newest with "Show older Google Drive files" to
expand by 20.
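
The "10 newest, expand by 20" behaviour can be sketched as two small helpers. The names and the client-side slicing are assumptions for illustration; the section may compute its visible rows and hasMore flag differently.

```typescript
const INITIAL_VISIBLE = 10; // rows shown before the user expands the list
const EXPAND_STEP = 20; // rows added per "Show older Google Drive files" click

// Returns the rows to render and whether the "show older" control is needed.
function visibleSlice<T>(items: readonly T[], visibleCount: number): { visible: T[]; hasMore: boolean } {
  return {
    visible: items.slice(0, visibleCount),
    hasMore: items.length > visibleCount,
  };
}

// Next visible count after one click on "Show older Google Drive files".
function showOlder(visibleCount: number): number {
  return visibleCount + EXPAND_STEP;
}
```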

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Every user-visible PR ships its What's New line in the same PR — that way
the page is current the moment the feature lands, and the entry doesn't
get forgotten in a later backfill.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The repo policy keeps Documentation/specs/ small — specs live only while
in flight. The spec text remains recoverable from this branch's history
via `git log -p -- Documentation/specs/google-drive-upload-history.md`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aalemayhu merged commit b791895 into main on May 15, 2026
9 checks passed
aalemayhu deleted the feat/google-drive-upload-history branch on May 15, 2026 22:16
aalemayhu added a commit that referenced this pull request May 15, 2026
Adds a third "Google Drive" tab to the upload form alongside "Your computer"
and "Dropbox". Clicking "Choose from Google Drive" opens the Google Picker
(loaded on demand from Google's CDN — no npm dep), requests an access token
scoped to drive.file (per-file authorization, narrowest possible), and POSTs
the picked file + bearer token to the existing POST /api/upload/google_drive
endpoint. The same conversion + downloads flow as Dropbox runs from there.

The new useGooglePicker hook lazy-loads both apis.google.com/js/api.js
(gapi/Picker) and accounts.google.com/gsi/client (Google Identity Services
token model), then opens the Picker inside the token callback — opening it
outside would race the token initialization and silently fail.

Tab is gated on REACT_APP_GOOGLE_CLIENT_ID and REACT_APP_GOOGLE_API_KEY being
present. When either is missing the tab is hidden, so deploys without the
env vars degrade gracefully to the existing two-tab layout. Server side
needs no changes — the endpoint, repository, and migration shipped in #2300
and #2305.

Closes the empty "From Google Drive" section on Downloads that #2305 added —
without this PR, nothing populates the google_drive_uploads table from the
web app.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
aalemayhu added a commit that referenced this pull request May 15, 2026
## What
Adds `https://apis.google.com` and `https://accounts.google.com` to
the `script-src` directive of the CSP meta tag in `web/index.html`.
Required for the Google Drive upload tab from #2306 to function.

## Why
Browser console on prod after deploying #2306 showed:

```
Loading the script 'https://apis.google.com/js/api.js' violates the
following Content Security Policy directive: "script-src 'self'
'unsafe-inline' https://www.googletagmanager.com
https://www.google-analytics.com https://static.hotjar.com
https://script.hotjar.com https://www.dropbox.com"
```

Same violation for `accounts.google.com/gsi/client`. Both are required
by the Picker / GIS flow that `useGooglePicker` lazy-loads.

## How
One-line addition to the existing CSP meta tag's `script-src`. No
other directives needed — frame/connect/img directives are not set on
this site, so the Picker iframe and XHR calls were never blocked.

## Testing
- Local: rebuild + smoke-test the upload form. Tab opens the Picker.
- Prod (after rebuild): browser console shows no CSP violations for the
two Google origins.

## Risks
- Two new third-party script origins in script-src — both are
Google-owned and serve the official Picker SDK; this is the same trust
boundary as the existing Dropbox SDK allowance (`www.dropbox.com`).
- Rollback: revert this commit. The CSP returns to its prior shape; the
Drive tab silently fails to load Picker, same as before this fix.

## Goal alignment
Unblocks #2306, which closes the empty "From Google Drive" history
section #2305 introduced.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>