Skip to content

feat: Google Drive upload history#2214

Closed
aalemayhu wants to merge 1 commit into
mainfrom
docs/spec-google-drive-upload-history
Closed

feat: Google Drive upload history#2214
aalemayhu wants to merge 1 commit into
mainfrom
docs/spec-google-drive-upload-history

Conversation

@aalemayhu
Copy link
Copy Markdown
Contributor

@aalemayhu aalemayhu commented May 14, 2026

What

Adds a "From Google Drive" section to the Downloads page below "From Dropbox". Authenticated users see the files they've picked from Drive with file icon, formatted size, relative time since last conversion, an "Open in Drive ↗" link, and a × to remove the row from history. The Drive file itself is never touched.

A last_converted_at timestamptz column lands on google_drive_uploads via an additive migration, and the existing saveFiles upsert UPDATE branch now refreshes it on every re-conversion so the recency sort reflects the user's actual conversion history, not the first time they picked the file.

Why

The google_drive_uploads table has been write-only since August 2024 — Drive users have repeatedly re-picked the same file from the Drive Picker because nothing surfaced their history. This closes the loop for returning Drive users, the same way #2300 did for Dropbox. Target: +5pp return-upload rate among users with google_drive_uploads rows within 4 weeks. Phase-2 gate: ≥15% of section viewers click "Open in Drive" before we invest in one-click re-convert.

How

  • Migration: additive last_converted_at timestamptz default now() (nullable so historical rows keep NULL and sort last) + owner index.
  • GoogleDriveRepository.getByOwner: selects only the columns the UI needs, filters mimeType != 'application/vnd.google-apps.folder', orders by last_converted_at DESC NULLS LAST.
  • GoogleDriveRepository.deleteByIdAndOwner(id: string, owner: number): parameterized; controller pre-validates the string PK with /^[A-Za-z0-9_-]+$/ so anything outside the Drive ID alphabet is rejected at the boundary.
  • GoogleDriveRepository.saveFiles upsert UPDATE path now sets last_converted_at = now() — otherwise repeat converters would still see their first-upload timestamp.
  • GetGoogleDriveUploadsUseCase + DeleteGoogleDriveUploadUseCase: map rows to an explicit typed response (no raw DB rows through res.json).
  • UploadController.getGoogleDriveUploads + .deleteGoogleDriveUpload: owner always from res.locals; URL/query/body never trusted for owner.
  • Routes: GET /api/upload/google_drive/mine and DELETE /api/upload/google_drive/mine/:id behind RequireAuthentication.
  • Web: useGoogleDriveUploads hook, GoogleDriveHistorySection, GoogleDriveHistoryEntry. Section hidden when empty; error state matches VOICE.md. Icon src is sanitized to a small allowlist of Google CDNs (drive-thirdparty.googleusercontent.com, ssl.gstatic.com, lh3.googleusercontent.com) with onError swap to /icons/file-generic.svg. The "Open in Drive" href is sanitized to drive.google.com and docs.google.com https origins; any other URL becomes a disabled "Link unavailable" span. Size formatter handles Kanel's string type for sizeBytes.

Open question resolutions:

  • Q1 (url safety): Trusted only after origin allowlist + https: check at render time. No OAuth tokens have ever been stored in the url column historically; the allowlist is defense in depth.
  • Q2 (owner index): Added in the migration — missing on the original 2024 create migration.
  • Q3 (paywall): History available to all authenticated users — matches the Dropbox phase-1 decision so the Downloads page stays consistent.

Measuring success

GET /api/upload/google_drive/mine returning 200s in server logs for authenticated sessions. Secondary: return-upload rate among users with google_drive_uploads rows shows +5pp within 4 weeks; ≥15% of section viewers click "Open in Drive" in their first session (validation gate for phase 2 / re-convert).

Testing

  • Server (Jest, all green): GoogleDriveRepository.test.ts (6 — owner guards on get+delete, folder exclusion, last_converted_at ordering, parameterized id with crafted string), GetGoogleDriveUploadsUseCase.test.ts (3 — shape mapping that drops owner/description/embedUrl, passthrough, empty), DeleteGoogleDriveUploadUseCase.test.ts (3 — delegation, 404 on 0 deleted, success), GoogleDriveController.test.ts (8 — 401 guards, 400 on missing/invalid string id, 404 on missing row, 200 with crafted clean id).
  • Web (Vitest, all green): useGoogleDriveUploads.test.tsx (4 — loading→empty, populated with hasMore, error path, deleteUpload by string key). DownloadsPage.test.tsx updated to mock the new hook.
  • /check: server tsc clean, web typecheck clean, 476 Vitest tests pass, Biome lint clean.
  • Full server suite: 1214 tests passing under src/.

Risks

  • last_converted_at is not yet reflected in Kanel-generated GoogleDriveUploads.ts. The repository defines its own GoogleDriveUploadRow type for the read path, and the UPDATE in saveFiles uses database.fn.now() directly — no Kanel-typed write field. pnpm kanel should be run on the dev DB after the migration applies to refresh the type.
  • Rollback: the migration has a clean down (drop index then column). The API routes are additive — removing them is safe.

Sonar

Sonar scanner not run locally (token not configured in this environment) — flagging for reviewers so a Sonar bounce isn't a surprise.

Goal alignment

Direct line to the 300K-user goal: re-conversions are the fastest way to grow weekly conversions per user, and the Drive Picker step is the dominant friction for returning users with embedded slides / Docs. The feature is intentionally read-first so we can validate the click-through gate before building the (harder) OAuth-re-auth re-convert.


Trio synthesis (from original spec PR #2214)
  • PM: Read-only history list with "Open in Drive" link; "Convert again" deferred (OAuth token not stored); gate on ≥15% click-through before building re-convert.
  • Designer: Fourth section on /downloads, below "From Dropbox", using a table layout (File / Size / Added / Actions) matching FinishedJobs; hide when empty; sanitize url to Drive origins only.
  • Engineer: S+ effort; string PK needs regex validation; upsert semantics make created_at alone misleading — last_converted_at is the right recency signal; sizeBytes arrives as string from Kanel.
  • Conflict: PM suggested sorting by lastEditedUtc DESC; Engineer chose last_converted_at. Resolution: last_converted_at — Drive's edit time is the file owner's clock, not the user's conversion history.
  • Resolution: PR 1 (this PR) ships read + delete + last_converted_at migration. PR 2 (re-convert) deferred until phase-1 metrics clear the gate.

The google_drive_uploads table has been write-only since August 2024.
This spec defines a two-phase delivery: PR 1 surfaces the history on
the Downloads page (read + delete + last_converted_at migration); PR 2
adds one-click re-convert once the OAuth re-auth story is worked out.
@sonarqubecloud
Copy link
Copy Markdown

@aalemayhu aalemayhu closed this May 15, 2026
@aalemayhu aalemayhu deleted the docs/spec-google-drive-upload-history branch May 15, 2026 21:57
@aalemayhu aalemayhu changed the title spec: Google Drive upload history feat: Google Drive upload history May 15, 2026
@aalemayhu
Copy link
Copy Markdown
Contributor Author

Superseded by #2305 — the original draft branch was renamed from docs/spec-google-drive-upload-history to feat/google-drive-upload-history and GitHub auto-closed this PR. The new PR carries the full implementation and an updated body. Spec lifecycle commits preserved on the new branch.

aalemayhu added a commit that referenced this pull request May 15, 2026
## What
Adds a "From Google Drive" section to the Downloads page below "From
Dropbox". Authenticated users see the files they've picked from Drive
with file icon, formatted size, relative time since last conversion, an
"Open in Drive ↗" link, and a × to remove the row from history. The
Drive file itself is never touched.

A `last_converted_at timestamptz` column lands on `google_drive_uploads`
via an additive migration, and the existing `saveFiles` upsert UPDATE
branch now refreshes it on every re-conversion so the recency sort
reflects the user's actual conversion history, not the first time they
picked the file.

## Why
The `google_drive_uploads` table has been write-only since August 2024 —
Drive users have repeatedly re-picked the same file from the Drive
Picker because nothing surfaced their history. This closes the loop for
returning Drive users, the same way #2300 did for Dropbox. Target: +5pp
return-upload rate among users with `google_drive_uploads` rows within 4
weeks. Phase-2 gate: ≥15% of section viewers click "Open in Drive"
before we invest in one-click re-convert.

## How
- Migration: additive `last_converted_at timestamptz default now()`
(nullable so historical rows keep NULL and sort last) + `owner` index.
- `GoogleDriveRepository.getByOwner`: selects only the columns the UI
needs, filters `mimeType != 'application/vnd.google-apps.folder'`,
orders by `last_converted_at DESC NULLS LAST`.
- `GoogleDriveRepository.deleteByIdAndOwner(id: string, owner: number)`:
parameterized; controller pre-validates the string PK with
`/^[A-Za-z0-9_-]+$/` so anything outside the Drive ID alphabet is
rejected at the boundary.
- `GoogleDriveRepository.saveFiles` upsert UPDATE path now sets
`last_converted_at = now()` — otherwise repeat converters would still
see their first-upload timestamp.
- `GetGoogleDriveUploadsUseCase` + `DeleteGoogleDriveUploadUseCase`: map
rows to an explicit typed response (no raw DB rows through `res.json`).
- `UploadController.getGoogleDriveUploads` + `.deleteGoogleDriveUpload`:
owner always from `res.locals`; URL/query/body never trusted for owner.
- Routes: `GET /api/upload/google_drive/mine` and `DELETE
/api/upload/google_drive/mine/:id` behind `RequireAuthentication`.
- Web: `useGoogleDriveUploads` hook, `GoogleDriveHistorySection`,
`GoogleDriveHistoryEntry`. Section hidden when empty; error state
matches VOICE.md. Icon `src` is sanitized to a small allowlist of Google
CDNs (`drive-thirdparty.googleusercontent.com`, `ssl.gstatic.com`,
`lh3.googleusercontent.com`) with `onError` swap to
`/icons/file-generic.svg`. The "Open in Drive" `href` is sanitized to
`drive.google.com` and `docs.google.com` https origins; any other URL
becomes a disabled "Link unavailable" span. Size formatter handles
Kanel's `string` type for `sizeBytes`.

**Open question resolutions:**
- Q1 (`url` safety): Trusted only after origin allowlist + `https:`
check at render time. No OAuth tokens have ever been stored in the `url`
column historically; the allowlist is defense in depth.
- Q2 (`owner` index): Added in the migration — missing on the original
2024 create migration.
- Q3 (paywall): History available to all authenticated users — matches
the Dropbox phase-1 decision so the Downloads page stays consistent.

## Measuring success
`GET /api/upload/google_drive/mine` returning 200s in server logs for
authenticated sessions. Secondary: return-upload rate among users with
`google_drive_uploads` rows shows +5pp within 4 weeks; ≥15% of section
viewers click "Open in Drive" in their first session (validation gate
for phase 2 / re-convert).

## Testing
- Server (Jest, all green): `GoogleDriveRepository.test.ts` (6 — owner
guards on get+delete, folder exclusion, last_converted_at ordering,
parameterized id with crafted string),
`GetGoogleDriveUploadsUseCase.test.ts` (3 — shape mapping that drops
owner/description/embedUrl, passthrough, empty),
`DeleteGoogleDriveUploadUseCase.test.ts` (3 — delegation, 404 on 0
deleted, success), `GoogleDriveController.test.ts` (8 — 401 guards, 400
on missing/invalid string id, 404 on missing row, 200 with clean id).
- Web (Vitest, all green): `useGoogleDriveUploads.test.tsx` (4 —
loading→empty, populated with hasMore, error path, deleteUpload by
string key). `DownloadsPage.test.tsx` updated to mock the new hook.
- `/check`: server tsc clean, web typecheck clean, 476 Vitest tests
pass, Biome lint clean.
- Full server suite: 1214 tests passing under `src/`.

## Risks
- `last_converted_at` is not yet reflected in Kanel-generated
`GoogleDriveUploads.ts`. The repository defines its own
`GoogleDriveUploadRow` type for the read path, and the UPDATE in
`saveFiles` uses `database.fn.now()` directly — no Kanel-typed write
field. `pnpm kanel` should be run on the dev DB after the migration
applies to refresh the type.
- Rollback: the migration has a clean `down` (drop index then column).
The API routes are additive — removing them is safe.

## Sonar
Sonar scanner not run locally (token not configured in this environment)
— flagging for reviewers so a Sonar bounce isn't a surprise.

## Goal alignment
Direct line to the 300K-user goal: re-conversions are the fastest way to
grow weekly conversions per user, and the Drive Picker step is the
dominant friction for returning users with embedded slides / Docs. The
feature is intentionally read-first so we can validate the click-through
gate before building the (harder) OAuth re-auth re-convert.

---

<details>
<summary>Trio synthesis (from original spec PR #2214)</summary>

- **PM:** Read-only history list with "Open in Drive" link; "Convert
again" deferred (OAuth token not stored); gate on ≥15% click-through
before building re-convert.
- **Designer:** Fourth section on `/downloads`, below "From Dropbox",
using a table layout (File / Size / Added / Actions) matching
FinishedJobs; hide when empty; sanitize `url` to Drive origins only.
- **Engineer:** S+ effort; string PK needs regex validation; upsert
semantics make `created_at` alone misleading — `last_converted_at` is
the right recency signal; `sizeBytes` arrives as string from Kanel.
- **Conflict:** PM suggested sorting by `lastEditedUtc DESC`; Engineer
chose `last_converted_at`. **Resolution:** `last_converted_at` — Drive's
edit time is the file owner's clock, not the user's conversion history.
- **Resolution:** PR 1 (this PR) ships read + delete +
`last_converted_at` migration. PR 2 (re-convert) deferred until phase-1
metrics clear the gate.
</details>

<!-- codesmith:footer -->
---
<a
href="https://app.blacksmith.sh/2anki/codesmith/server/pr/2305"><picture><source
media="(prefers-color-scheme: dark)"
srcset="https://pr-comments-assets.blacksmith.sh/codesmith/view-in-codesmith-dark.svg"><source
media="(prefers-color-scheme: light)"
srcset="https://pr-comments-assets.blacksmith.sh/codesmith/view-in-codesmith-light.svg"><img
alt="View in Codesmith"
src="https://pr-comments-assets.blacksmith.sh/codesmith/view-in-codesmith-dark.svg"></picture></a>
<sup>Need help on this PR? Tag <code>@codesmith</code> with what you
need.</sup>

- [ ] Let Codesmith autofix CI failures and bot reviews
<!-- /codesmith:footer -->

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant