
feat: Google Docs/Sheets/Slides export + chip rail upload source picker#2309

Merged
aalemayhu merged 8 commits into
mainfrom
feat/google-docs-export
May 16, 2026


@aalemayhu aalemayhu commented May 16, 2026

What

Fixes native Google Apps files (Docs, Sheets, Slides) picked from the Drive picker — they previously returned 403 because the server called ?alt=media on files that only support /export. The upload source tab row is replaced with a chip rail below the dropzone so cloud sources are subordinate to the local upload surface.

Why

Closes the silent churn loop: a student picks their lecture notes from Drive, the conversion fails with "Error handling Google Drive files", and they never come back. The fix: branch on mime type server-side and call the correct export endpoint. No user-visible copy about the implementation detail — they picked a file, they get a deck.

How

Server (feat: export native Google Docs/Sheets/Slides via Drive export API)

  • createGoogleDriveDownloadLink.ts: adds NATIVE_GOOGLE_APPS_EXPORT_MIMES (single source of truth for mime → export mime + extension) and createGoogleDriveExportLink().
  • handleGoogleDrive.ts: resolveUrlAndName() branches per file — native Google Apps mime → export URL + .html/.csv/.pdf extension; binary files → existing ?alt=media path.
  • Export URLs use www.googleapis.com which is already on the instrumentedAxios allowlist.
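A minimal sketch of the two URL shapes, assuming the Drive v3 REST paths. The function names follow the file names above, but the bodies and the DRIVE_FILES_BASE constant are illustrative assumptions, not the code merged in this PR.

```typescript
// Sketch only: bodies are assumptions, not the merged implementation.
const DRIVE_FILES_BASE = "https://www.googleapis.com/drive/v3/files";

// Binary files (PDF, .zip, .docx): fetch the stored bytes directly.
function createGoogleDriveDownloadLink(fileId: string): string {
  return `${DRIVE_FILES_BASE}/${encodeURIComponent(fileId)}?alt=media`;
}

// Native Google Apps files have no stored bytes, so Drive must convert
// them to a concrete mime type through the /export endpoint.
function createGoogleDriveExportLink(fileId: string, exportMime: string): string {
  return `${DRIVE_FILES_BASE}/${encodeURIComponent(fileId)}/export?mimeType=${encodeURIComponent(exportMime)}`;
}
```

Both links stay on www.googleapis.com, which is why no allowlist change is needed.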

Frontend (style: replace upload source tabs with chip rail under dropzone)

  • New UploadSourceChips component: role="group" + aria-label="Other sources", chips always rendered (disabled when unconfigured, not hidden), toggle-to-deselect, aria-pressed.
  • UploadForm.tsx: chip rail always shown in idle state below the dropzone; cloud panels appear/disappear per chip selection.
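The toggle-to-deselect behavior can be modeled as a pure function, independent of React. This is a sketch under assumptions: the UploadSource values and the nextSource and chipProps helpers are hypothetical names chosen for illustration, not identifiers from the component.

```typescript
// Hypothetical model of the chip rail state; names are illustrative.
type UploadSource = "local" | "dropbox" | "google_drive";
type CloudSource = Exclude<UploadSource, "local">;

// Clicking the active chip returns to local upload; clicking a disabled
// (unconfigured) chip leaves the selection unchanged.
function nextSource(
  active: UploadSource,
  clicked: CloudSource,
  configured: ReadonlySet<CloudSource>
): UploadSource {
  if (!configured.has(clicked)) return active;
  return active === clicked ? "local" : clicked;
}

// Attributes a chip button would receive: always rendered, never hidden.
function chipProps(
  chip: CloudSource,
  active: UploadSource,
  configured: ReadonlySet<CloudSource>
): { "aria-pressed": boolean; disabled: boolean } {
  return { "aria-pressed": active === chip, disabled: !configured.has(chip) };
}
```

Keeping the transition in one function makes the "click again to go back to local" rule easy to unit test without rendering anything.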

Export mime targets (per trio resolution):

  • Google Docs → text/html + .html (best parser path, preserves heading/toggle structure)
  • Google Sheets → text/csv + .csv
  • Google Slides → application/pdf + .pdf
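The mapping above could be expressed roughly as follows. The three mime pairs come from this PR and the vnd.google-apps mime strings are the standard Drive ones; the Map structure, the PickedFile type, and the resolveExportTarget helper are assumptions made for the sketch.

```typescript
// Sketch only: the pairs are from the PR, the surrounding shapes are hypothetical.
type ExportTarget = { exportMime: string; extension: string };

const NATIVE_GOOGLE_APPS_EXPORT_MIMES = new Map<string, ExportTarget>([
  ["application/vnd.google-apps.document", { exportMime: "text/html", extension: ".html" }],
  ["application/vnd.google-apps.spreadsheet", { exportMime: "text/csv", extension: ".csv" }],
  ["application/vnd.google-apps.presentation", { exportMime: "application/pdf", extension: ".pdf" }],
]);

type PickedFile = { id: string; name: string; mimeType: string };

// Returns the export mime and a file name carrying the matching extension,
// or null for binary files, which keep the existing ?alt=media path.
function resolveExportTarget(file: PickedFile): { name: string; exportMime: string } | null {
  const target = NATIVE_GOOGLE_APPS_EXPORT_MIMES.get(file.mimeType);
  if (!target) return null;
  const hasExtension = file.name.toLowerCase().endsWith(target.extension);
  const name = hasExtension ? file.name : `${file.name}${target.extension}`;
  return { name, exportMime: target.exportMime };
}
```

Appending the extension is what lets the existing parser's file-type guards route the export without new parser logic.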

Measuring success

Server: the conversion_success rate among upload_started events that came through the Drive path. It was 0% for native Docs files and should rise to match the rate for binary Drive files.

Server-side logging: the spec calls for a gdoc_exported event, but it is not in this PR. It was deferred to avoid adding a separate logging path; the existing conversion_success event, fired via the same client path, is sufficient for now.

Testing

Server unit tests:

  • createGoogleDriveDownloadLink.test.ts — 8 tests: download URL shape, export URL encoding (HTML/CSV/PDF), mime map entries, folder not included
  • handleGoogleDrive.test.ts — 5 tests: binary PDF uses alt=media, Doc uses HTML export, Sheet uses CSV export, Slides uses PDF export, missing auth returns 400

Web unit tests:

  • UploadSourceChips.test.tsx — 10 tests: group role, labels, disabled state, toggle behavior, aria-pressed, null class check
  • UploadForm.test.tsx — updated 4 tests for chip-rail shape; all 494 web tests pass

All server tests in src/controllers/Upload/ pass (63 tests). Web: 494 tests pass. TypeScript: clean on both server and web. Biome lint: clean.

Riskiest assumption, not manually checked: HTML export quality was not validated against live Drive docs during this implementation (that would require real OAuth tokens). The parser already handles Notion HTML exports through the same pipeline. Because gdoc_exported is deferred, the conversion_success rate for the Drive path in production should show within the first week whether HTML exports produce usable decks. If HTML export produces empty decks at a high rate, the fallback is to change the NATIVE_GOOGLE_APPS_EXPORT_MIMES entry for .document to export PDF instead, a one-line change with no schema migration.

Changelog

{ type: 'feature', title: 'Google Docs, Sheets, and Slides from your Drive turn straight into decks', date: '2026-05-16' }

Risks

  • Chip rail regression: Dropbox panel visibility was previously behind anyRemoteSource check. Now chips always render in idle state. If both Drive and Dropbox env vars are missing, disabled chips are shown. This is per-spec (predictability over cleanliness) but is a behavior change for deployments without either service configured.
  • Toggle UX: clicking an active chip returns to local. This is new; previously you had to click "Your computer" tab. Low risk since the label/icon make the toggle intent clear.
  • Sonar: scanner is not installed locally; a bounce is possible on cognitive complexity or nested ternary. The resolveUrlAndName function and the chip toggle are both simple enough that I don't expect findings. If there is a bounce, it will be addressable without a product call.

Goal alignment

Fixes a conversion failure for a class of users (Drive-native Docs users) who tried the product and got an error. Removing that error removes a churn reason for the 300K-user path.

🤖 Generated with Claude Code



aalemayhu and others added 8 commits May 16, 2026 08:37
When a file with a native Google Apps mime type is picked from Drive,
branch on the mime type instead of calling alt=media (which returns 403).

- Docs → text/html with .html extension (best parser path)
- Sheets → text/csv with .csv extension
- Slides → application/pdf with .pdf extension
- Binary Drive files (PDF, .zip, .docx, etc.) are unchanged

NATIVE_GOOGLE_APPS_EXPORT_MIMES is the single source of truth for the
mime → (exportMime, extension) mapping. The extension override ensures
the existing parser's file-type guards route correctly without any new
parser logic.

instrumentedAxios + google_drive allowlist: no changes needed, the
export URL is on www.googleapis.com which is already allowlisted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
UploadSourceTabs (tablist) is replaced by UploadSourceChips (a chip
rail that sits below the full-bleed dropzone). The dropzone is always
the primary surface; Dropbox and Google Drive chips are subordinate.

- Chips render disabled when their env vars are missing (never hidden)
- Clicking an active chip toggles it off, returning to local upload
- role="group" + aria-label="Other sources" replaces tablist semantics
- aria-pressed tracks which chip is active
- CSS: outlined monochrome chips, wrap on narrow viewports, disabled
  opacity for unconfigured sources
- UploadForm updated: showChips is always true in idle state so chips
  are always present regardless of which sources are configured

UploadForm.test.tsx updated to find chips by aria-label instead of
role="tab" text content. New UploadSourceChips.test.tsx covers chip
rendering, disabled state, toggle behavior, and no-null-class check.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Google Picker API's DocsView supports setMimeTypes(string) to
restrict visible file types. The type interface was missing this method,
making it unavailable to future callers. No runtime behavior change —
the picker continues to show all non-folder file types by default,
including native Google Apps types (Docs, Sheets, Slides).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
User-visible feature shipped in this PR: native Google Apps files now
convert into decks. Per CLAUDE.md, user-visible feat: changes require
a changelog entry in the same PR.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Spec was previously untracked. Adding to track before removal in next commit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Feature is implemented. Per CLAUDE.md spec lifecycle, specs are deleted
once implemented. Recoverable via git log.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Remove @jest/globals import (jest globals are available without it per project convention)
- Add rotation/rotationDegree fields to match full GoogleDriveFile type
- Use local FakeRes type to avoid express.Response generic issues
- Cast FakeRes to express.Response at call sites to satisfy handler types

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- UploadSourceChips: mark props Readonly<Props> (Sonar react/type-dependent)
- UploadSourceChips: replace role="group" div with semantic <fieldset>+<legend>;
  the "Or pick from:" prose becomes the legend, doubling as the visible label
  (Sonar accessibility/react)
- handleGoogleDrive: log error in the catch before sending 400 so failures
  are diagnosable in prod logs (Sonar error-handling)
- Update CSS to neutralize fieldset/legend default styling
- Update the test to assert on <fieldset>/<legend> instead of [role=group]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aalemayhu aalemayhu merged commit 1816c7d into main May 16, 2026
9 checks passed
@aalemayhu aalemayhu deleted the feat/google-docs-export branch May 16, 2026 12:11

aalemayhu added a commit that referenced this pull request May 16, 2026
#2310)

## What

Three bugs in `handleGoogleDrive.ts` that combined to break the new
native-Doc export path shipped in #2309. Mirrors the working
`handleDropbox.ts` pattern.

| Was | Now |
|---|---|
| `responseType: 'blob'` | `responseType: 'arraybuffer'` |
| `buffer: contents.data` | `buffer: Buffer.from(contents.data)` |
| `size: file.sizeBytes` | `size: buffer.length` |

## Why

A user picked a Google Doc named "2anki" after #2309 deployed. The
server returned 400 with `"2anki.html" appears to be empty. Please
re-export your file and try again.`

Root cause: the picker reports `sizeBytes: 0` for native Google Docs
(they have no fixed byte size). `getUploadValidationError` checks
`file.size === 0` and rejects before the parser sees the body. Even if
size had been right, `responseType: 'blob'` is not valid in Node axios —
it would have coerced the `text/html` body to a string, and the parser
would have read garbage.

Pre-existing binary Drive picks worked because the picker reports a
non-zero `sizeBytes` for binary files. The invalid `'blob'` value
happened to work well enough for binary content too.
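
A minimal sketch of the failure mode, assuming the validator has roughly this shape. Only the function name and the error text come from this PR; the `UploadFile` type and the body are hypothetical.

```typescript
// Hypothetical shape: only the name and the message are from the PR.
type UploadFile = { originalname: string; size: number };

function getUploadValidationError(file: UploadFile): string | null {
  // Rejects on the reported size alone, before the parser sees the body.
  if (file.size === 0) {
    return `"${file.originalname}" appears to be empty. Please re-export your file and try again.`;
  }
  return null;
}

// The picker reports sizeBytes: 0 for a native Doc, so passing that value
// through as `size` fails validation even when the export body is non-empty.
const pickedDoc = { name: "2anki.html", sizeBytes: 0 };
const validationError = getUploadValidationError({
  originalname: pickedDoc.name,
  size: pickedDoc.sizeBytes,
});
```

Here `validationError` is the "appears to be empty" message, which matches the 400 the user saw.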

## How

Match the Dropbox pattern exactly. `Buffer.from(arrayBuffer)` produces a
real Node Buffer regardless of response Content-Type; `buffer.length` is
correct for all paths.
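
The fix can be sketched as below, assuming the response body arrives as an `ArrayBuffer`. The `DownstreamFile` type and `toDownstreamFile` helper are hypothetical names for illustration; `Buffer.from` and `buffer.length` are the only parts taken from this change.

```typescript
// Hypothetical shape of the file object handed to the parser.
type DownstreamFile = { originalname: string; buffer: Buffer; size: number };

function toDownstreamFile(name: string, data: ArrayBuffer): DownstreamFile {
  // Wrap the raw bytes in a real Node Buffer, whatever the Content-Type was.
  const buffer = Buffer.from(data);
  // Derive size from the bytes actually received, not from picker metadata.
  return { originalname: name, buffer, size: buffer.length };
}
```

Deriving `size` from the buffer means a native Doc with `sizeBytes: 0` in the picker still reports its true exported length downstream.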

## Testing

- `handleGoogleDrive.test.ts` (now 7 tests, was 5):
  - New: zero-`sizeBytes` native Doc produces non-zero size + real Buffer
    on the downstream file.
  - New: asserts `responseType: 'arraybuffer'` is in the request config.
  - Existing 5 tests still pass with no edits.

## Risks

- The same axios call is used for the pre-existing binary path.
`Buffer.from(ArrayBuffer)` is a safe wrap for any binary content — no
regression risk for PDFs etc.
- No changelog entry: the entry from #2309 ("Google Docs, Sheets, and
Slides from your Drive turn straight into decks") becomes accurate once
this lands.

## Goal alignment

Simpler/faster/more beautiful: yes — the "appears to be empty" error was
the opposite of "drop something in, get a clean deck back". Toward
scale: yes — Google Docs is the largest pool of cloud-native notes after
PDFs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)


Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>