Feat/fasta prep backend #248
Draft
t03i wants to merge 30 commits into
Also fixes a pre-existing knip failure caused by playwright@1.57.0 crashing under jiti when the Playwright plugin loaded app/tests/playwright.config.ts. Disable the plugin for the app workspace (specs are already captured via the entry glob) and add a comment explaining why.
…ment Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…SE event queues Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds pipeline.py with run_protspace_prepare(), which launches the protspace CLI as an async subprocess, parses stderr for stage transitions (embedding, projecting, annotating, bundling), enforces a configurable timeout, and raises PipelineFailure on non-zero exit or missing bundle output. Also provides cleanup_job_dir() for the TTL sweeper. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
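The subprocess handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in pipeline.py: `run_step`, its parameters, and the `demo` helper are hypothetical, `PipelineFailure` is redefined here as a stand-in, and a small Python child process takes the place of the protspace CLI. Stage detection and the bundle-output check are left out.

```python
import asyncio
import sys
from collections import deque


class PipelineFailure(RuntimeError):
    """Stand-in: raised when a step exits non-zero or exceeds its time budget."""


async def run_step(argv: list[str], timeout_s: float, tail_lines: int = 50) -> list[str]:
    """Run one command, drain its stderr, and return the last stderr lines."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.PIPE,
    )
    tail: deque[str] = deque(maxlen=tail_lines)

    async def drain_stderr() -> None:
        # Reading every line keeps the child from blocking on a full pipe.
        async for raw in proc.stderr:
            tail.append(raw.decode("utf-8", errors="replace").rstrip())

    try:
        await asyncio.wait_for(asyncio.gather(drain_stderr(), proc.wait()), timeout_s)
    except asyncio.TimeoutError:
        raise PipelineFailure(f"step timed out after {timeout_s} s") from None
    finally:
        # Covers both timeout and cancellation: never leave the child running.
        if proc.returncode is None:
            proc.kill()
            await proc.wait()
    if proc.returncode != 0:
        raise PipelineFailure("\n".join(tail) or f"exit code {proc.returncode}")
    return list(tail)


async def demo() -> list[str]:
    # Stand-in child process; the real pipeline would invoke the protspace CLI.
    code = "import sys; print('stage: embedding', file=sys.stderr)"
    return await run_step([sys.executable, "-c", code], timeout_s=30)
```

Calling `asyncio.run(demo())` returns the captured stderr lines. A failing child surfaces as `PipelineFailure` carrying the stderr tail.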
Adds api.py with POST /api/prepare, GET /api/prepare/{id}/events (SSE),
and GET /api/prepare/{id}/bundle. Updates app.py to accept an injectable
pipeline, fixes late-subscriber path in jobs.py to always synthesize a
queued event before replaying the terminal event, and hardens conftest.py
to set PREP_JOB_ROOT before module-level create_app() runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements isFastaFile helper and prepareFastaBundle that uploads a FASTA file, streams SSE progress events, and resolves with a .parquetbundle File. Uses @public JSDoc tags on FastaPrepStage/FastaPrepOptions so knip recognises them as intentional public API ahead of Task 10 wiring. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndle path intact Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… races)
- Fix 1: sanitize Content-Disposition filename via _safe_download_name() to prevent header injection from hostile original_name values
- Fix 2: add SSE keep-alive comment frames every _KEEPALIVE_INTERVAL_SECONDS (15 s default, monkeypatchable) using asyncio.wait_for on the subscribe iter
- Fix 3: register subscriber queue BEFORE yielding the synthetic queued event so terminal events published during the yield cannot be missed
- Fix 4: send None sentinel to all live subscriber queues in sweep_expired() before popping _subscribers, preventing indefinite hangs
- Fix 5: catch asyncio.CancelledError in _run(), publish error event, set ERROR status, then re-raise so cancellation propagates cleanly
- Fix 6: use peek_bundle/mark_consumed split so consumed flag is only set after a successful path.read_bytes(); OSError surfaces as HTTP 500
- Fix 7: register atexit handler in conftest.py to clean up the mkdtemp dir after the test session (previously leaked on every run)
New tests: 14 added (47 total, was 33). All pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
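As an illustration of Fix 1, one plausible way to sanitize an untrusted upload name before it reaches a Content-Disposition header is sketched below. The body of the real `_safe_download_name()` is not shown in this conversation, so the function name `safe_download_name`, the allowed-character set, the 100-character cap, and the `dataset` fallback are all assumptions.

```python
import re
from pathlib import PurePosixPath

# Hypothetical whitelist: ASCII letters, digits, dot, dash, underscore.
_UNSAFE = re.compile(r"[^A-Za-z0-9._-]+")


def safe_download_name(original_name: str, fallback: str = "dataset") -> str:
    """Return a header-safe stem derived from an untrusted upload name."""
    # Drop directory components, treating both / and \ as separators.
    base = PurePosixPath(original_name.replace("\\", "/")).name
    # Remove the last extension; the caller appends its own suffix.
    stem = base.rsplit(".", 1)[0] if "." in base else base
    # Collapse each run of disallowed characters (CR, LF, quotes, ...) to "_".
    cleaned = _UNSAFE.sub("_", stem).strip("._-")
    return cleaned[:100] or fallback


def content_disposition(original_name: str) -> str:
    """Build an attachment header value for the generated bundle."""
    return f'attachment; filename="{safe_download_name(original_name)}.parquetbundle"'
```

Because CR, LF, and double quotes never survive the whitelist, a hostile `original_name` cannot terminate the quoted filename or start a new header line.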
Extract abort handler to a named function so cleanup() can call removeEventListener, preventing listener accumulation when the same AbortController is reused across multiple prepareFastaBundle calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous keepalive loop wrapped each `aiter.__anext__()` in `asyncio.wait_for`, which cancels the inner coroutine on timeout. That cancellation exhausted the underlying async generator, so the first keepalive frame silently truncated the stream and the EventSource client fired an error event surfacing as "Bundle preparation failed." Hold a single in-flight `__anext__()` task across keepalive ticks via `asyncio.shield`, only creating a new one once an event has been delivered. Regression test now asserts the stream keeps flowing past the keepalive boundary and still delivers `event: done`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
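The pattern described above can be sketched as follows. This is a simplified illustration under assumptions, not the code from this PR: `with_keepalive` and `KEEPALIVE_FRAME` are hypothetical names, and the event source is any async iterator of already-formatted SSE frames.

```python
import asyncio
from collections.abc import AsyncIterator

KEEPALIVE_FRAME = ": keep-alive\n\n"  # SSE comment frame, ignored by EventSource


async def with_keepalive(
    events: AsyncIterator[str], interval: float
) -> AsyncIterator[str]:
    """Yield frames from `events`, emitting a keep-alive during quiet gaps."""
    aiter = events.__aiter__()
    pending = None  # the single in-flight __anext__() task
    try:
        while True:
            if pending is None:
                # Only start a new __anext__() once the previous one delivered.
                pending = asyncio.ensure_future(aiter.__anext__())
            try:
                # shield() keeps the timeout from cancelling `pending`.
                frame = await asyncio.wait_for(asyncio.shield(pending), interval)
            except asyncio.TimeoutError:
                yield KEEPALIVE_FRAME
                continue
            except StopAsyncIteration:
                return
            pending = None
            yield frame
    finally:
        # If the client disconnects mid-wait, do not leak the in-flight task.
        if pending is not None and not pending.done():
            pending.cancel()
```

On timeout, `wait_for` cancels only the shield wrapper, so the underlying generator is never cancelled and the next loop iteration keeps awaiting the same task.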
Replace the single `protspace prepare` invocation with explicit calls to `protspace embed`, `annotate`, `project`, and `bundle`. Embed (Biocentral) and annotate (UniProt) are network-bound and independent, so they run concurrently inside an `asyncio.TaskGroup`; project and bundle run sequentially afterward. The whole run shares a single wall-clock budget via `asyncio.timeout` so the SSE contract still has a deterministic upper bound. Stage events are now driven by the pipeline orchestrator rather than parsed out of stderr, so the regex-based stage detector is gone. Each step's stderr is still drained line-by-line (last 50 lines kept for failure messages) so subprocesses never block on a full pipe, and cancellation kills the subprocess before propagating. Tests cover the success path, parallel execution of embed+annotate, per-step failure surfaces, the missing-bundle and missing-H5 sentinels, and timeout-driven subprocess kill. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors the prod topology where the SPA and prep backend are hosted on separate origins. The new compose service builds a custom Caddy image with caddy-ratelimit baked in, fronts protspace-prep on http://localhost:9090, and applies CORS headers, an OPTIONS preflight short-circuit, a 9 MB submit body cap, and a 5-per-15min submit rate limit so dev behavior matches what users will hit in prod. Also adds PREP_SEQUENCE_MIN_COUNT=20 to the prep service env so the floor enforced by the validator is configured at the deployment layer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a consumer-supplied loadFromFileHandler rejected, the rejection propagated to the caller but the data-loader never updated its `error` property or fired `data-error`, so listeners (the explore runtime in particular) had no signal to drop the loading overlay. Catch handler errors at the boundary, set `this.error`, and dispatch `data-error` with the original Error so existing listeners can branch on `originalError.name === 'AbortError'` cleanly. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prep submit path returned bare "Upload failed (HTTP 429)" messages on rate-limit, oversize, and backend-unavailable responses. Map 429, 413, 503, and 504 to user-readable strings, parse Retry-After (seconds or HTTP date) into a "try again in N minutes" hint, and fall back to a generic but still helpful message when the header is missing or the body is non-JSON (e.g. Caddy's plain-text 429). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
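The Retry-After handling can be sketched as below. The PR implements this in the TypeScript client, which is not shown here, so this Python version only illustrates the parsing logic. The function names and the message wording are hypothetical.

```python
import math
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_minutes(header: Optional[str], now: Optional[datetime] = None) -> Optional[int]:
    """Parse Retry-After (delta-seconds or HTTP-date) into whole minutes, rounded up."""
    if not header:
        return None
    value = header.strip()
    if value.isascii() and value.isdigit():
        seconds = float(value)
    else:
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None  # neither a number nor a parseable date
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        seconds = (when - (now or datetime.now(timezone.utc))).total_seconds()
    if seconds <= 0:
        return None
    return max(1, math.ceil(seconds / 60))


def rate_limit_message(header: Optional[str]) -> str:
    """Hypothetical wording for an HTTP 429 response."""
    minutes = retry_after_minutes(header)
    if minutes is None:
        return "Too many uploads. Please try again later."
    unit = "minute" if minutes == 1 else "minutes"
    return f"Too many uploads. Please try again in {minutes} {unit}."
```

Returning `None` for a missing, malformed, or already-elapsed value lets the caller fall back to the generic message.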
Wires an AbortController through prepareFastaBundle and renders a Cancel button on the loading overlay while the prep job runs. The dataset controller's data-error handler now special-cases AbortError so a user cancel resolves the load queue cleanly instead of surfacing as a toast/error UI. The button is removed once the bundle handoff completes or the prep call rejects. The runtime now also reads VITE_PREP_API_BASE so the SPA can target a separate backend origin (the new Caddy in front of protspace-prep) in both dev and prod, falling back to same-origin when unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ject Splits the mocked end-to-end into two focused tests: one that exercises the new Cancel button (asserting the bundle is never fetched and the overlay closes) and one that completes the prep flow against the mocked backend. Adds a `fasta-prep-live` playwright project that drives a real Caddy + protspace-prep + Biocentral round-trip using a small fixture FASTA, with a 6-minute timeout for cold starts. Playwright baseURL now reads PLAYWRIGHT_BASE_URL so the live project can target the dev origin (default localhost:8080) without editing the config. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Type-check, knip, and the docs build still run on every commit; the test suite now runs in CI only. Keeping `test:ci` in the precommit hook made every commit a multi-minute wait, which encouraged --no-verify detours that defeat the rest of the gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
embed/project keyed projections_data.identifier by the raw FASTA header (sp|P12345|NAME_HUMAN) while annotate ran the same header through parse_identifier (P12345). The frontend bundle join in data-loader/utils/bundle.ts joins on projection.identifier, so for any UniProt FASTA every lookup missed and annotations silently dropped. Run both subprocesses against an input.normalized.fasta whose headers are already passed through protspace's parse_identifier, so both downstream tables agree on a single key. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
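To make the key mismatch concrete, here is a minimal sketch of the header normalization. The `parse_identifier` below is only a stand-in that handles the `sp|ACCESSION|NAME` and `tr|ACCESSION|NAME` shapes; protspace's real function may behave differently, and `normalize_fasta` is a hypothetical helper name.

```python
def parse_identifier(header: str) -> str:
    """Stand-in for protspace's parse_identifier (assumed behaviour only)."""
    fields = header.split()
    token = fields[0] if fields else header
    parts = token.split("|")
    # UniProt style: sp|P12345|NAME_HUMAN -> P12345
    if len(parts) >= 3 and parts[0] in {"sp", "tr"}:
        return parts[1]
    return token


def normalize_fasta(text: str) -> str:
    """Rewrite every header line so all downstream steps see the same key."""
    out = []
    for line in text.splitlines():
        if line.startswith(">"):
            out.append(">" + parse_identifier(line[1:]))
        else:
            out.append(line)  # sequence lines pass through unchanged
    return "\n".join(out) + "\n"
```

With this, `>sp|P12345|NAME_HUMAN Some protein` becomes `>P12345`, so the projection and annotation tables share one identifier.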
Progress was hard-coded to 25% for every onProgress event, so the bar froze for the entire pipeline. Map each stage (queued/embedding/annotating/projecting/bundling) to its own percentage and clamp with Math.max so out-of-order events can never roll the bar backwards. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
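The clamping logic is small enough to sketch. The frontend does this in TypeScript with `Math.max`, so the Python below only mirrors the idea. The percentages and the `ProgressTracker` name are hypothetical.

```python
# Hypothetical stage-to-percentage table; the real values are not shown here.
STAGE_PERCENT = {
    "queued": 5,
    "embedding": 20,
    "annotating": 45,
    "projecting": 70,
    "bundling": 90,
}


class ProgressTracker:
    """Tracks a progress percentage that never decreases."""

    def __init__(self) -> None:
        self.percent = 0

    def update(self, stage: str) -> int:
        # Unknown stages leave the bar where it is; known stages can only raise it.
        target = STAGE_PERCENT.get(stage, self.percent)
        self.percent = max(self.percent, target)
        return self.percent
```

Since embed and annotate run concurrently, an `annotating` event can arrive after `projecting`. The `max` keeps the bar at the higher value in that case.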
Collaborator
Author
1 min is currently absolutely unrealistic. For 840 sequences, we're breaching 3:30, likely because of Biocentral. Some more profiling is required.
Backend:
- Stream bundle via FileResponse + BackgroundTask (no full read into memory)
- ExceptionGroup handling joins all PipelineFailure messages instead of dropping siblings
- Switch JobState timestamps to time.time() to match sweep's mtime check
- FastaValidationError exception handler collapses three 400 blocks
- Drop dead consume_bundle, cleanup_job_dir, and # Fix N: markers
- Extract _force_put helper for the queue drop-oldest pattern
- functools.partial replaces _default_pipeline closure
- Misc: BOM escape, named nucleotide threshold, encoding="utf-8"
Caddy/Docker:
- Caddyfile.example: handle_path -> handle (was 404'ing every request)
- Extract (prep_backend) snippet to deduplicate dev/example
- Drop duplicate HEALTHCHECK from Dockerfile (compose owns it)
- Switch base image to ghcr.io/astral-sh/uv:python3.12-bookworm-slim
Frontend:
- loading-overlay scopes #progress-* lookups to the overlay element
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
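The "queue drop-oldest pattern" mentioned in the backend list can be sketched as follows, assuming a bounded `asyncio.Queue` per subscriber. The real `_force_put` is not shown in this conversation, so `force_put` here is a hypothetical reconstruction of the idea.

```python
import asyncio
from typing import Any


def force_put(queue: asyncio.Queue, item: Any) -> None:
    """Enqueue without blocking; when the queue is full, discard the oldest item."""
    while True:
        try:
            queue.put_nowait(item)
            return
        except asyncio.QueueFull:
            try:
                queue.get_nowait()  # drop the oldest entry to make room
            except asyncio.QueueEmpty:
                pass  # a consumer drained it first; just retry the put


async def demo() -> list[str]:
    q: asyncio.Queue = asyncio.Queue(maxsize=2)
    for event in ("queued", "embedding", "done"):
        force_put(q, event)
    return [q.get_nowait(), q.get_nowait()]
```

The trade-off is that a slow subscriber loses its oldest progress events but still receives the newest ones, including the terminal event.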
- Add e2e-tests job that auto-starts the dev server via Playwright's webServer block and uploads the HTML report on failure.
- Drop branches:['**'] push trigger so feature pushes don't run twice (once for push, once for the PR).
- Gate fasta-prep-live behind RUN_LIVE_E2E so the default e2e run doesn't try to hit the real prep backend.
The superpowers/ subtree holds local planning/spec notes that aren't part of the user-facing docs site. Untracking + srcExclude keeps these files local without breaking the docs:build pre-commit hook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… handling
Frontend:
- Estimate embedding time from sequence count and surface it as a sub-message.
- Smooth progress with an asymptotic creep between embedding/projecting stages.
- Show queue position when the job is waiting for a slot.
- Display a persistent "Got a larger dataset?" overlay note linking to the Colab notebook so users have a fallback when the lab service is busy or down.
- Wrap submit/SSE/download failures in a typed FastaPrepError that carries an optional server-supplied error code.
Backend:
- Tag the queued event with queue_position and running counts so the UI can show "Position N in queue" instead of a blank wait.
- Propagate an optional code on PipelineFailure into the SSE error payload.
- Classify Biocentral connection / 503 failures as BIOCENTRAL_UNAVAILABLE with a friendlier user-facing message that points at the Colab fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- docker-compose.prod.yml: pin protspace-prep + caddy-ratelimit to GHCR images via PREP_TAG / CADDY_TAG, override compose to listen on 0.0.0.0:8080, and pass CORS_ALLOWED_ORIGIN through to Caddy.
- config/Caddyfile.prod: rate limit + 9 MB body cap on POST /api/prepare, CORS for the configured SPA origin, /healthz endpoint. The lab edge gateway terminates TLS upstream; this Caddy listens on plain HTTP.
- scripts/deploy-vm.sh + update-vm.sh: first-time deploy and routine update helpers driven by .env.
- .github/workflows/publish-images.yml: build and push protspace-prep and caddy-ratelimit images to GHCR on main, tags, and PRs touching the prep service or Caddy Dockerfile.
- Split Playwright e2e off the main CI workflow into a scheduled + label-gated workflow (run-e2e label or manual dispatch) so PR CI stays fast.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Collaborator
Author
Changed e2e to only run nightly/with label.
Owner
Review notes
Likely bugs
Colab fallback UX
Dead / inconsistent config
Worth a TODO, not a blocker
Smaller
TLDR
MVP backend implementation.
Description
Architecture
SLO:
Closes: #236