
v0.2.0: Scoring engine, CAFA evaluation framework and UI overhaul #6

Merged: frapercan merged 18 commits into main from develop on Mar 16, 2026

@frapercan
Owner

Summary

  • Scoring engine: configurable multi-signal scoring combining embedding similarity, NW/SW alignment, taxonomic proximity and per-evidence-code quality weights (ScoringConfig model + /scoring router)
  • CAFA evaluation framework: full NK/LK/PK evaluation protocol with IA-weighted metrics, artifact ZIP download, scoring config integration and result management
  • New UI pages: /scoring (scoring config CRUD + presets) and /support (contact form)
  • Evaluation page overhaul: per-result metrics table (NK/LK/PK × BPO/MFO/CCO), job polling, IA-url management on ontology snapshots
  • ORM + migrations: ScoringConfig, SupportEntry, ia_url on OntologySnapshot, 6 new Alembic migrations
  • Fixes: NaN in limit-per-entry input; DELETE 204 response parsed as JSON

Commits

  • feat(orm): ScoringConfig, SupportEntry + migrations
  • feat(core): scoring engine + CAFA evaluation pipeline
  • feat(api): scoring and support routers
  • feat(frontend): scoring UI, support page, evaluation overhaul
  • docs: evaluation architecture + .gitignore ref_cache
  • chore: bump version to 0.2.0

Test plan

  • poetry run pytest passes
  • alembic upgrade head applies cleanly
  • Create ScoringConfig preset → run evaluation → Fmax table renders
  • Delete evaluation result → no JSON parse error

frapercan and others added 16 commits March 14, 2026 22:29
- Add 'migrate' one-shot service that runs alembic upgrade head before API starts
- Mount docker/init.sql to postgres initdb.d to enable pgvector extension automatically
- All workers and API depend on migrate completing successfully
- Add docker-compose.prod.yml: uses pre-built ghcr.io images, GPU support for worker-embeddings
- Rewrite deploy_vast.sh: sync compose files + docker compose pull + up (no rsync of source code)
- Migrations run automatically via the migrate service on every deploy
- ScoringConfig: reproducible scoring recipe with signal weights,
  formula type and optional per-evidence-code quality overrides
- EvaluationResult: add scoring_config_id FK and results JSONB column
- OntologySnapshot: add ia_url field for Information Accretion file
- SupportEntry: new model for user-facing support/contact entries
- 6 new Alembic migrations covering all model additions and indexes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scoring.py: configurable multi-signal scoring engine combining
  embedding similarity, NW/SW alignment identity, taxonomic proximity
  and per-evidence-code quality weights; supports weighted_avg and
  evidence_weighted formulas
- metrics.py: shared metric computation utilities
- evaluation.py: propagate scoring_config through evaluation runs,
  store per-namespace Fmax/precision/recall/coverage in results JSONB
- run_cafa_evaluation: download IA file from OntologySnapshot.ia_url,
  apply scoring config to generate CAFA-format prediction scores
- predict_go_terms: wire scoring_config_id into prediction batch payload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
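
To make the two formula names above concrete, here is a minimal sketch of how a multi-signal score could be combined. It is not the code in scoring.py: the class `SketchScoringConfig`, its field names, the signal keys and the choice to treat `evidence_weighted` as "weighted average multiplied by a per-evidence-code quality weight" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchScoringConfig:
    """Hypothetical stand-in for ScoringConfig; field names are assumptions."""

    weights: dict[str, float]
    evidence_weights: dict[str, float] = field(default_factory=dict)
    formula: str = "weighted_avg"


def weighted_avg(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean over the signals that are present and positively weighted."""
    used = {k: w for k, w in weights.items() if w > 0 and k in signals}
    total = sum(used.values())
    if total == 0:
        return 0.0
    return sum(signals[k] * w for k, w in used.items()) / total


def compute_score_sketch(
    signals: dict[str, float], evidence_code: str, config: SketchScoringConfig
) -> float:
    """Combine signals; optionally scale by an evidence-code quality weight."""
    base = weighted_avg(signals, config.weights)
    if config.formula == "evidence_weighted":
        # Assumption: unknown evidence codes keep the base score unchanged.
        base *= config.evidence_weights.get(evidence_code, 1.0)
    return base


signals = {"embedding": 0.8, "alignment": 0.6}
plain = SketchScoringConfig(weights={"embedding": 3.0, "alignment": 1.0})
scaled = SketchScoringConfig(
    weights={"embedding": 3.0, "alignment": 1.0},
    evidence_weights={"IEA": 0.5},
    formula="evidence_weighted",
)
score_plain = compute_score_sketch(signals, "IEA", plain)
score_scaled = compute_score_sketch(signals, "IEA", scaled)
```

Normalising by the sum of the weights actually used keeps the score in the same range as the input signals even when a signal (for example taxonomic proximity) is missing for a given prediction.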
- scoring.py: CRUD for ScoringConfig + preset factory + scored TSV
  streaming endpoint (/scoring/prediction-sets/{id}/score.tsv)
- support.py: contact/support entry submission endpoint
- annotations.py: evaluation-set DELETE cascade, result DELETE,
  artifacts ZIP download, IA-url PATCH on snapshots, ground-truth
  FASTA and TSV download endpoints
- embeddings.py: propagate scoring_config_id in prediction launch
- app.py: register scoring and support routers
- base_worker.py: surface scoring_config_id in worker dispatch
…verhaul

- New /scoring page: create/delete ScoringConfig with signal weights,
  evidence overrides, formula selector and preset loader
- New /support page: contact form backed by support router
- SupportButton + UsagePolicyModal components
- evaluation/page.tsx: full rewrite — per-result metrics table (NK/LK/PK
  × BPO/MFO/CCO), scoring config selector, artifact ZIP download,
  result polling after job submit, IA-url management on snapshots
- functional-annotation: scoring config selector in predict form,
  enriched per-protein prediction detail view
- annotations/page.tsx: IA-url PATCH UI on ontology snapshots
- NavLinks: add Scoring and Support entries
- lib/api.ts: ScoringConfig CRUD, scored TSV URL helper, IA-url PATCH,
  evaluation result DELETE and artifact endpoints
- fix: NaN in limit-per-entry input (guard parseInt)
- fix: DELETE 204 response parsed as JSON (use fetch directly)
- evaluation.rst: full CAFA evaluation protocol documentation
  (NK/LK/PK categories, scoring config integration, data model)
- operations.rst: document run_cafa_evaluation and scoring engine
- index.rst, core.rst: include new modules in autodoc
- .gitignore: exclude data/ref_cache/ (large .npy files) and test-results
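
For readers unfamiliar with the NK/LK/PK categories named above, the following sketch shows one common way CAFA-style benchmarks assign a protein to a category per namespace. The definitions used here (no prior knowledge, limited knowledge, partial knowledge) follow general CAFA usage and are an assumption about this project; evaluation.rst remains the authoritative description. The function name and input shape are hypothetical.

```python
NAMESPACES = ("BPO", "MFO", "CCO")


def classify_knowledge(before, after, namespace):
    """Classify one protein for one namespace.

    before/after map a namespace to the set of experimentally annotated
    GO terms at the earlier (t0) and later (t1) snapshot.
    Returns "NK", "LK", "PK", or None if no new terms were gained.
    """
    gained = after.get(namespace, set()) - before.get(namespace, set())
    if not gained:
        return None  # not a benchmark protein for this namespace
    if not any(before.get(ns) for ns in NAMESPACES):
        return "NK"  # no experimental annotation in any namespace at t0
    if not before.get(namespace):
        return "LK"  # annotated at t0, but only in other namespaces
    return "PK"  # already annotated in this namespace, gained new terms


before = {"MFO": {"GO:a"}}
after = {"MFO": {"GO:a", "GO:b"}, "BPO": {"GO:c"}}
categories = {ns: classify_knowledge(before, after, ns) for ns in NAMESPACES}
```

In this toy example the protein counts as PK for MFO, LK for BPO, and is excluded from the CCO benchmark.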

chore: add LICENSE, IA benchmark data, Playwright e2e scaffolding
Add test_metrics.py (15 tests) covering PRPoint, CAFAMetrics.summary,
compute_cafa_metrics validation and correctness to raise test coverage
above the 65% CI threshold.
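
As background for the metrics these tests exercise, here is a minimal sketch of a protein-centric Fmax computation. It is not compute_cafa_metrics: it ignores IA weighting and ontology propagation, and the function name and input shapes are assumptions for illustration only.

```python
def fmax_sketch(predictions, truth, thresholds=None):
    """Protein-centric Fmax.

    predictions: {protein: {term: score}} with scores in [0, 1]
    truth:       {protein: set of true terms}
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]
    # Proteins without any true term cannot contribute to recall.
    truth = {p: terms for p, terms in truth.items() if terms}
    if not truth:
        return 0.0
    best = 0.0
    for t in thresholds:
        precisions = []
        recall_sum = 0.0
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(hits / len(predicted))
            # Recall is averaged over all benchmark proteins.
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


predictions = {"P1": {"GO:1": 0.9, "GO:2": 0.4}, "P2": {"GO:3": 0.7}}
truth = {"P1": {"GO:1"}, "P2": {"GO:3", "GO:4"}}
result = fmax_sketch(predictions, truth)
```

With this toy input the best trade-off occurs for thresholds above 0.4 and up to 0.7, where precision is 1.0 and recall is 0.75, giving Fmax = 6/7.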
Add 5 new test files covering:
- scoring engine (evidence_weight, compute_score, score_predictions)
- CAFA evaluation dataclass and _get_descendants BFS
- GenerateEvaluationSetOperation payload validation and execute path
- scoring CRUD router (configs, presets, 404 preflight checks)
- support and maintenance routers (GET/POST support, vacuum sequences/embeddings)

Extend test_core.py with normalize, is_experimental and RetryLaterError tests.
Total: 399 passing, 3 skipped.
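
The BFS mentioned for _get_descendants can be illustrated with a short sketch. This is not the project's implementation; the function name, the `children` adjacency mapping and the decision to exclude the root from the result are assumptions.

```python
from collections import deque


def get_descendants_sketch(children, root):
    """Return every term reachable from root via child edges (root excluded).

    children maps a term to an iterable of its direct children.
    """
    seen = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            # The seen-set keeps shared children in a DAG from being revisited.
            if child not in seen and child != root:
                seen.add(child)
                queue.append(child)
    return seen


children = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
descendants_of_a = get_descendants_sketch(children, "A")
```

Because GO is a DAG rather than a tree, a term such as "D" here can be reached through several parents; tracking visited terms ensures it is enqueued only once.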
Join EmbeddingConfig, AnnotationSet and OntologySnapshot in the
list_prediction_sets query so each row returns model_name,
annotation source+version and OBO version instead of raw UUIDs.

Frontend consumes the new optional fields and falls back to shortId
when they are absent.
Workers are now started with setsid so each gets its own process group.
Stop/force-kill targets the whole group (kill -- -PID) to prevent
orphaned child processes. Grace period reduced from 60s to 5s.
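
The same process-group pattern can be sketched in Python for POSIX systems. The commit itself uses setsid and kill -- -PID; the helper names below are hypothetical, and only the 5-second grace period is taken from the commit message.

```python
import os
import signal
import subprocess
import sys


def start_worker(cmd):
    """Start cmd in a new session, so it leads its own process group."""
    # start_new_session=True makes the child call setsid() before exec.
    return subprocess.Popen(cmd, start_new_session=True)


def stop_worker(proc, grace=5.0):
    """Send SIGTERM to the whole group, then SIGKILL after the grace period."""
    pgid = proc.pid  # a session leader's process group id equals its pid
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return proc.wait()  # group is already gone
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        os.killpg(pgid, signal.SIGKILL)
        return proc.wait()


worker = start_worker([sys.executable, "-c", "import time; time.sleep(60)"])
return_code = stop_worker(worker)
```

Signalling the group rather than the single PID is what reaches any children the worker has spawned, which is the orphan problem the commit describes.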
Clean up unused imports and unsorted import blocks flagged by ruff
in scoring.py, metrics.py, core/scoring.py and orm models __init__.
Run ruff format on all Python and frontend package files.
No logic changes — pure whitespace, line-length and import ordering.
@codecov

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 47.80335% with 499 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.01%. Comparing base (e440a7f) to head (2eeb474).
⚠️ Report is 19 commits behind head on main.

Files with missing lines                               Patch %   Lines
protea/core/operations/predict_go_terms.py             11.36%    234 Missing ⚠️
protea/api/routers/annotations.py                      14.77%    75 Missing ⚠️
protea/core/operations/run_cafa_evaluation.py          8.92%     51 Missing ⚠️
protea/core/evaluation.py                              22.22%    42 Missing ⚠️
protea/api/routers/scoring.py                          78.26%    30 Missing ⚠️
protea/core/operations/compute_embeddings.py           52.77%    17 Missing ⚠️
protea/core/operations/load_quickgo_annotations.py     36.84%    12 Missing ⚠️
protea/api/routers/proteins.py                         0.00%     8 Missing ⚠️
protea/core/operations/load_ontology_snapshot.py       41.66%    7 Missing ⚠️
protea/workers/base_worker.py                          50.00%    6 Missing ⚠️
... and 8 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main       #6      +/-   ##
==========================================
- Coverage   65.81%   65.01%   -0.81%     
==========================================
  Files          49       55       +6     
  Lines        3873     4550     +677     
==========================================
+ Hits         2549     2958     +409     
- Misses       1324     1592     +268     


- Migrate all pages from app/ to app/[locale]/ (next-intl App Router)
- Add middleware, i18n/routing.ts, i18n/request.ts for locale detection
- Add LanguageSwitcher component with locale toggle
- Add message files for en, es, de, pt, zh (~21–23 KB each)
- Translate all hardcoded UI strings across 14 pages + 6 components:
  proteins, proteinDetail, annotations, embeddings, functionalAnnotation,
  evaluation, scoring, querySets, jobs, jobDetail, maintenance, support,
  statusBadge, eventTimeline, resetDbButton, supportButton, usagePolicyModal
- Remove dead app/layout.tsx and old app/ page stubs
- README: clarify Docker not yet validated; mark source install as recommended
@frapercan merged commit 0b52969 into main on Mar 16, 2026
8 checks passed