v0.2.0: Scoring engine, CAFA evaluation framework and UI overhaul by frapercan · Pull Request #6 · frapercan/PROTEA

frapercan · 2026-03-16T21:14:26Z

Summary

Scoring engine: configurable multi-signal scoring combining embedding similarity, NW/SW alignment, taxonomic proximity and per-evidence-code quality weights (ScoringConfig model + /scoring router)
CAFA evaluation framework: full NK/LK/PK evaluation protocol with IA-weighted metrics, artifact ZIP download, scoring config integration and result management
New UI pages: /scoring (scoring config CRUD + presets) and /support (contact form)
Evaluation page overhaul: per-result metrics table (NK/LK/PK × BPO/MFO/CCO), job polling, IA-url management on ontology snapshots
ORM + migrations: ScoringConfig, SupportEntry, ia_url on OntologySnapshot, 6 new Alembic migrations
Fixes: NaN in limit-per-entry input; DELETE 204 response parsed as JSON

Commits

feat(orm): ScoringConfig, SupportEntry + migrations
feat(core): scoring engine + CAFA evaluation pipeline
feat(api): scoring and support routers
feat(frontend): scoring UI, support page, evaluation overhaul
docs: evaluation architecture + .gitignore ref_cache
chore: bump version to 0.2.0

Test plan

poetry run pytest passes
alembic upgrade head applies cleanly
Create ScoringConfig preset → run evaluation → Fmax table renders
Delete evaluation result → no JSON parse error

- Add 'migrate' one-shot service that runs alembic upgrade head before API starts
- Mount docker/init.sql to postgres initdb.d to enable pgvector extension automatically
- All workers and API depend on migrate completing successfully

- Add docker-compose.prod.yml: uses pre-built ghcr.io images, GPU support for worker-embeddings
- Rewrite deploy_vast.sh: sync compose files + docker compose pull + up (no rsync of source code)
- Migrations run automatically via the migrate service on every deploy

- ScoringConfig: reproducible scoring recipe with signal weights, formula type and optional per-evidence-code quality overrides
- EvaluationResult: add scoring_config_id FK and results JSONB column
- OntologySnapshot: add ia_url field for Information Accretion file
- SupportEntry: new model for user-facing support/contact entries
- 6 new Alembic migrations covering all model additions and indexes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
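
To make the idea of a "reproducible scoring recipe" concrete, here is a minimal sketch of how such a recipe could be modelled in plain Python. The class name `ScoringRecipe`, its field names and the validation rules are assumptions for illustration only; they are not the actual ORM columns of ScoringConfig.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Mapping

# Formula names taken from the PR text; everything else here is hypothetical.
ALLOWED_FORMULAS = ("weighted_avg", "evidence_weighted")


@dataclass(frozen=True)
class ScoringRecipe:
    """Hypothetical stand-in for ScoringConfig (field names are assumed)."""

    signal_weights: Mapping[str, float]
    formula: str = "weighted_avg"
    evidence_overrides: Mapping[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject recipes that could not be applied consistently later.
        if self.formula not in ALLOWED_FORMULAS:
            raise ValueError(f"unknown formula: {self.formula!r}")
        if any(w < 0 for w in self.signal_weights.values()):
            raise ValueError("signal weights must be non-negative")

    def evidence_quality(self, code: str, default: float = 1.0) -> float:
        # Per-evidence-code override if present, otherwise the default quality.
        return self.evidence_overrides.get(code, default)
```

Freezing the dataclass keeps a recipe immutable once created, which is one simple way to keep evaluation runs that reference it comparable.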

- scoring.py: configurable multi-signal scoring engine combining embedding similarity, NW/SW alignment identity, taxonomic proximity and per-evidence-code quality weights; supports weighted_avg and evidence_weighted formulas
- metrics.py: shared metric computation utilities
- evaluation.py: propagate scoring_config through evaluation runs, store per-namespace Fmax/precision/recall/coverage in results JSONB
- run_cafa_evaluation: download IA file from OntologySnapshot.ia_url, apply scoring config to generate CAFA-format prediction scores
- predict_go_terms: wire scoring_config_id into prediction batch payload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
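
For readers unfamiliar with multi-signal scoring, the two formula names above could work roughly as follows. This is a sketch under stated assumptions: the signal keys, the renormalisation over present signals, and the choice to multiply by the evidence quality are all guesses, not the behaviour of the real scoring.py.

```python
from __future__ import annotations


def weighted_avg(signals: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean over signals that are present and have a positive weight."""
    used = {name: w for name, w in weights.items() if name in signals and w > 0}
    total = sum(used.values())
    if total == 0:
        return 0.0  # no usable signal: fall back to a neutral score
    return sum(signals[name] * w for name, w in used.items()) / total


def evidence_weighted(
    signals: dict[str, float],
    weights: dict[str, float],
    evidence_quality: float,
) -> float:
    """Assumed variant: scale the weighted mean by an evidence quality in [0, 1]."""
    return weighted_avg(signals, weights) * evidence_quality
```

Renormalising over the signals that are actually present means a prediction without, say, an alignment score is not penalised for the missing signal; whether PROTEA does this is not stated in the PR.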

- scoring.py: CRUD for ScoringConfig + preset factory + scored TSV streaming endpoint (/scoring/prediction-sets/{id}/score.tsv)
- support.py: contact/support entry submission endpoint
- annotations.py: evaluation-set DELETE cascade, result DELETE, artifacts ZIP download, IA-url PATCH on snapshots, ground-truth FASTA and TSV download endpoints
- embeddings.py: propagate scoring_config_id in prediction launch
- app.py: register scoring and support routers
- base_worker.py: surface scoring_config_id in worker dispatch

…verhaul

- New /scoring page: create/delete ScoringConfig with signal weights, evidence overrides, formula selector and preset loader
- New /support page: contact form backed by support router
- SupportButton + UsagePolicyModal components
- evaluation/page.tsx: full rewrite: per-result metrics table (NK/LK/PK × BPO/MFO/CCO), scoring config selector, artifact ZIP download, result polling after job submit, IA-url management on snapshots
- functional-annotation: scoring config selector in predict form, enriched per-protein prediction detail view
- annotations/page.tsx: IA-url PATCH UI on ontology snapshots
- NavLinks: add Scoring and Support entries
- lib/api.ts: ScoringConfig CRUD, scored TSV URL helper, IA-url PATCH, evaluation result DELETE and artifact endpoints
- fix: NaN in limit-per-entry input (guard parseInt)
- fix: DELETE 204 response parsed as JSON (use fetch directly)

- evaluation.rst: full CAFA evaluation protocol documentation (NK/LK/PK categories, scoring config integration, data model)
- operations.rst: document run_cafa_evaluation and scoring engine
- index.rst, core.rst: include new modules in autodoc
- .gitignore: exclude data/ref_cache/ (large .npy files) and test-results

chore: add LICENSE, IA benchmark data, Playwright e2e scaffolding
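
The NK/LK/PK categories mentioned above can be sketched as a small classifier. The definitions used here follow the common reading of the CAFA protocol (no knowledge, limited knowledge, partial knowledge) and are an assumption; the function name, the input shape and the exact rules in PROTEA's evaluation.rst may differ.

```python
from __future__ import annotations

NAMESPACES = ("BPO", "MFO", "CCO")


def knowledge_category(
    before: dict[str, set[str]],
    after: dict[str, set[str]],
    namespace: str,
) -> str | None:
    """Classify one protein in one namespace as "NK", "LK" or "PK".

    `before` and `after` map namespace -> experimentally annotated GO terms at
    the two snapshot times. Returns None when nothing new was gained in
    `namespace`, i.e. the protein is not a benchmark target there.
    """
    gained = after.get(namespace, set()) - before.get(namespace, set())
    if not gained:
        return None
    if before.get(namespace):
        return "PK"  # already annotated in this namespace, gained more terms
    if any(before.get(ns) for ns in NAMESPACES if ns != namespace):
        return "LK"  # annotated elsewhere, but new to this namespace
    return "NK"  # no prior experimental annotation in any namespace
```

Applying this per protein and per namespace yields the NK/LK/PK × BPO/MFO/CCO grid that the evaluation page reports.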

Add test_metrics.py (15 tests) covering PRPoint, CAFAMetrics.summary, compute_cafa_metrics validation and correctness to raise test coverage above the 65% CI threshold.
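
For context on what these metric tests exercise, below is a minimal sketch of a protein-centric Fmax sweep. It is unweighted (no IA weights), and the function name `fmax`, the input shapes and the fixed threshold grid are assumptions; it is not the signature of compute_cafa_metrics.

```python
from __future__ import annotations


def fmax(
    predictions: dict[str, dict[str, float]],
    truth: dict[str, set[str]],
    steps: int = 100,
) -> tuple[float, float]:
    """Return (best F1, threshold) over thresholds 1/steps, 2/steps, ..., 1.

    Precision is averaged over proteins with at least one prediction at the
    threshold; recall is averaged over all benchmark proteins.
    """
    best_f, best_t = 0.0, 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions, recalls = [], []
        for protein, true_terms in truth.items():
            if not true_terms:
                continue  # proteins without ground truth are not targets
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recalls.append(hits / len(true_terms))
        if not precisions:
            continue  # nothing predicted at this threshold
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            f = 2 * p * r / (p + r)
            if f > best_f:
                best_f, best_t = f, t
    return best_f, best_t
```

An IA-weighted variant would replace the term counts with sums of per-term information accretion values, which is why the evaluation needs the IA file.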

Add 5 new test files covering:

- scoring engine (evidence_weight, compute_score, score_predictions)
- CAFA evaluation dataclass and _get_descendants BFS
- GenerateEvaluationSetOperation payload validation and execute path
- scoring CRUD router (configs, presets, 404 preflight checks)
- support and maintenance routers (GET/POST support, vacuum sequences/embeddings)

Extend test_core.py with normalize, is_experimental and RetryLaterError tests. Total: 399 passing, 3 skipped.
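
As an illustration of the kind of traversal a helper like _get_descendants performs, here is a generic breadth-first search over a parent-to-children map. The function name, signature and data shape are assumptions; the real helper may work on ORM rows or a different graph representation.

```python
from __future__ import annotations

from collections import deque


def get_descendants(children: dict[str, list[str]], root: str) -> set[str]:
    """Collect every term reachable from `root` via child edges (BFS).

    `children` maps a GO term id to its direct children. The root itself is
    not included in the result.
    """
    seen: set[str] = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, ()):
            if child not in seen:
                # Mark on enqueue so shared descendants are visited only once.
                seen.add(child)
                queue.append(child)
    seen.discard(root)  # defensive: only matters if the input contains a cycle
    return seen
```

Because GO terms can have several parents, the `seen` set is what keeps shared descendants from being expanded more than once.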

Join EmbeddingConfig, AnnotationSet and OntologySnapshot in the list_prediction_sets query so each row returns model_name, annotation source+version and OBO version instead of raw UUIDs. Frontend consumes the new optional fields and falls back to shortId when they are absent.

Workers are now started with setsid so each gets its own process group. Stop/force-kill targets the whole group (kill -- -PID) to prevent orphaned child processes. Grace period reduced from 60s to 5s.
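
The same setsid plus process-group kill pattern can be sketched in Python for readers who do not use the shell script. This is an analogue of the approach described above, not the contents of manage.sh; the function names are hypothetical and the code is POSIX-only.

```python
import os
import signal
import subprocess


def start_worker(cmd):
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group whose id equals its own pid.
    return subprocess.Popen(cmd, start_new_session=True)


def stop_worker(proc, grace=5.0):
    """SIGTERM the whole group; SIGKILL it if it outlives the grace period."""
    if proc.poll() is not None:
        return proc.returncode  # already exited and reaped
    try:
        os.killpg(proc.pid, signal.SIGTERM)
    except ProcessLookupError:
        return proc.wait()  # the group vanished between poll() and killpg()
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Force-kill every process in the group, including grandchildren.
        os.killpg(proc.pid, signal.SIGKILL)
        return proc.wait()
```

Signalling the group rather than the single pid is what reaches children the worker itself spawned, which is the orphan problem this commit addresses.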

Clean up unused imports and unsorted import blocks flagged by ruff in scoring.py, metrics.py, core/scoring.py and orm models __init__.

Run ruff format on all Python and frontend package files. No logic changes — pure whitespace, line-length and import ordering.

codecov · 2026-03-16T22:46:52Z

Codecov Report

❌ Patch coverage is 47.80335% with 499 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.01%. Comparing base (e440a7f) to head (2eeb474).
⚠️ Report is 19 commits behind head on main.

Files with missing lines	Patch %	Lines
protea/core/operations/predict_go_terms.py	11.36%	234 Missing ⚠️
protea/api/routers/annotations.py	14.77%	75 Missing ⚠️
protea/core/operations/run_cafa_evaluation.py	8.92%	51 Missing ⚠️
protea/core/evaluation.py	22.22%	42 Missing ⚠️
protea/api/routers/scoring.py	78.26%	30 Missing ⚠️
protea/core/operations/compute_embeddings.py	52.77%	17 Missing ⚠️
protea/core/operations/load_quickgo_annotations.py	36.84%	12 Missing ⚠️
protea/api/routers/proteins.py	0.00%	8 Missing ⚠️
protea/core/operations/load_ontology_snapshot.py	41.66%	7 Missing ⚠️
protea/workers/base_worker.py	50.00%	6 Missing ⚠️
... and 8 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main       #6      +/-   ##
==========================================
- Coverage   65.81%   65.01%   -0.81%     
==========================================
  Files          49       55       +6     
  Lines        3873     4550     +677     
==========================================
+ Hits         2549     2958     +409     
- Misses       1324     1592     +268

- Migrate all pages from app/ to app/[locale]/ (next-intl App Router)
- Add middleware, i18n/routing.ts, i18n/request.ts for locale detection
- Add LanguageSwitcher component with locale toggle
- Add message files for en, es, de, pt, zh (~21–23 KB each)
- Translate all hardcoded UI strings across 14 pages + 6 components: proteins, proteinDetail, annotations, embeddings, functionalAnnotation, evaluation, scoring, querySets, jobs, jobDetail, maintenance, support, statusBadge, eventTimeline, resetDbButton, supportButton, usagePolicyModal
- Remove dead app/layout.tsx and old app/ page stubs
- README: clarify Docker not yet validated; mark source install as recommended

frapercan and others added 16 commits March 14, 2026 22:29

chore: commit pending scripts, migrations and favicon

4c53158

docs: add quality baseline assessment (2026-03-14)

f15ae79

Merge branch 'main' into develop

5ab9688

fix(metrics): replace np.trapz with np.trapezoid for NumPy 2.0 compat

e626cd4

fix(scripts): use setsid + process-group kill in manage.sh

3673c15

fix(lint): apply ruff fixes — remove unused imports, sort import blocks

dd3f821

style: apply ruff format across entire codebase

c00ca3a

frapercan added 2 commits March 17, 2026 00:09

docs: reword CAFA reference — series, not specific edition

2eeb474

frapercan merged commit 0b52969 into main Mar 16, 2026
8 checks passed

frapercan merged 18 commits into main from develop
