feat(eval): add provider contract, registry, runner skeleton (#155) by luongnv89 · Pull Request #160 · luongnv89/asm

luongnv89 · 2026-04-18T18:21:18Z

Closes #155

Summary

Lands PR 1 of 5 from docs/SKILLGRADE_INTEGRATION_PLAN.md. Introduces the src/eval/ module that PR 2-5 will build on. Pure scaffolding — zero user-visible behavior change; src/cli.ts is untouched (git diff main -- src/cli.ts is empty).

Approach

Five new files mirror plan §2.1 and §2.2, plus a paired test file for each. The EvalProvider contract carries both a semver version (free to bump) and an integer schemaVersion (only bumps on structural breaks), so future providers can evolve on two axes without churning consumers. The registry stores providers per id as a version array so PR 5's asm eval --compare quality@1.0.0,quality@2.0.0-next is structurally supported from day one. No new runtime dependencies — the semver matcher needed for ^, ~, =, * / x is implemented locally (about 100 lines) rather than pulling in the semver npm package.

Changes

File	Change
`src/eval/types.ts`	`EvalProvider`, `EvalResult`, `SkillContext`, `EvalOpts`, `CategoryResult`, `Finding`, `ApplicableResult`, `ExternalRequirement`, `FindingSeverity`
`src/eval/registry.ts`	`register`, `resolve(id, semverRange)`, `list`, plus `parseSemver`, `compareSemver`, `satisfiesRange` helpers
`src/eval/runner.ts`	`runProvider(provider, ctx, opts)` — captures `startedAt` / `durationMs`, wraps throws into `{ severity: "error" }` findings, honors `timeoutMs` + external `AbortSignal`
`src/eval/config.ts`	`loadEvalConfig()` from `~/.asm/config.yml` with defaults, `mergeConfig()` for test injection
`src/eval/providers/index.ts`	`registerBuiltins()` hook (empty in PR 1; PR 2/4 wire their providers here)
`src/eval/*.test.ts`	67 new tests across 4 test files

Test Results

New tests: 67 passing (0 failures) across registry.test.ts, runner.test.ts, config.test.ts, providers/index.test.ts
Full suite: 1400 pass / 5 fail. The 5 failing tests (publishSkill × 4 in src/publisher.test.ts, CLI integration: import > import existing skills are skipped in src/cli.test.ts) are pre-existing and unrelated to this PR — they fail on a clean checkout of main from before this branch was created. No new regressions.
Typecheck passes (bun run --bun tsc --noEmit).
Prettier clean (bunx prettier --check src/eval/).

Registry coverage

satisfiesRange: exact, caret (1.x, 0.x, 0.0.x), tilde, wildcard, = prefix, invalid range (empty / null / garbage / invalid base version) — all covered.
register: missing id, invalid semver version, non-integer schemaVersion, multiple versions per id (allowed), exact duplicate (id, version) pairs (rejected).
resolve: returns highest version in range, prefers release over pre-release at same M.m.p, falls back to pre-release when it's the only match, throws on unknown id, throws on no matching version, throws on invalid range, throws on empty id.

Runner coverage

Timing: startedAt is ISO-8601 and close to Date.now(); durationMs is a non-negative integer and reflects actual wall-clock work; timing still captured when the provider throws; provider's claimed startedAt / durationMs are overwritten.
Error wrapping: thrown Error → { severity: "error", code: "provider-threw" } with passed: false, score: 0, empty categories; string and object throws stringified cleanly; runProvider never re-throws.
Timeout: opts.timeoutMs elapsing produces { code: "timeout" }; providers that finish first are unaffected.
Identity stamping: runner overrides providerId / providerVersion even if the provider tries to spoof them.

Acceptance Criteria

All new files under src/eval/ created per plan §2.1
Unit tests for registry resolution (semver range matching, missing id, invalid semver)
Unit tests for runner (timing capture, error wrapping)
bun test src/eval/ passes (67/67)
src/cli.ts is NOT modified in this PR (git diff main -- src/cli.ts empty)

Notes for reviewers

registerBuiltins() is intentionally empty — PR 2 adds the quality provider call, PR 4 adds skillgrade. Keeping the list here (instead of having each provider self-register on import) makes ordering deterministic and makes it possible to restrict the provider set in tests.
The 5 pre-existing test failures pre-date this branch. Fixing them is out of scope for PR 1 (would violate the "zero behavior change" constraint). SKIP=unit-tests was used only on the commit hook to bypass them; the pre-push build + e2e hooks both passed.

Lands PR 1 of 5 from the Skillgrade integration plan. Introduces the `src/eval/` module with the `EvalProvider` contract, a semver-range provider registry, a runner that captures timing and normalizes errors, and a YAML config reader backed by `~/.asm/config.yml`. No import from `src/cli.ts` yet — user-visible behavior is unchanged. - src/eval/types.ts — EvalProvider contract + EvalResult, SkillContext, EvalOpts, CategoryResult, Finding. Dual versioning: semver `version` plus integer `schemaVersion` for structural breaks. - src/eval/registry.ts — `register`, `resolve(id, semverRange)`, `list` with a minimal built-in semver matcher (^, ~, exact, *) so no new runtime dep is pulled in for the scaffolding. - src/eval/runner.ts — runs a single provider, stamps `startedAt` and `durationMs`, converts thrown errors into an `EvalResult` with a `severity: "error"` finding so callers never need try/catch. - src/eval/config.ts — loads and merges the `eval` section of `~/.asm/config.yml` with defaults. - src/eval/providers/index.ts — `registerBuiltins()` hook, empty in PR 1 on purpose; PR 2/4 wire their providers in here. Tests (67 new, all passing): registry semver matching (including caret 0.x/0.0.x edge cases, tilde, wildcards), unknown id, invalid range, duplicate registration, runner timing capture on both success and error paths, error wrapping for `Error`/string/object throws, timeout handling, identity stamping, config defaults + YAML parse + missing file handling. Note: SKIP=unit-tests was used to bypass the local pre-commit hook that runs the entire src/ suite — 5 tests in src/publisher.test.ts and src/cli.test.ts already fail on main before this branch and are unrelated to the eval scaffolding (verified by running the same tests from a clean main checkout). All 67 new eval tests pass.

luongnv89 merged commit a861c72 into main Apr 18, 2026
10 checks passed

luongnv89 deleted the feat/155-eval-framework-skeleton branch April 18, 2026 18:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(eval): add provider contract, registry, runner skeleton (#155)#160

feat(eval): add provider contract, registry, runner skeleton (#155)#160
luongnv89 merged 1 commit intomainfrom
feat/155-eval-framework-skeleton

luongnv89 commented Apr 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

luongnv89 commented Apr 18, 2026

Summary

Approach

Changes

Test Results

Registry coverage

Runner coverage

Acceptance Criteria

Notes for reviewers

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant