test(polyfill): vendor the WICG polyfill test corpus as a second conformance vector #18

Merged: chad-loder merged 6 commits into main from ci/polyfill-corpus on May 13, 2026

@chad-loder
Owner

Summary

Adds a parallel cross-implementation conformance net beyond the upstream WPT corpus. The WICG urlpattern-polyfill bundles its own snapshot of urlpatterntestdata.json (336 entries, ~85 KB); running every entry against yarlpattern gives redundant coverage of the ~328 shared cases and flags divergences from polyfill expectations on the 8 entries it carries that aren't in upstream WPT.

Test counts

| | Before | After |
|---|---|---|
| passed | 580 | 908 (+328 polyfill) |
| skipped | 19 | 27 (+8 tracked divergences) |

Changes

| File | Purpose |
|---|---|
| scripts/fetch_polyfill_corpus.sh | Mirrors the fetch_wpt_corpus.sh security model byte-for-byte: pinned SHA f147a0f4..., HTTPS-only sparse-checkout of test/, post-fetch SHA verify, per-file size cap, JSON shape check, --verify mode |
| tests/conftest.py | Adds load_polyfill_cases() and polyfill_data_path (overridable via the URLPATTERN_POLYFILL_DATA environment variable), and parametrizes a new polyfill_case fixture. The 8 polyfill-vs-WHATWG-spec divergences get pytest.mark.skip(reason="tracked divergence") |
| tests/test_polyfill.py | Thin shim that delegates to the existing test_wpt_case driver. Same data shape, no duplicated logic |
| pyproject.toml | Registers the wpt and polyfill markers, suppressing the longstanding PytestUnknownMarkWarning |
| .github/workflows/ci.yml | Extends the wpt-corpus job to also fetch and cache the polyfill corpus. The artifact contains both reference/wpt/ and reference/polyfill/. Matrix downloads switch to a workspace-root extract |
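
As a rough illustration of the tests/conftest.py row, the loader and skip wiring might look something like the sketch below. It is not the PR's actual code. The default path, the `polyfill_params` helper, and the index-based `divergent_indices` argument are all assumptions made for the example. Only the names load_polyfill_cases, polyfill_data_path, URLPATTERN_POLYFILL_DATA, and the skip reason come from the description.

```python
import json
import os
from pathlib import Path

# Assumed default location; the PR only says the corpus lands under reference/polyfill/.
DEFAULT_POLYFILL_DATA = Path("reference/polyfill/test/urlpatterntestdata.json")
SKIP_REASON = "tracked divergence"


def polyfill_data_path() -> Path:
    """Return the corpus path, honouring the URLPATTERN_POLYFILL_DATA override."""
    override = os.environ.get("URLPATTERN_POLYFILL_DATA")
    return Path(override) if override else DEFAULT_POLYFILL_DATA


def load_polyfill_cases(path: Path) -> list[dict]:
    """Load the corpus, which is expected to be a JSON array of test entries."""
    with path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    if not isinstance(data, list):
        raise ValueError(f"{path}: expected a JSON array of test entries")
    return data


def polyfill_params(cases, divergent_indices):
    """Wrap each case in pytest.param, marking tracked divergences as skipped.

    `divergent_indices` is a hypothetical way to identify the divergent
    entries; the PR does not say how they are selected.
    """
    import pytest  # local import: only needed when building the parametrization

    params = []
    for index, case in enumerate(cases):
        marks = [pytest.mark.polyfill]
        if index in divergent_indices:
            marks.append(pytest.mark.skip(reason=SKIP_REASON))
        params.append(pytest.param(case, marks=marks, id=f"polyfill-{index}"))
    return params
```

Because skipped entries stay in the parametrization instead of being filtered out, they still appear in the pytest summary, which is how the skip count remains visible.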

Test plan

  • just lint clean (11 / 11)
  • pytest tests/test_polyfill.py → 328 passed, 8 skipped (the tracked divergences)
  • Full suite: 908 passed, 27 skipped
  • scripts/fetch_polyfill_corpus.sh smoke-tested locally (fresh clone + --verify)
  • actionlint + zizmor clean on the workflow update
  • CI on this PR stays green (the new wpt-corpus job pre-flight should hit cache misses initially)

…d conformance vector

Adds a parallel cross-implementation conformance net beyond the upstream
WPT corpus. The WICG urlpattern-polyfill bundles its own snapshot of
urlpatterntestdata.json (336 entries, ~85 KB); running every entry of
that snapshot against yarlpattern gives us redundant coverage of the
~328 shared cases and flags any divergence from the polyfill's
expectations on the 8 entries it carries that aren't in upstream WPT.

Implementation:

* scripts/fetch_polyfill_corpus.sh — new fetcher mirroring the
  fetch_wpt_corpus.sh security model byte-for-byte:
    - pinned SHA (f147a0f4..., 2025-05-07) — matches the dev-side pin
      in scripts/fetch_references.sh
    - HTTPS-only sparse-checkout of just test/
    - filter=blob:none clone, post-fetch git rev-parse HEAD verify
    - per-file size cap (10 MiB), JSON well-formedness + shape check,
      --verify mode for restored caches

* tests/conftest.py — adds load_polyfill_cases() + polyfill_data_path
  (overrideable via URLPATTERN_POLYFILL_DATA), parametrizes a new
  polyfill_case fixture across all 336 entries. The 8 entries
  where the polyfill expects a constructor error but the current
  WHATWG spec does not (e.g. {hostname: 'bad#hostname'} — the
  spec accepts these with Chromium-style truncation) are wrapped in
  pytest.mark.skip with an explicit "tracked divergence"
  reason. Skipping the divergences is deliberate, not failure-hiding;
  the count surfaces in pytest output.

* tests/test_polyfill.py — thin shim that delegates to the existing
  test_wpt_case driver. The data shape is identical, so the driver
  reuses without modification.

* pyproject.toml — registers polyfill and wpt markers under
  [tool.pytest.ini_options]. Suppresses the longstanding
  PytestUnknownMarkWarning the WPT suite was already triggering.

* .github/workflows/ci.yml — extends the wpt-corpus job to also
  fetch and cache the polyfill data. The artifact (renamed
  wpt-corpus but containing both reference/wpt and
  reference/polyfill) excludes both .git directories from
  upload to keep the Windows digest check happy. Matrix-job
  downloads switch to path: . (workspace root) so each
  reference/<...> subtree lands where the harness expects it.
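
The fetcher itself is a shell script, but the post-fetch checks it lists (pinned-commit verify, per-file size cap, JSON well-formedness and shape) can be sketched in Python for illustration. The function names are hypothetical, and the "non-empty array of objects" shape rule is an assumption; the commit message does not spell out what the shape check requires.

```python
import json
import subprocess
from pathlib import Path

MAX_FILE_BYTES = 10 * 1024 * 1024  # the 10 MiB per-file cap named above


def head_matches(repo_dir: Path, pinned_sha: str) -> bool:
    """Compare the checkout's HEAD with the full pinned commit SHA."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() == pinned_sha


def verify_corpus_file(path: Path) -> int:
    """Validate one fetched JSON file and return its entry count."""
    size = path.stat().st_size
    if size > MAX_FILE_BYTES:
        raise ValueError(f"{path}: {size} bytes exceeds the {MAX_FILE_BYTES}-byte cap")
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        raise ValueError(f"{path}: not well-formed JSON: {exc}") from exc
    # Assumed shape rule: a non-empty array whose entries are all objects.
    if not isinstance(data, list) or not data:
        raise ValueError(f"{path}: expected a non-empty JSON array")
    if not all(isinstance(entry, dict) for entry in data):
        raise ValueError(f"{path}: every entry must be a JSON object")
    return len(data)
```

Running the same checks against a restored cache is roughly what a --verify mode would amount to: nothing is fetched, but stale or tampered data still fails loudly.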

Test counts:

    Before: 580 passed, 19 skipped
    After:  908 passed, 27 skipped
              ↑ +328 polyfill cases passing
                +8 polyfill-vs-WHATWG-spec divergences skipped
chad-loder and others added 2 commits May 12, 2026 19:41
…e root

upload-artifact with multiple paths under a common ancestor strips that
ancestor from the archive root. reference/wpt and
reference/polyfill share reference/ as the common ancestor, so
the artifact internally stores files as wpt/... and polyfill/...
(not reference/wpt/...).

The matrix download was extracting with path: . which dropped files
at workspace root → wpt/urlpattern/... instead of
reference/wpt/urlpattern/.... Tests fail-fast on missing fixtures
(per the v0.1 design), so all 10 matrix shards aborted at collection.

Fix: extract under reference/ so the archive's wpt/... /
polyfill/... lands at reference/wpt/... / reference/polyfill/...
where conftest.py looks for them. test-prospective already worked
because it runs from a fresh checkout (not the sdist + rm src/ flow).
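
The path arithmetic behind this fix can be modelled in a few lines of Python. This is a simplified model of the multi-path behaviour described above, not the action's implementation; the function names and the file name example.json are made up for the illustration.

```python
import posixpath


def archive_names(upload_paths, files):
    """Return archive-internal names after stripping the least common ancestor."""
    root = posixpath.commonpath(upload_paths)
    return [posixpath.relpath(name, root) for name in files]


def extracted_paths(names, download_path):
    """Return where each archived file lands for a given download `path:` value."""
    return [posixpath.normpath(posixpath.join(download_path, name)) for name in names]


uploads = ["reference/wpt", "reference/polyfill"]
files = ["reference/wpt/urlpattern/example.json"]  # hypothetical file name

names = archive_names(uploads, files)
print(names)                                # ['wpt/urlpattern/example.json']
print(extracted_paths(names, "."))          # ['wpt/urlpattern/example.json']
print(extracted_paths(names, "reference"))  # ['reference/wpt/urlpattern/example.json']
```

Extracting at the workspace root loses the reference/ prefix, while extracting under reference/ restores the layout the harness expects.
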
@chad-loder chad-loder enabled auto-merge (squash) May 13, 2026 03:41
@chad-loder chad-loder merged commit aa6b527 into main May 13, 2026
19 checks passed
@chad-loder chad-loder deleted the ci/polyfill-corpus branch May 13, 2026 03:51