From 901cc704789f673f4b24de30dd838147fb4a6290 Mon Sep 17 00:00:00 2001
From: chad-loder <26261238+chad-loder@users.noreply.github.com>
Date: Tue, 12 May 2026 21:26:40 -0700
Subject: [PATCH] refactor: snake_case primary, camelCase aliases for spec-faithful methods
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

URLPattern's two camelCase IDL members now have PEP-8 snake_case as
the canonical spelling, with the WHATWG camelCase forms preserved as
straight aliases. Both spellings dispatch to the same callable /
property; there is no parallel code path.

  Canonical (Python)    | Alias (spec / browser JS)
  ----------------------+--------------------------
  compare_component     | compareComponent
  has_regexp_groups     | hasRegExpGroups

This aligns with how Python wraps non-Python APIs almost everywhere:
playwright-python uses ``wait_for_selector`` not ``waitForSelector``;
boto3 uses ``create_instance`` not ``CreateInstance``; aiohttp uses
``add_route``. The clear exception is GUI bindings (PyQt, PySide,
Tkinter), where the underlying C++/Tcl framework is exposed verbatim;
that's not a fit for a web-spec wrapper.

The camel aliases stay so code ported verbatim from the WHATWG spec,
browser JS, Deno, Bun, or Cloudflare-Workers samples reads
identically. They're class-level assignments to the same descriptor
(staticmethod for ``compareComponent``, property for
``hasRegExpGroups``), so ``is`` identity holds:
``URLPattern.compareComponent is URLPattern.compare_component``.

Updated:

- src/yarlpattern/_pattern.py
    compare_component renamed (was compareComponent); compareComponent
    reintroduced as a class-level alias. hasRegExpGroups added as a
    class-level alias to the existing has_regexp_groups property. The
    noqa: N802 on the old camel definition is gone; the new aliases
    carry noqa: N815 (mixedCase variable warning) on the assignment.
- tests/test_pattern.py, tests/test_wpt_compare.py,
  scripts/generate_compliance_report.py
    Switched call sites to compare_component (canonical). One new
    test, ``test_camelcase_aliases_resolve_to_same_callable_and_property``,
    pins the alias identity contract so a future refactor can't
    silently re-implement the camel name as a duplicate method.
- docs/examples/*.md
    The four example pages that demonstrated specificity ordering now
    use compare_component in their snippets. Readers see the Pythonic
    spelling first; the alias is documented in SPEC_DEVIATIONS.
- README.md, SPEC_DEVIATIONS.md
    Brief mention that snake is canonical and camel is the alias for
    verbatim-from-spec portability. SPEC_DEVIATIONS's "Method-name
    capitalisation" entry rewritten to reflect the dual spelling
    rather than the old "we keep camelCase" stance.
- docs/wpt-compliance.md
    Regenerated; case names referenced internally by the script
    switched along with the rest.

959 tests pass (was 950 + 1 new alias-identity test, modulo the
unchanged conformance counts).
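
For reviewers who want to see the aliasing technique in isolation, here
is a minimal sketch. ``Pattern`` is a hypothetical stand-in, not the
yarlpattern class, and its comparison body is a placeholder rather than
the spec's ordering. The only point illustrated is that a class-level
assignment rebinds the same descriptor object under a second name.

```python
class Pattern:
    """Hypothetical stand-in for URLPattern (illustration only)."""

    def __init__(self, has_regex: bool) -> None:
        self._has_regex = has_regex

    @property
    def has_regexp_groups(self) -> bool:
        return self._has_regex

    # Alias: the same property object bound under a second attribute name.
    hasRegExpGroups = has_regexp_groups  # noqa: N815

    @staticmethod
    def compare_component(component: str, left: "Pattern", right: "Pattern") -> int:
        # Placeholder three-way comparison, NOT the WHATWG ordering.
        return (left._has_regex > right._has_regex) - (left._has_regex < right._has_regex)

    # Alias: the same staticmethod object bound under a second attribute name.
    compareComponent = compare_component  # noqa: N815


# Class-level lookup unwraps both names to the same underlying function.
assert Pattern.compareComponent is Pattern.compare_component
# Both names map to one property object in the class namespace.
assert Pattern.__dict__["hasRegExpGroups"] is Pattern.__dict__["has_regexp_groups"]
```

Because no second function or property is defined, a later change to
the snake_case definition is picked up by the camelCase name
automatically.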
--- README.md | 9 ++++---- SPEC_DEVIATIONS.md | 20 +++++++++++------ docs/examples/index.md | 2 +- .../match-the-kserve-v2-inference-path.md | 8 +++---- .../pick-an-llm-backend-by-model-name.md | 6 ++--- ...translate-google-api-http-to-urlpattern.md | 2 +- docs/wpt-compliance.md | 2 +- scripts/generate_compliance_report.py | 10 ++++----- src/yarlpattern/_pattern.py | 22 ++++++++++++++----- tests/test_pattern.py | 22 ++++++++++++++----- tests/test_wpt_compare.py | 15 +++++++------ 11 files changed, 74 insertions(+), 44 deletions(-) diff --git a/README.md b/README.md index a8b3915..5004333 100644 --- a/README.md +++ b/README.md @@ -93,10 +93,11 @@ enforces all of them; a stdlib-only port that goes through `urllib.parse` cannot Every stable and tentative method in the WHATWG URLPattern IDL is implemented: `URLPattern(input | string, baseURL?, options?)`, `test`, `exec`, all eight component -properties, `hasRegExpGroups`, `URLPattern.compareComponent`, and the tentative -`generate(component, groups)`. See [SPEC_DEVIATIONS.md](SPEC_DEVIATIONS.md) for the -intentional Python-flavour choices (camelCase method names, the additional `with_*` -derivers, escape-helper exposure). +properties, `has_regexp_groups`, `URLPattern.compare_component`, and the tentative +`generate(component, groups)`. The IDL camelCase spellings (`hasRegExpGroups`, +`compareComponent`) are kept as aliases so code ported verbatim from the spec or +browser JS reads identically. See [SPEC_DEVIATIONS.md](SPEC_DEVIATIONS.md) for the +intentional Python-flavour choices. 
## How this differs from `aiohttp.web.UrlDispatcher` diff --git a/SPEC_DEVIATIONS.md b/SPEC_DEVIATIONS.md index bba28f9..088f699 100644 --- a/SPEC_DEVIATIONS.md +++ b/SPEC_DEVIATIONS.md @@ -134,14 +134,16 @@ above what yarl itself does: The WHATWG URLPattern Standard distinguishes between the *stable* API surface (constructor, `test`, `exec`, `compareComponent`, component properties, `hasRegExpGroups`) and the *tentative* surface (`generate`). -yarlpattern's posture: +yarlpattern exposes both PEP-8 snake_case names and the IDL camelCase +spellings (see the *Method-name capitalisation* note below); the +table reports against the canonical snake form: | Surface | Status | |---|---| | Constructor + `test` + `exec` | Implemented; 100% WPT pass with `[regex]` | | Per-component getter properties | Implemented | -| `compareComponent` | Implemented; 25 / 25 WPT cases pass | -| `hasRegExpGroups` | Implemented; 55 / 55 WPT cases pass | +| `compare_component` (alias `compareComponent`) | Implemented; 25 / 25 WPT cases pass | +| `has_regexp_groups` (alias `hasRegExpGroups`) | Implemented; 55 / 55 WPT cases pass | | `generate()` (tentative spec) | Implemented; 19 / 19 WPT cases pass | ## What yarlpattern does *not* deviate on, despite Python's defaults @@ -155,10 +157,14 @@ a spec deviation, and yarlpattern goes out of its way to match WHATWG: `host`, `path`, `query`, `fragment`). Cross-runtime portability with browser-side JS `URL` and `URLPattern` is preserved by construction. -- **Method-name capitalisation**: `compareComponent` and - `hasRegExpGroups` keep their WHATWG IDL camelCase names. This - is intentional Python-PEP-8 deviation in favour of literal-text - compatibility with the spec and with cross-language patterns. 
+- **Method-name capitalisation**: the canonical names are PEP 8 + snake_case (`compare_component`, `has_regexp_groups`), and the + WHATWG IDL camelCase forms (`compareComponent`, `hasRegExpGroups`) + are exposed as aliases that dispatch to the same callable / property + — no extra logic, no separate code path. Snake is what readers + should reach for in new Python code; the camel aliases exist so + code ported verbatim from the spec, browser JS, Deno, Bun, or + Cloudflare Workers reads identically. - **Result shape**: `URLPatternResult` mirrors the JS-side shape exactly: `result.` is a dict with `'input'` and `'groups'` keys; attribute access on a Pythonic `result..groups` diff --git a/docs/examples/index.md b/docs/examples/index.md index d8d369a..5573735 100644 --- a/docs/examples/index.md +++ b/docs/examples/index.md @@ -39,4 +39,4 @@ Every example follows the same shape: 3. **With URLPattern** — one declarative pattern, structured match result. 4. **What you get for free** — which URLPattern feature carried the weight (cross-component matching, optional segments, named groups with regex, - `compareComponent()`, custom-scheme support, …). + `compare_component()`, custom-scheme support, …). diff --git a/docs/examples/match-the-kserve-v2-inference-path.md b/docs/examples/match-the-kserve-v2-inference-path.md index d82b970..3afea4a 100644 --- a/docs/examples/match-the-kserve-v2-inference-path.md +++ b/docs/examples/match-the-kserve-v2-inference-path.md @@ -66,10 +66,10 @@ The `{/versions/:version}?` group is *optional* — when the segment is absent, the named group is simply not in the result. Same pattern handles both URL shapes. 
-## Multi-backend routing with `compareComponent()` +## Multi-backend routing with `compare_component()` If you're fronting *several* inference servers — Triton at one URL -prefix, KServe at another, TorchServe at a third — `compareComponent` +prefix, KServe at another, TorchServe at a third — `compare_component` gives you spec-defined specificity ordering rather than insertion-order fragility: @@ -82,7 +82,7 @@ ROUTES = [ ] # Sort by specificity per the spec — no manual "register specific first" discipline. -ROUTES.sort(key=cmp_to_key(lambda a, b: URLPattern.compareComponent("pathname", a, b))) +ROUTES.sort(key=cmp_to_key(lambda a, b: URLPattern.compare_component("pathname", a, b))) ``` ## What you get for free @@ -92,7 +92,7 @@ ROUTES.sort(key=cmp_to_key(lambda a, b: URLPattern.compareComponent("pathname", - **Regex-constrained action enum** — `:action(infer|ready|generate)` rejects `/v2/models/bert/explain` at the pattern level, before any handler dispatch. -- **`compareComponent()` for specificity** — replaces the +- **`compare_component()` for specificity** — replaces the "register specific patterns first" discipline every Python router documents. A spec-defined deterministic ordering means a sidecar can *compute* the right dispatch order from a route list it didn't write. diff --git a/docs/examples/pick-an-llm-backend-by-model-name.md b/docs/examples/pick-an-llm-backend-by-model-name.md index 14d1e87..e573aa2 100644 --- a/docs/examples/pick-an-llm-backend-by-model-name.md +++ b/docs/examples/pick-an-llm-backend-by-model-name.md @@ -34,7 +34,7 @@ def route(self, request: str): `_pattern_to_regex` is essentially URLPattern's `*` wildcard handling re-implemented; `calculate_pattern_specificity` is essentially -`compareComponent()` re-implemented. Both have known footguns — an +`compare_component()` re-implemented. Both have known footguns — an asterisk inside a literal segment, complexity-char counting that ranks two patterns the same when they shouldn't be. 
@@ -55,7 +55,7 @@ ROUTES: list[tuple[URLPattern, str]] = [ # Spec-defined specificity: more specific patterns sort *before* more general # ones. Replaces LiteLLM's manual "count complexity chars" heuristic. -ROUTES.sort(key=cmp_to_key(lambda a, b: URLPattern.compareComponent("pathname", a[0], b[0]))) +ROUTES.sort(key=cmp_to_key(lambda a, b: URLPattern.compare_component("pathname", a[0], b[0]))) def pick_deployment(request_path: str) -> str | None: for pat, deployment in ROUTES: @@ -71,7 +71,7 @@ pick_deployment("/anthropic/claude-3-haiku") # 'anthropic-fast' ## What you get for free -- **`compareComponent()` is the spec-defined version of LiteLLM's +- **`compare_component()` is the spec-defined version of LiteLLM's `calculate_pattern_specificity`.** Deterministic ordering, no manual "count the wildcards" heuristic, identical results across implementations (Chromium, Safari, Firefox, yarlpattern). diff --git a/docs/examples/translate-google-api-http-to-urlpattern.md b/docs/examples/translate-google-api-http-to-urlpattern.md index 9be5793..b02b4c3 100644 --- a/docs/examples/translate-google-api-http-to-urlpattern.md +++ b/docs/examples/translate-google-api-http-to-urlpattern.md @@ -91,7 +91,7 @@ def route_predict(url): doesn't enforce this directly, but URLPattern can. - **`additional_bindings` ↔ pattern list.** Multiple URLPattern entries pointing at the same handler are the natural representation; - `compareComponent()` gives you the same specificity ordering grpc- + `compare_component()` gives you the same specificity ordering grpc- gateway computes internally. 
- **Same patterns work everywhere.** A Python sidecar fronting a gRPC-gateway-translated service, a Cloudflare Worker fronting the diff --git a/docs/wpt-compliance.md b/docs/wpt-compliance.md index fd99a5b..063dc6b 100644 --- a/docs/wpt-compliance.md +++ b/docs/wpt-compliance.md @@ -1,6 +1,6 @@ # WHATWG URLPattern Conformance Report -Generated by `scripts/generate_compliance_report.py` on **2026-05-13 04:06:52 UTC**, running against [`web-platform-tests/wpt/urlpattern/`](https://github.com/web-platform-tests/wpt/tree/dd54691426c23a08c6f4a0972b2c40965307e5ce/urlpattern) pinned at [`dd54691`](https://github.com/web-platform-tests/wpt/commit/dd54691426c23a08c6f4a0972b2c40965307e5ce) with regex engine **`regex`** (set-operation support: yes). Suite names match the upstream WPT runner basenames. +Generated by `scripts/generate_compliance_report.py` on **2026-05-13 04:26:17 UTC**, running against [`web-platform-tests/wpt/urlpattern/`](https://github.com/web-platform-tests/wpt/tree/dd54691426c23a08c6f4a0972b2c40965307e5ce/urlpattern) pinned at [`dd54691`](https://github.com/web-platform-tests/wpt/commit/dd54691426c23a08c6f4a0972b2c40965307e5ce) with regex engine **`regex`** (set-operation support: yes). Suite names match the upstream WPT runner basenames. > **Legend.** pass · fail · xfail (known engine gap) · skip · error. 
diff --git a/scripts/generate_compliance_report.py b/scripts/generate_compliance_report.py index bb7cc2c..cc6309f 100755 --- a/scripts/generate_compliance_report.py +++ b/scripts/generate_compliance_report.py @@ -290,7 +290,7 @@ def _case_id_for(idx: int, entry: dict[str, Any]) -> str: return f"{idx:03d}-{summary}" -# -------------------------- compareComponent / generate harness ------------- +# -------------------------- compare_component / generate harness ------------ def _run_compare_case(idx: int, entry: dict[str, Any]) -> CaseResult: @@ -301,13 +301,13 @@ def _run_compare_case(idx: int, entry: dict[str, Any]) -> CaseResult: right = URLPattern(entry["right"]) component = entry["component"] expected = entry["expected"] - if URLPattern.compareComponent(component, left, right) != expected: + if URLPattern.compare_component(component, left, right) != expected: return CaseResult(idx, case_id, "fail", f"forward != {expected}") - if URLPattern.compareComponent(component, right, left) != -expected: + if URLPattern.compare_component(component, right, left) != -expected: return CaseResult(idx, case_id, "fail", f"reverse != {-expected}") - if URLPattern.compareComponent(component, left, left) != 0: + if URLPattern.compare_component(component, left, left) != 0: return CaseResult(idx, case_id, "fail", "self(left) != 0") - if URLPattern.compareComponent(component, right, right) != 0: + if URLPattern.compare_component(component, right, right) != 0: return CaseResult(idx, case_id, "fail", "self(right) != 0") except Exception as exc: # noqa: BLE001 return CaseResult(idx, case_id, "error", f"{type(exc).__name__}: {exc}") diff --git a/src/yarlpattern/_pattern.py b/src/yarlpattern/_pattern.py index b4968d8..2860b7e 100644 --- a/src/yarlpattern/_pattern.py +++ b/src/yarlpattern/_pattern.py @@ -120,7 +120,7 @@ def _strip_component_prefix_suffix(component: str, value: str) -> str: } -# ------------------------------------------------------- compareComponent +# 
------------------------------------------------------- compare_component # # Specificity ordering tables — the orderings here are the WHATWG tentative # spec's intended ranks (which the polyfill and Chromium also implement @@ -149,7 +149,7 @@ def _strip_component_prefix_suffix(component: str, value: str) -> str: PartModifier.NONE: 3, } -# Length-mismatch sentinel — used by :meth:`URLPattern.compareComponent` +# Length-mismatch sentinel — used by :meth:`URLPattern.compare_component` # to pad the shorter part list. An empty fixed-text part is what the spec # substitutes so that ``/foo/`` outranks ``/foo/*``: a literal-ending # pattern is more restrictive than one that wildcards after a common prefix. @@ -228,7 +228,7 @@ class _ComponentMatcher: # genuinely ``""`` in both engines. apply_ecma_narrowing: list[bool] # Pre-built tuple of comparison keys, one per part, used by - # :meth:`URLPattern.compareComponent`. Each key is + # :meth:`URLPattern.compare_component`. Each key is # ``(type_rank, modifier_rank, prefix, value, suffix)``; assembled # once at compile time so every compare-call is a C-level tuple # comparison (no Python-level attribute access on ``Part``). @@ -480,7 +480,7 @@ def _compile_component(self, component: str, pattern_string: str) -> None: for p in parts if p.type is not PartType.FIXED_TEXT ] - # Compare-key tuple for :meth:`compareComponent` — built once at + # Compare-key tuple for :meth:`compare_component` — built once at # compile time so every comparison is a pure C-level tuple-compare. compare_keys = tuple(_part_to_compare_key(p) for p in parts) self._matchers[component] = _ComponentMatcher( @@ -756,8 +756,13 @@ def has_regexp_groups(self) -> bool: """ return any(m.has_custom_regexp for m in self._matchers.values()) + # WHATWG IDL camelCase alias. Snake is the canonical Python form; + # ``hasRegExpGroups`` is kept so code ported verbatim from the spec / + # browser JS / Deno / Bun / Cloudflare-Workers reads identically. 
+ hasRegExpGroups = has_regexp_groups # noqa: N815 + @staticmethod - def compareComponent( # noqa: N802 — matches the WHATWG IDL method name + def compare_component( component: str, left: URLPattern, right: URLPattern, @@ -782,7 +787,7 @@ def compareComponent( # noqa: N802 — matches the WHATWG IDL method name spec-defined names. """ if component not in COMPONENTS: - msg = f"URLPattern.compareComponent: unknown component {component!r}; expected one of {COMPONENTS}" + msg = f"URLPattern.compare_component: unknown component {component!r}; expected one of {COMPONENTS}" raise TypeError(msg) # Empty part lists stand in for ``*`` — see ``_FULL_WILDCARD_ONLY_KEYS``. # Calling ``.compare_keys or _FULL_WILDCARD_ONLY_KEYS`` is a free @@ -801,6 +806,11 @@ def compareComponent( # noqa: N802 — matches the WHATWG IDL method name return -1 if lk < rk else 1 return 0 + # WHATWG IDL camelCase alias. Snake is the canonical Python form; + # ``compareComponent`` is kept so code ported verbatim from the spec / + # browser JS / Deno / Bun / Cloudflare-Workers reads identically. + compareComponent = compare_component # noqa: N815 + # -------------------------------------------------------------- generate def generate(self, component: str, groups: Mapping[str, str] | None = None) -> str: """Produce the URL-component string that *this* pattern would have matched. 
diff --git a/tests/test_pattern.py b/tests/test_pattern.py index cad4288..fae0193 100644 --- a/tests/test_pattern.py +++ b/tests/test_pattern.py @@ -222,20 +222,20 @@ def test_has_regexp_groups_true_for_custom_regex_body() -> None: assert pat.has_regexp_groups is True -# ------------------------------------------------------------ compareComponent +# ----------------------------------------------------------- compare_component def test_compare_component_rejects_unknown_component_name() -> None: pat = URLPattern({"pathname": "/foo"}) with pytest.raises(TypeError, match="unknown component"): - URLPattern.compareComponent("not-a-component", pat, pat) + URLPattern.compare_component("not-a-component", pat, pat) def test_compare_component_self_equality_across_all_components() -> None: # Self-compare must be 0 on every component, regardless of pattern shape. pat = URLPattern({"pathname": "/foo/:id(\\d+)"}) for component in COMPONENTS: - assert URLPattern.compareComponent(component, pat, pat) == 0 + assert URLPattern.compare_component(component, pat, pat) == 0 def test_compare_component_empty_treated_as_full_wildcard() -> None: @@ -243,8 +243,20 @@ def test_compare_component_empty_treated_as_full_wildcard() -> None: # substitutes the same single-FULL_WILDCARD part list for both. empty = URLPattern({"pathname": ""}) star = URLPattern({"pathname": "*"}) - assert URLPattern.compareComponent("pathname", empty, star) == 0 - assert URLPattern.compareComponent("pathname", star, empty) == 0 + assert URLPattern.compare_component("pathname", empty, star) == 0 + assert URLPattern.compare_component("pathname", star, empty) == 0 + + +def test_camelcase_aliases_resolve_to_same_callable_and_property() -> None: + # ``compareComponent`` and ``hasRegExpGroups`` are kept as IDL-faithful + # camelCase aliases so code ported verbatim from the spec / browser JS + # reads identically. They must dispatch to the snake-case canonical + # forms, not duplicate the logic. 
+ pat = URLPattern({"pathname": "/foo/:id(\\d+)"}) + other = URLPattern({"pathname": "/foo/:id(\\d+)"}) + assert URLPattern.compareComponent is URLPattern.compare_component + assert URLPattern.compareComponent("pathname", pat, other) == 0 + assert pat.hasRegExpGroups is pat.has_regexp_groups # ---------------------------------------------------------------------- with_ diff --git a/tests/test_wpt_compare.py b/tests/test_wpt_compare.py index 7745c6b..17a8d46 100644 --- a/tests/test_wpt_compare.py +++ b/tests/test_wpt_compare.py @@ -1,8 +1,9 @@ """Port of ``reference/wpt/urlpattern/resources/urlpattern-compare-tests.tentative.js``. -The compare suite tests :meth:`URLPattern.compareComponent` — a static -method that returns a three-way comparison between two patterns for a -single component. URL routing libraries use it to order patterns from +The compare suite tests :meth:`URLPattern.compare_component` (also exposed +under the IDL-faithful ``compareComponent`` alias) — a static method that +returns a three-way comparison between two patterns for a single +component. URL routing libraries use it to order patterns from most-specific to least-specific. The corresponding WPT file is marked ``.tentative`` because the spec @@ -60,10 +61,10 @@ def test_wpt_compare(entry: dict[str, Any]) -> None: component: str = entry["component"] expected: int = entry["expected"] - assert URLPattern.compareComponent(component, left, right) == expected + assert URLPattern.compare_component(component, left, right) == expected # Reverse: JS uses ``~~(expected * -1)`` to coerce ``-0`` to ``0``; # Python's ints have no negative-zero, so a plain negation is enough. - assert URLPattern.compareComponent(component, right, left) == -expected + assert URLPattern.compare_component(component, right, left) == -expected # Self-equality. 
- assert URLPattern.compareComponent(component, left, left) == 0 - assert URLPattern.compareComponent(component, right, right) == 0 + assert URLPattern.compare_component(component, left, left) == 0 + assert URLPattern.compare_component(component, right, right) == 0