From cb78d4e05215b5888e6cb0a1777efe9115186bd7 Mon Sep 17 00:00:00 2001
From: Philippe Ombredanne <pombredanne@aboutcode.org>
Date: Fri, 15 May 2026 18:51:24 +0200
Subject: [PATCH 1/4] Rename README to rst

Signed-off-by: Philippe Ombredanne <pombredanne@aboutcode.org>
---
 etc/bench/README.md  | 258 ---------------------------------------
 etc/bench/README.rst | 284 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 284 insertions(+), 258 deletions(-)
 delete mode 100644 etc/bench/README.md
 create mode 100644 etc/bench/README.rst

diff --git a/etc/bench/README.md b/etc/bench/README.md
deleted file mode 100644
index c28e595..0000000
--- a/etc/bench/README.md
+++ /dev/null
@@ -1,258 +0,0 @@
-# PurlValidator data structure evaluation
-
-This document details the research and evaluation of various efficient data
-structures for compact PURLs storage and lookup.
-
-It contains:
-
-- reference to evaluation/bench scripts
-- documentation on the various libraries and data structures under consideration
-- the final choice (spoiler an FST, aka. finite state transducer)
-
-
-## Context and Problem
-
-PurlValidator needs a local queryable dataset of known PURLs to answer one question:
-
-> Does this PURL exist in the reference dataset?
-
-The lookup index should be built for each release, and shipped with the library
-for access without a network connection. And we want a Go, Rust and Python
-implementation. The PURls themselves are collected using PurlDB and FederatedCode.
-
-
-## Solution
-
-### High level design
-
-The lookup key is a PURL, cleaned to only keep type, namespace, and name,
-(without version, qualifiers and subpath)
-
-This keeps validation focused for now. Version validation could come later by
-extending indexed PURLs with version or baking in support VERS version parsing
-for validation
-
-### Solution elements: Data structures considered
-
-- Built-in set and map
-- FST
-- DAWG
-- Bloom filter
-- SQLite
-
-Considered but not evaluated:
-
-- Minimal perfect hash: no compression
-- Trie or radix tree: DAWG and FST are similar, but are more compact. Suffix
-  trees are way too big.
-
-#### Built-in set and map
-
-Built-in sets and maps are the simplest baseline in each language, they are as
-fast as can be, but they have no compression and no built-in serialization or
-memory mapping, and memory use grows quickly for large datasets.
-
-An interesting path could be to use built-in sets in Rust and Go generating the
-code with all the PURL strings so that there is no specific deserialization. The
-porblem there is the size as the data is not compressed.
-
-Built-ins structures are useful for benchmarks as reference but are not suitable
-as the main packaged data structure because they are too big.
-
-
-#### FST: finite state transducer
-
-<https://en.wikipedia.org/wiki/Finite-state_transducer>
-
-An FST stores a sorted set of strings in a compact automaton. PURLs share common
-prefixes such as `pkg:npm/`, `pkg:pypi/`, and `pkg:maven/`. This sharing helps
-reduce stored data.
-
-FST lookup is exact for this use case. The Rust and Go implementations already
-ship an FST file. The library opens or embeds that file and performs membership
-checks without rebuilding the index.
-
-The main cost is build complexity. Input must be prepared, sorted, and encoded
-when the package data is refreshed.
-
-
-#### DAWG: directed acyclic word graph
-
-See <https://stevehanov.ca/blog/compressing-dictionaries-with-a-dawg>
-
-this is aka. DAFSA
-<https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton>
-
-A DAWG is a compact data structure for a set of strings. It can merge repeated
-prefixes and suffixes like an FST. The DAWG is interesting in that it can
-support prefix lookup, but in general the DAWG is bigger and slower than an FST,
-and has fewer mature/maintained library support.
-
-
-#### Bloom filter
-
-<https://en.wikipedia.org/wiki/Bloom_filter>
-
-A Bloom filter can store a large set in a small space, but it is a probalistic
-structure and can answer that a value is surely absent or maybe present. In that
-later case, you need an extra full dataset to validate further the "maybe": this
-is the problem of false positives with these filters, hence a Bloom filter
-cannot not be used as the only lookup structure, and does not make sense here.
-Instead, a Bloom filter could be used before an exact structure to skip some
-exact lookups as performance optimization, but outside of the validator.
-
-
-#### SQLite
-
-<https://sqlite.org/>
-
-SQLite can store PURLs in a SQL table with an index for exact lookup.
-
-The tradeoff is operational weight. Each SQLite language binding adds a
-dependency (though this is built in Python). The validator only needs immutable
-membership checks, not SQL full power with queries, and update transactions; but
-on the other hand the SQLite DB could be the same across all languages.
-
-SQLite could useful as a benchmark and debugging format. It is not the first
-choice for a small language library because this is not compressed. But it will
-be a future enhancement for sure.
-
-
-### Preferred solution: FST
-
-Based on the benchmark and otrher criteria, let's use an FST-backed lookup for
-every languages. Do not use a Bloom filter (probalistic). Do not use native
-structures that use too much memory.
-
-And for the library selection, we have these high level requirements:
-
-- We want exact result without false positives, e.g., no bloom filter.
-- Offline use, with no network is a must: the dataset must be bundled in the
-  releases.
-- With build time index construction, the construction time is not critical.
-- The bundled index should be small enough to ship below crates, and Pypi
-  archive size limits.
-- No rebuild at startup/runtime, and fast enough load time from disk, ideally
-  memory-mapped.
-- Fast enough lookup.
-- Libraries should be maintained, active FOSS for Rust/Go/Python.
-
-The final selected FST libraries are:
-
-- Rust: fst crate with a memory-mapped set <https://github.com/BurntSushi/fst/>
-- Python: ducer with a memory-mapped map, dict-like
-  <https://github.com/jfolz/ducer> (ducer uses the Rust fst crate inside)
-- Go: vellum "fst" module (originally from
-  <https://github.com/couchbase/vellum> now at
-  <https://github.com/blevesearch/vellum>) which is mostly inspired from the
-  Rust fst crate
-
-
-## Appendix: Benchmarks
-
-This directory contains evaluation and benchmark files for PurlValidator.
-
-It compares structures for offline PURL membership checks with these
-implementations use:
-
-- Python: memory-mapped `ducer`.
-- Rust: crate `fst`.
-- Go: embedded Vellum FST.
-
-... as well as the builtin Python set and dict, SQLite and a Rust DAWG
-
-### Expected checkout layout
-
-Run the scripts from a directory with these repositories checkouts:
-
-- `/purl-validator`
-- `/purl-validator.rs`
-- `/purlvalidator-go`
-
-### benchmarking FST vs. DAWG
-
-There is a good benchmarch in Go comparing FST and DAWG data structures (and
-other structures) that highlights why an FST is a better structure for our cases
-than a DAWG:
-
-<https://github.com/timurgarif/go-fsa-trie-bench>
-
-We also did a simple synthetic benchmark of the Rust fst and dawg crates using
-actual base PURLs using the data in
-<https://github.com/aboutcode-org/purl-validator.rs/tree/main/fst_builder/data>
-
-The `etc/bench/rust-fst-dawg-bench` code compare these fst and dawg crates.
-
-The dataset profile has 2,324,119 unique sorted base PURL. The benchmark is to
-run 1M queries, where 500K are expected to fail.
-
-- The fst crate index was built in 11s, with a 26MB serialized file, and took
-  0.703s for 1M lookups.
-- The dawg crate index was built in 18s, with a 831MB serialized file, and took
-  28s for 1M lookups.
-
-The outcome is that the preferred structure is an FST over a DAWG (at least
-with these implementations).
-
-### benchmarking FST against builtin and SQLite
-
-Since we picked the FST as the winner, additional review has been focused on
-Python by comparing the ducer fst library against other approaches. Since it is
-based on the Rust fst and Go's vellum is also based on the fst design, we cover
-essentially the three languages at once.
-
-The `etc/scripts/bench/alternative_benchmark.py` script compares Python lookup
-using a text file with one PURL per line for these candidates:
-
-- Python `set`.
-- Python `dict`.
-- Python Sorted list plus `bisect`.
-- In-memory SQLite.
-- FST using a `ducer.Map`.
-
-Data is from `purl-validator.rs/fst_builder/data/`
-
-Results with 2,324,119 unique PURLs and 1M lookup queries, 500K existing PURLs:
-
-```text
-structure               build (secs)   lookup (secs)   storage size
---------------------   ------------   --------------   ---------------------------
-python set               0.206540       0.275906        304MB in RAM
-python dict              0.449625       0.429034        298MB in RAM
-ducer FST                3.700943       1.805585         26MB on disk
-sorted list+bisect       0.017540       2.783555        236MB in RAM
-sqlite in memory         4.855480       4.220032        207MB on disk (or 65MB with zstd)
-```
-
-### benchmarking FST in Python vs. Go vs. Rust
-
-This benchmark runs each of the three validator released implementations. The
-script is in `etc/scripts/bench/go-rust-py_benchmark.py`
-
-Data is from `purl-validator.rs/fst_builder/data/`
-
-Results with 2,324,119 unique PURLs and 1M lookup queries, 500K existing PURLs:
-
-```text
-structure               build (secs)   lookup (secs)   storage size (ondisk)
---------------------   ------------   --------------   ---------------------------
-Python purl-validator    16.664847      4.926029         25MB
-Rust purl-validator.rs   11.849877      0.348128         25MB
-Go purlvalidator-go       2.325181      0.704749         25MB
-```
-
-### Evaluation
-
-The results are consistent with expectations: Rust is faster than Go and Python.
-
-And the Python on disk fst is the same size as the Rust fst (since this is the
-same backing code).
-
-Some surprises:
-
-- The build of the Go index is the fastest which is surprising and could be an
-  avenue of improvement for the Rust fst crate.
-
-- Leaving aside the 10x larger RAM need, the Python set and dict are competitive
-  speed wise (faster than the on-disk Rust FST) ans super fast to build too.
-  
diff --git a/etc/bench/README.rst b/etc/bench/README.rst
new file mode 100644
index 0000000..677c3ac
--- /dev/null
+++ b/etc/bench/README.rst
@@ -0,0 +1,284 @@
+PurlValidator data structure evaluation
+=======================================
+
+This document details the research and evaluation of various efficient
+data structures for compact PURLs storage and lookup.
+
+It contains:
+
+-  reference to evaluation/bench scripts
+-  documentation on the various libraries and data structures under
+   consideration
+-  the final choice (spoiler: an FST, a.k.a. a finite state transducer)
+
+Context and Problem
+-------------------
+
+PurlValidator needs a local queryable dataset of known PURLs to answer
+one question:
+
+   Does this PURL exist in the reference dataset?
+
+The lookup index should be built for each release, and shipped with the
+library for access without a network connection. We want Go, Rust,
+and Python implementations. The PURLs themselves are collected using
+PurlDB and FederatedCode.
+
+Solution
+--------
+
+High level design
+~~~~~~~~~~~~~~~~~
+
+The lookup key is a PURL, cleaned to keep only the type, namespace, and
+name (without version, qualifiers, and subpath).
+
+This keeps validation focused for now. Version validation could come
+later by extending indexed PURLs with versions or by baking in support for
+VERS version parsing for validation.
+
+Solution elements: Data structures considered
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+-  Built-in set and map
+-  FST
+-  DAWG
+-  Bloom filter
+-  SQLite
+
+Considered but not evaluated:
+
+-  Minimal perfect hash: no compression
+-  Trie or radix tree: DAWG and FST are similar, but are more compact.
+   Suffix trees are way too big.
+
+Built-in set and map
+^^^^^^^^^^^^^^^^^^^^
+
+Built-in sets and maps are the simplest baseline in each language. They
+are as fast as can be, but they have no compression and no built-in
+serialization or memory mapping, and memory use grows quickly for large
+datasets.
+
+An interesting path could be to use built-in sets in Rust and Go,
+generating the code with all the PURL strings so that there is no
+specific deserialization. The problem there is the size, as the data is
+not compressed.
+
+Built-in structures are useful as a reference for benchmarks but are not
+suitable as the main packaged data structure because they are too big.
+
+FST: finite state transducer
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+https://en.wikipedia.org/wiki/Finite-state_transducer
+
+An FST stores a sorted set of strings in a compact automaton. PURLs
+share common prefixes such as ``pkg:npm/``, ``pkg:pypi/``, and
+``pkg:maven/``. This sharing helps reduce stored data.
+
+FST lookup is exact for this use case. The Rust and Go implementations
+already ship an FST file. The library opens or embeds that file and
+performs membership checks without rebuilding the index.
+
+The main cost is build complexity. Input must be prepared, sorted, and
+encoded when the package data is refreshed.
+
+DAWG: directed acyclic word graph
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+See https://stevehanov.ca/blog/compressing-dictionaries-with-a-dawg
+
+This is also known as a DAFSA:
+https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton
+
+A DAWG is a compact data structure for a set of strings. It can merge
+repeated prefixes and suffixes like an FST. The DAWG is interesting in
+that it can support prefix lookup, but in general the DAWG is bigger and
+slower than an FST, and has fewer mature, maintained libraries.
+
+Bloom filter
+^^^^^^^^^^^^
+
+https://en.wikipedia.org/wiki/Bloom_filter
+
+A Bloom filter can store a large set in a small space, but it is a
+probabilistic structure: it can only answer that a value is surely absent
+or maybe present. In the latter case, you need an extra full dataset to
+confirm the “maybe”. This is the problem of false positives with these
+filters; hence a Bloom filter cannot be used as the only lookup
+structure, and does not make sense here. Instead, a Bloom filter could be
+used before an exact structure to skip some exact lookups as a
+performance optimization, but outside of the validator.
+
+SQLite
+^^^^^^
+
+https://sqlite.org/
+
+SQLite can store PURLs in a SQL table with an index for exact lookup.
+
+The tradeoff is operational weight. Each SQLite language binding adds a
+dependency (though SQLite is built into Python). The validator only needs
+immutable membership checks, not the full power of SQL with queries and
+update transactions; on the other hand, the SQLite DB could be the same
+across all languages.
+
+SQLite could be useful as a benchmark and debugging format. It is not the
+first choice for a small language library because it is not
+compressed. But it will be a future enhancement for sure.
+
+Preferred solution: FST
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Based on the benchmarks and other criteria, let’s use an FST-backed
+lookup for every language. Do not use a Bloom filter (probabilistic). Do
+not use native structures that use too much memory.
+
+For the library selection, we have these high-level requirements:
+
+-  We want exact results without false positives, i.e., no Bloom filter.
+-  Offline use with no network is a must: the dataset must be bundled
+   in the releases.
+-  With build-time index construction, the construction time is not
+   critical.
+-  The bundled index should be small enough to ship within the crates.io and
+   PyPI archive size limits.
+-  No rebuild at startup/runtime, and fast enough load time from disk,
+   ideally memory-mapped.
+-  Fast enough lookup.
+-  Libraries should be maintained, active FOSS for Rust/Go/Python.
+
+The final selected FST libraries are:
+
+-  Rust: fst crate with a memory-mapped set
+   https://github.com/BurntSushi/fst/
+-  Python: ducer with a memory-mapped map, dict-like
+   https://github.com/jfolz/ducer (ducer uses the Rust fst crate inside)
+-  Go: vellum “fst” module (originally from
+   https://github.com/couchbase/vellum now at
+   https://github.com/blevesearch/vellum) which is mostly inspired from
+   the Rust fst crate
+
+Appendix: Benchmarks
+--------------------
+
+This directory contains evaluation and benchmark files for
+PurlValidator.
+
+It compares structures for offline PURL membership checks, using these
+implementations:
+
+-  Python: memory-mapped ``ducer``.
+-  Rust: crate ``fst``.
+-  Go: embedded Vellum FST.
+
+… as well as the built-in Python set and dict, SQLite, and a Rust DAWG.
+
+Expected checkout layout
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+Run the scripts from a directory with these repository checkouts:
+
+-  ``/purl-validator``
+-  ``/purl-validator.rs``
+-  ``/purlvalidator-go``
+
+Benchmarking FST vs. DAWG
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There is a good benchmark in Go comparing FST and DAWG data structures
+(and other structures) that highlights why an FST is a better structure
+for our case than a DAWG:
+
+https://github.com/timurgarif/go-fsa-trie-bench
+
+We also did a simple synthetic benchmark of the Rust fst and dawg crates
+using actual base PURLs from the data in
+https://github.com/aboutcode-org/purl-validator.rs/tree/main/fst_builder/data
+
+The ``etc/bench/rust-fst-dawg-bench`` code compares these fst and dawg
+crates.
+
+The dataset has 2,324,119 unique sorted base PURLs. The benchmark
+runs 1M queries, of which 500K are expected to fail.
+
+-  The fst crate index was built in 11s, with a 26MB serialized file,
+   and took 0.703s for 1M lookups.
+-  The dawg crate index was built in 18s, with an 831MB serialized file,
+   and took 28s for 1M lookups.
+
+The outcome is that the preferred structure is an FST over a DAWG (at
+least with these implementations).
+
+Benchmarking FST against builtin and SQLite
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Since we picked the FST as the winner, additional review has been
+focused on Python by comparing the ducer FST library against other
+approaches. Since ducer is based on the Rust fst crate and Go’s vellum is
+also based on the fst design, this essentially covers the three languages
+at once.
+
+The ``etc/scripts/bench/alternative_benchmark.py`` script compares
+Python lookup using a text file with one PURL per line for these
+candidates:
+
+-  Python ``set``.
+-  Python ``dict``.
+-  Python sorted list plus ``bisect``.
+-  In-memory SQLite.
+-  FST using a ``ducer.Map``.
+
+Data is from ``purl-validator.rs/fst_builder/data/``
+
+Results with 2,324,119 unique PURLs and 1M lookup queries, of which 500K are
+for existing PURLs:
+
+.. code:: text
+
+   structure               build (secs)   lookup (secs)   storage size
+   --------------------   ------------   --------------   ---------------------------
+   python set               0.206540       0.275906        304MB in RAM
+   python dict              0.449625       0.429034        298MB in RAM
+   ducer FST                3.700943       1.805585         26MB on disk
+   sorted list+bisect       0.017540       2.783555        236MB in RAM
+   sqlite in memory         4.855480       4.220032        207MB on disk (or 65MB with zstd)
+
+Benchmarking FST in Python vs. Go vs. Rust
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This benchmark runs each of the three released validator
+implementations. The script is in
+``etc/scripts/bench/go-rust-py_benchmark.py``.
+
+Data is from ``purl-validator.rs/fst_builder/data/``
+
+Results with 2,324,119 unique PURLs and 1M lookup queries, of which 500K are
+for existing PURLs:
+
+.. code:: text
+
+   structure               build (secs)   lookup (secs)   storage size (on disk)
+   --------------------   ------------   --------------   ---------------------------
+   Python purl-validator    16.664847      4.926029         25MB
+   Rust purl-validator.rs   11.849877      0.348128         25MB
+   Go purlvalidator-go       2.325181      0.704749         25MB
+
+Evaluation
+~~~~~~~~~~
+
+The results are consistent with expectations: Rust is faster than Go and
+Python.
+
+And the Python on-disk FST is the same size as the Rust FST (since this
+is the same backing code).
+
+Some surprises:
+
+-  The build of the Go index is the fastest, which is surprising and
+   could be an avenue of improvement for the Rust fst crate.
+
+-  Leaving aside the 10x larger RAM need, the Python set and dict are
+   competitive speed-wise (faster than the on-disk Rust FST) and super
+   fast to build too.

From 7fe3516b7292dd039d8343983c0b5ea7687742d6 Mon Sep 17 00:00:00 2001
From: Philippe Ombredanne <pombredanne@aboutcode.org>
Date: Fri, 15 May 2026 18:57:41 +0200
Subject: [PATCH 2/4] Update README

Signed-off-by: Philippe Ombredanne <pombredanne@aboutcode.org>
---
 README.md | 70 ++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 56 insertions(+), 14 deletions(-)

diff --git a/README.md b/README.md
index 39036fb..f84aaf6 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,9 @@
 [![Version](https://img.shields.io/github/v/release/aboutcode-org/purl-validator?style=for-the-badge)](https://github.com/aboutcode-org/purl-validator/releases)
 [![Test](https://img.shields.io/github/actions/workflow/status/aboutcode-org/purl-validator/ci.yml?style=for-the-badge&logo=github)](https://github.com/aboutcode-org/purl-validator/actions)
 
-**purl-validator** is a Python library for validating [Package URLs (PURLs)](https://github.com/package-url/purl-spec). It works fully offline, including in **air-gapped** or **restricted environments**, and answers one key question: **Does the package this PURL represents actually exist?**
+**purl-validator** is a Python library for validating [Package URLs (PURLs)](https://github.com/package-url/purl-spec).
+It works fully offline, including in **air-gapped** or **restricted environments**,
+and answers one key question: **Does the package this PURL represents actually exist?**
 
 ## How Does It Work?
 
@@ -12,18 +14,18 @@
 
 ## Currently Supported Ecosystems
 
-- **apk**
-- **cargo**
-- **composer**
-- **conan**
-- **cpan**
-- **cran**
-- **debian**
-- **maven**
-- **npm**
-- **nuget**
-- **pypi**
-- **swift**
+- apk
+- cargo
+- composer
+- conan
+- cpan
+- cran
+- debian
+- maven
+- npm
+- nuget
+- pypi
+- swift
 
 ## Usage
 
@@ -47,6 +49,46 @@ PurlValidator.validate_purl("pkg:nuget/FluentValidation")
 PurlValidator.validate_purl("pkg:nuget/non-existent-foo-bar")
 >>> False
 ```
+The validator accepts a PURL string or a `packageurl.PackageURL` object:
+
+```python
+from packageurl import PackageURL
+from purl_validator import PurlValidator
+
+validator = PurlValidator()
+purl = PackageURL(type="npm", namespace="@angular", name="core")
+
+exists = validator.validate_purl(purl)
+print(exists)
+```
+
+Only the base PURL is used for queries (i.e., only the package type, namespace, and name).
+Version, qualifiers, and subpath are not part of the query:
+
+```python
+from purl_validator import create_purl_map_entry
+
+assert create_purl_map_entry("pkg:pypi/django@5.0.0") == b"pypi/django"
+```
+
+You can also build and load a custom index for tests or experiments:
+
+```python
+from purl_validator import PurlValidator
+from purl_validator import create_purl_map
+
+purl_map_location = create_purl_map([
+    "pkg:pypi/django",
+    "pkg:npm/%40angular/core",
+])
+
+validator = PurlValidator(purl_map_location)
+assert validator.validate_purl("pkg:pypi/django") is True
+assert validator.validate_purl("pkg:pypi/not-a-real-package-name") is False
+```
+
+Use one `PurlValidator` instance for many lookups. Creating the instance loads
+the packaged map, while each validation is an exact membership check.
 
 ## Contribution
 
@@ -91,4 +133,4 @@ limitations under the License.
 ```
 
 [^1]: MineCode continuously collects package metadata from various package ecosystems to maintain an up-to-date catalog of known packages.
-[^2]: A Base Package URL is a Package URL without a version, qualifiers or subpath.
+[^2]: A Base Package URL is a Package URL without a version, qualifiers, or subpath.

From 64931fd4dade6c2d8d724ccbdc78ac581d14de78 Mon Sep 17 00:00:00 2001
From: Philippe Ombredanne <pombredanne@aboutcode.org>
Date: Fri, 15 May 2026 19:14:30 +0200
Subject: [PATCH 3/4] Update CI for docs and releases.

Signed-off-by: Philippe Ombredanne <pombredanne@aboutcode.org>
---
 .github/workflows/docs-ci.yml      |  7 +++++--
 .github/workflows/pypi-release.yml | 24 +++++++++++++++---------
 .github/workflows/zizmor.yml       | 24 ++++++++++++++++++++++++
 3 files changed, 44 insertions(+), 11 deletions(-)
 create mode 100644 .github/workflows/zizmor.yml

diff --git a/.github/workflows/docs-ci.yml b/.github/workflows/docs-ci.yml
index 8d8aa55..fbc267f 100644
--- a/.github/workflows/docs-ci.yml
+++ b/.github/workflows/docs-ci.yml
@@ -2,6 +2,7 @@ name: CI Documentation
 
 on: [push, pull_request]
 
+permissions: {}
 jobs:
   build:
     runs-on: ubuntu-24.04
@@ -13,10 +14,12 @@ jobs:
 
     steps:
       - name: Checkout code
-        uses: actions/checkout@v4
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          persist-credentials: false
 
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405
         with:
           python-version: ${{ matrix.python-version }}
 
diff --git a/.github/workflows/pypi-release.yml b/.github/workflows/pypi-release.yml
index a461f63..7b5a13a 100644
--- a/.github/workflows/pypi-release.yml
+++ b/.github/workflows/pypi-release.yml
@@ -18,17 +18,20 @@ on:
     tags:
       - "v*.*.*"
 
+permissions: {}
 jobs:
   build-pypi-distribs:
     name: Build and publish library to PyPI
     runs-on: ubuntu-24.04
 
     steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          persist-credentials: false
       - name: Set up Python
-        uses: actions/setup-python@v5
+        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405
         with:
-          python-version: 3.12
+          python-version: 3.13
 
       - name: Install pypa/build and twine
         run: python -m pip install --user --upgrade build twine pkginfo
@@ -36,17 +39,20 @@ jobs:
       - name: Build a binary wheel and a source tarball
         run: python -m build --wheel --sdist --outdir dist/
 
-      - name: Validate wheel and sdis for Pypi
+      - name: Validate wheels and sdists for PyPI
         run: python -m twine check dist/*
 
       - name: Upload built archives
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f
         with:
           name: pypi_archives
           path: dist/*
 
 
   create-gh-release:
+    # Sets permissions of the GITHUB_TOKEN to allow release upload
+    permissions:
+      contents: write
     name: Create GH release
     needs:
       - build-pypi-distribs
@@ -54,13 +60,13 @@ jobs:
 
     steps:
       - name: Download built archives
-        uses: actions/download-artifact@v4
+        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131
         with:
           name: pypi_archives
           path: dist
 
       - name: Create GH release
-        uses: softprops/action-gh-release@v2
+        uses: softprops/action-gh-release@b4309332981a82ec1c5618f44dd2e27cc8bfbfda
         with:
           draft: true
           files: dist/*
@@ -77,11 +83,11 @@ jobs:
 
     steps:
       - name: Download built archives
-        uses: actions/download-artifact@v4
+        uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131
         with:
           name: pypi_archives
           path: dist
 
       - name: Publish to PyPI
         if: startsWith(github.ref, 'refs/tags')
-        uses: pypa/gh-action-pypi-publish@release/v1
+        uses: pypa/gh-action-pypi-publish@cef221092ed1bacb1cc03d23a2d87d1d172e277b
\ No newline at end of file
diff --git a/.github/workflows/zizmor.yml b/.github/workflows/zizmor.yml
new file mode 100644
index 0000000..aa8259d
--- /dev/null
+++ b/.github/workflows/zizmor.yml
@@ -0,0 +1,24 @@
+name: GitHub Actions Security Analysis with zizmor 🌈
+
+on:
+  push:
+    branches: ["main"]
+  pull_request:
+    branches: ["**"]
+
+permissions: {}
+
+jobs:
+  zizmor:
+    name: Run zizmor 🌈
+    runs-on: ubuntu-latest
+    permissions:
+      security-events: write # Required for upload-sarif (used by zizmor-action) to upload SARIF files.
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+        with:
+          persist-credentials: false
+
+      - name: Run zizmor 🌈
+        uses: zizmorcore/zizmor-action@b1d7e1fb5de872772f31590499237e7cce841e8e # v0.5.3

From caf5703fa32305019226cc02f12b8edabe821aa1 Mon Sep 17 00:00:00 2001
From: Philippe Ombredanne <pombredanne@aboutcode.org>
Date: Fri, 15 May 2026 19:19:27 +0200
Subject: [PATCH 4/4] Add combined documentation

Signed-off-by: Philippe Ombredanne <pombredanne@aboutcode.org>
---
 docs/source/conf.py                      |   8 +-
 docs/source/contribute/contrib_doc.rst   |   2 +-
 docs/source/data-structure-rationale.rst |  83 ++++++++++++
 docs/source/explanations.rst             |  31 +++++
 docs/source/how-to-guides.rst            |  36 +++++
 docs/source/index.rst                    |  59 +++++++--
 docs/source/introduction.rst             |  47 +++++++
 docs/source/quickstart.rst               |  84 ++++++++++++
 docs/source/reference.rst                |  75 +++++++++++
 docs/source/skeleton-usage.rst           | 160 -----------------------
 docs/source/tutorials.rst                |  45 +++++++
 11 files changed, 456 insertions(+), 174 deletions(-)
 create mode 100644 docs/source/data-structure-rationale.rst
 create mode 100644 docs/source/explanations.rst
 create mode 100644 docs/source/how-to-guides.rst
 create mode 100644 docs/source/introduction.rst
 create mode 100644 docs/source/quickstart.rst
 create mode 100644 docs/source/reference.rst
 delete mode 100644 docs/source/skeleton-usage.rst
 create mode 100644 docs/source/tutorials.rst

diff --git a/docs/source/conf.py b/docs/source/conf.py
index 056ca6e..410c3bd 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -17,7 +17,7 @@
 
 # -- Project information -----------------------------------------------------
 
-project = "nexb-skeleton"
+project = "purl-validator"
 copyright = "nexB Inc., AboutCode and others."
 author = "AboutCode.org authors and contributors"
 
@@ -79,9 +79,9 @@
 
 html_context = {
     "display_github": True,
-    "github_user": "nexB",
-    "github_repo": "nexb-skeleton",
-    "github_version": "develop",  # branch
+    "github_user": "aboutcode-org",
+    "github_repo": "purl-validator",
+    "github_version": "main",  # branch
     "conf_py_path": "/docs/source/",  # path in the checkout to the docs root
 }
 
diff --git a/docs/source/contribute/contrib_doc.rst b/docs/source/contribute/contrib_doc.rst
index 2a719a5..b160bc5 100644
--- a/docs/source/contribute/contrib_doc.rst
+++ b/docs/source/contribute/contrib_doc.rst
@@ -187,7 +187,7 @@ Style Conventions for the Documentaion
 
     (`Refer <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#sections>`_)
     Normally, there are no heading levels assigned to certain characters as the structure is
-    determined from the succession of headings. However, this convention is used in Python’s Style
+    determined from the succession of headings. However, this convention is used in Python's Style
     Guide for documenting which you may follow:
 
     # with overline, for parts
diff --git a/docs/source/data-structure-rationale.rst b/docs/source/data-structure-rationale.rst
new file mode 100644
index 0000000..c2e0f46
--- /dev/null
+++ b/docs/source/data-structure-rationale.rst
@@ -0,0 +1,83 @@
+.. _data_structure_rationale:
+
+FST Data Structure Rationale
+=============================
+
+PurlValidator needs exact membership lookup for a large list of base PURLs. The
+lookup data index is built before release and bundled with each library.
+
+
+See https://github.com/aboutcode-org/purl-validator/tree/main/etc/bench for
+the detailed rationale and benchmarks behind the choice of an FST.
+
+
+Why are FSTs used?
+------------------
+
+Finite state transducers store sorted strings in a compact form. PURLs share
+prefixes such as ``pkg:npm/``, ``pkg:pypi/``, and ``pkg:maven/``. This makes an
+FST useful for exact package identity queries.
+
+FSTs can be memory-mapped and are very compact. They are not as fast as native
+sets, but their memory consumption is so much lower that this makes them the most
+attractive solution, even if they take more time to build.
+
+
+Requirements
+---------------
+
+The index structure and the library selection should meet the following
+high-level requirements:
+
+
+- We want exact results without false positives, which rules out Bloom filters.
+- Offline use with no network is a must: the dataset must be bundled
+  in the releases.
+- With build-time index construction, the construction time is not
+  critical.
+- The bundled index should be small enough to ship below the crates.io and
+  PyPI archive size limits.
+- No rebuild at startup/runtime, and fast enough load time from disk,
+  ideally memory-mapped.
+- Fast enough lookup.
+- Libraries should be actively maintained FOSS for Rust, Go, and Python.
+
+
+
+
+Selected FST libraries
+--------------------------
+
+Python uses ``ducer.Map`` with ``mmap``. The map is stored on disk and opened
+without loading the full catalog into Python objects.
+
+Rust uses ``fst::Set``. The generated FST is embedded into the crate.
+
+Go uses Vellum FST. The generated FST is embedded into the module.
+
+Alternatives
+------------
+
+We also considered built-in sets and maps as a baseline:
+
+- Python: ``set`` and ``dict``.
+- Rust: ``HashSet`` and ``HashMap``.
+- Go: ``map[string]struct{}`` and ``map[string]bool``.
+
+These structures are simple and fast. They require loading all keys into
+runtime memory, so they are less useful as the packaged lookup format.
+
+Sorted arrays or slices can use binary search. They are simple and exact, but
+lookup takes repeated string comparisons and the strings still need to be
+loaded.
+
+SQLite can store the PURLs in an indexed table. It gives exact results, but it
+adds a database dependency for a read-only membership check. It has far more
+features than needed and is overkill for our use case.
+
+Bloom filters are small and fast, but they can return false positives. They
+cannot be used as a validation index.
+
+A DAWG can store a set of strings by sharing prefixes and suffixes. It may be a
+valid alternative to an FST (the two are very similar), but there are few maintained
+libraries in the target languages.
diff --git a/docs/source/explanations.rst b/docs/source/explanations.rst
new file mode 100644
index 0000000..d5b6185
--- /dev/null
+++ b/docs/source/explanations.rst
@@ -0,0 +1,31 @@
+.. _explanations:
+
+Explanations
+============
+
+Syntax validation and identity validation
+-----------------------------------------
+
+The Package-URL spec defines the PURL format. A PURL can follow the spec
+format and still name a package that is not known in the package ecosystems.
+
+PurlValidator checks the package PURL against reference data of known PURLs. This
+helps find misspelled names, wrong package types, and PURLs that
+do not appear in the reference upstream ecosystem package repositories.
+
+
+Offline validation
+------------------
+
+SBOM and compliance workflows may run in CI systems, private networks, or
+air-gapped environments. PurlValidator packages lookup data with each released
+library so validation does not need network access to a registry at runtime.
+
+
+Base PURL validation
+--------------------
+
+PURL existence is checked before version existence.
+
+The current libraries validate base PURLs only, no versions. Version support
+can be a future enhancement.
diff --git a/docs/source/how-to-guides.rst b/docs/source/how-to-guides.rst
new file mode 100644
index 0000000..7fc0ec4
--- /dev/null
+++ b/docs/source/how-to-guides.rst
@@ -0,0 +1,36 @@
+.. _how_to_guides:
+
+How-to Guides
+=============
+
+Choose an implementation
+------------------------
+
+Use the implementation that matches the application:
+
+- Use Python for Python scripts, data pipelines, etc.
+- Use Rust for Rust apps.
+- Use Go for Go apps and command-line tools.
+
+All implementations package PURL index data with the released library.
+
+
+Update validation data
+----------------------
+
+PurlValidator index data is released with each package. To update the
+data used by an application, update the PurlValidator package version.
+
+
+Validation results
+--------------------------
+
+Treat validation results in these groups:
+
+- Known: the PURL is valid and exists in the reference data.
+- Unknown: the PURL parses as valid but is not present in the reference data.
+- Invalid or unsupported: the input is not a supported or known PURL.
+
+For SBOM checks, you should report unknown and invalid PURLs separately.
+Invalid PURLs are usually an error in the SBOM or SCA producer tool.
+Unknown PURLs could be new packages, typos, or inventions of SCA tools.
diff --git a/docs/source/index.rst b/docs/source/index.rst
index eb63717..2cfeeef 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -1,16 +1,57 @@
-Welcome to nexb-skeleton's documentation!
-=========================================
+PurlValidator Documentation
+===========================
 
-.. toctree::
-   :maxdepth: 2
-   :caption: Contents:
+PurlValidator checks whether a base Package-URL (PURL) is present in a known
+package catalog. It works without a network connection after installation.
 
-   skeleton-usage
-   contribute/contrib_doc
+A valid PURL string can still name a package that is not known. PurlValidator
+adds this package identity check for SBOM, VEX, SCA, and compliance workflows.
+
+Documentation overview
+----------------------
+
+Getting started
+~~~~~~~~~~~~~~~
+
+- :ref:`quickstart`
+- :ref:`introduction`
+
+Tutorials
+~~~~~~~~~
+
+- :ref:`tutorials`
+
+How-to guides
+~~~~~~~~~~~~~
+
+- :ref:`how_to_guides`
+
+Reference
+~~~~~~~~~
+
+- :ref:`reference`
+
+Explanations
+~~~~~~~~~~~~
+
+- :ref:`explanations`
+- :ref:`data_structure_rationale`
 
 Indices and tables
-==================
+------------------
 
 * :ref:`genindex`
-* :ref:`modindex`
 * :ref:`search`
+
+.. toctree::
+   :maxdepth: 2
+   :hidden:
+
+   quickstart
+   introduction
+   tutorials
+   how-to-guides
+   reference
+   explanations
+   data-structure-rationale
+   contribute/contrib_doc
diff --git a/docs/source/introduction.rst b/docs/source/introduction.rst
new file mode 100644
index 0000000..cffc6df
--- /dev/null
+++ b/docs/source/introduction.rst
@@ -0,0 +1,47 @@
+.. _introduction:
+
+Introduction
+============
+
+PurlValidator checks package identity for Package-URLs (PURLs). It does
+not replace syntax validation. It adds a lookup against an index of packaged
+reference data.
+
+Why does this exist?
+--------------------
+
+PURL is used in SBOMs, VEX documents, SCA tools, and vulnerability databases.
+The PURL spec tells tools how to write a package identifier, but does
+not prove that the package exists.
+
+Common PURL data problems include:
+
+- Misspelled package names.
+- Wrong or made-up package types.
+- Packages that are not present in an ecosystem.
+
+PurlValidator answers this question:
+
+Does this PURL exist for a known package?
+
+Repositories
+------------
+
+We have three implementations in Rust, Go and Python.
+Each repository has language-specific usage notes in its README.
+
+- Python: https://github.com/aboutcode-org/purl-validator
+- Rust: https://github.com/aboutcode-org/purl-validator.rs
+- Go: https://github.com/aboutcode-org/purlvalidator-go
+
+
+Validation scope
+----------------
+
+PurlValidator validates base PURLs, ignoring the version. A base PURL contains:
+
+- Type, such as ``npm`` or ``pypi``.
+- Optional namespace, such as an npm scope or Maven groupId.
+- Name.
+
+Versions, qualifiers, and subpaths are not part of the lookup query.
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
new file mode 100644
index 0000000..80f7198
--- /dev/null
+++ b/docs/source/quickstart.rst
@@ -0,0 +1,84 @@
+.. _quickstart:
+
+Quickstart
+==========
+
+Python
+------
+
+Install the Python package:
+
+.. code-block:: bash
+
+    pip install purl-validator
+
+Validate a PURL:
+
+.. code-block:: python
+
+    from purl_validator import PurlValidator
+
+    validator = PurlValidator()
+
+    print(validator.validate_purl("pkg:nuget/FluentValidation"))
+    print(validator.validate_purl("pkg:nuget/non-existent-foo-bar"))
+
+Rust
+----
+
+Install the Rust crate:
+
+.. code-block:: bash
+
+    cargo add purl_validator
+
+Validate a PURL:
+
+.. code-block:: rust
+
+    use purl_validator::validate;
+
+    fn main() {
+        let exists = validate("pkg:nuget/FluentValidation")
+            .expect("input must be a supported base PURL");
+
+        println!("{exists}");
+    }
+
+Go
+--
+
+Install the Go module:
+
+.. code-block:: bash
+
+    go get github.com/aboutcode-org/purlvalidator-go
+
+Validate a PURL:
+
+.. code-block:: go
+
+    package main
+
+    import (
+        "fmt"
+        "log"
+
+        purlvalidator "github.com/aboutcode-org/purlvalidator-go"
+    )
+
+    func main() {
+        exists, err := purlvalidator.Validate("pkg:nuget/FluentValidation")
+        if err != nil {
+            log.Fatal(err)
+        }
+
+        fmt.Println(exists)
+    }
+
+Next steps
+----------
+
+- Use the Python README for Python-specific helper APIs: https://github.com/aboutcode-org/purl-validator
+- Use the Rust README for error handling with ``ValidateError``: https://github.com/aboutcode-org/purl-validator.rs
+- Use the Go README for ``Validate`` return values and integration examples: https://github.com/aboutcode-org/purlvalidator-go
diff --git a/docs/source/reference.rst b/docs/source/reference.rst
new file mode 100644
index 0000000..96a8ed9
--- /dev/null
+++ b/docs/source/reference.rst
@@ -0,0 +1,75 @@
+.. _reference:
+
+Reference
+=========
+
+Supported ecosystems
+--------------------
+
+The current validators package indexed reference data for these package types/ecosystems:
+
+- ``apk``
+- ``cargo``
+- ``composer``
+- ``conan``
+- ``cpan``
+- ``cran``
+- ``debian``
+- ``maven``
+- ``npm``
+- ``nuget``
+- ``pypi``
+- ``swift``
+
+Base PURLs
+----------
+
+A base PURL is a Package-URL without a version, qualifiers, or subpath.
+
+Examples:
+
+.. code-block:: text
+
+    pkg:pypi/django
+    pkg:npm/%40angular/core
+    pkg:maven/org.apache.commons/commons-lang3
+
+Unsupported examples:
+
+.. code-block:: text
+
+    pkg:pypi/django@5.0.0
+    pkg:npm/%40angular/core?repository_url=https://registry.npmjs.org
+    pkg:maven/org.apache.commons/commons-lang3#src/main
+
+Implementation summary
+----------------------
+
+- Python uses a memory-mapped compact map through ``ducer.Map``.
+- Rust uses an embedded ``fst::Set`` generated from sorted PURL strings.
+- Go uses an embedded Vellum FST generated from sorted PURL strings.
+
+
+Language APIs
+-------------
+
+Python:
+
+.. code-block:: python
+
+    from purl_validator import PurlValidator
+
+    validator = PurlValidator()
+    exists = validator.validate_purl("pkg:pypi/django")
+
+Rust:
+
+.. code-block:: rust
+
+    let exists = purl_validator::validate("pkg:pypi/django")?;
+
+Go:
+
+.. code-block:: go
+
+    exists, err := purlvalidator.Validate("pkg:pypi/django")
diff --git a/docs/source/skeleton-usage.rst b/docs/source/skeleton-usage.rst
deleted file mode 100644
index 6cb4cc5..0000000
--- a/docs/source/skeleton-usage.rst
+++ /dev/null
@@ -1,160 +0,0 @@
-Usage
-=====
-A brand new project
--------------------
-.. code-block:: bash
-
-    git init my-new-repo
-    cd my-new-repo
-    git pull git@github.com:nexB/skeleton
-
-    # Create the new repo on GitHub, then update your remote
-    git remote set-url origin git@github.com:nexB/your-new-repo.git
-
-From here, you can make the appropriate changes to the files for your specific project.
-
-Update an existing project
----------------------------
-.. code-block:: bash
-
-    cd my-existing-project
-    git remote add skeleton git@github.com:nexB/skeleton
-    git fetch skeleton
-    git merge skeleton/main --allow-unrelated-histories
-
-This is also the workflow to use when updating the skeleton files in any given repository.
-
-Customizing
------------
-
-You typically want to perform these customizations:
-
-- remove or update the src/README.rst and tests/README.rst files
-- set project info and dependencies in setup.cfg
-- check the configure and configure.bat defaults
-
-Initializing a project
-----------------------
-
-All projects using the skeleton will be expected to pull all of it dependencies
-from thirdparty.aboutcode.org/pypi or the local thirdparty directory, using
-requirements.txt and/or requirements-dev.txt to determine what version of a
-package to collect. By default, PyPI will not be used to find and collect
-packages from.
-
-In the case where we are starting a new project where we do not have
-requirements.txt and requirements-dev.txt and whose dependencies are not yet on
-thirdparty.aboutcode.org/pypi, we run the following command after adding and
-customizing the skeleton files to your project:
-
-.. code-block:: bash
-
-    ./configure
-
-This will initialize the virtual environment for the project, pull in the
-dependencies from PyPI and add them to the virtual environment.
-
-
-Generating requirements.txt and requirements-dev.txt
-----------------------------------------------------
-
-After the project has been initialized, we can generate the requirements.txt and
-requirements-dev.txt files.
-
-Ensure the virtual environment is enabled.
-
-.. code-block:: bash
-
-    source venv/bin/activate
-
-To generate requirements.txt:
-
-.. code-block:: bash
-
-    python etc/scripts/gen_requirements.py -s venv/lib/python<version>/site-packages/
-
-Replace \<version\> with the version number of the Python being used, for example:
-``venv/lib/python3.6/site-packages/``
-
-To generate requirements-dev.txt after requirements.txt has been generated:
-
-.. code-block:: bash
-
-    ./configure --dev
-    python etc/scripts/gen_requirements_dev.py -s venv/lib/python<version>/site-packages/
-
-Note: on Windows, the ``site-packages`` directory is located at ``venv\Lib\site-packages\``
-
-.. code-block:: bash
-
-    python .\\etc\\scripts\\gen_requirements.py -s .\\venv\\Lib\\site-packages\\
-    .\configure --dev
-    python .\\etc\\scripts\\gen_requirements_dev.py -s .\\venv\\Lib\\site-packages\\
-
-
-Collecting and generating ABOUT files for dependencies
-------------------------------------------------------
-
-Ensure that the dependencies used by ``etc/scripts/fetch_thirdparty.py`` are installed:
-
-.. code-block:: bash
-
-    pip install -r etc/scripts/requirements.txt
-
-Once we have requirements.txt and requirements-dev.txt, we can fetch the project
-dependencies as wheels and generate ABOUT files for them:
-
-.. code-block:: bash
-
-    python etc/scripts/fetch_thirdparty.py -r requirements.txt -r requirements-dev.txt
-
-There may be issues with the generated ABOUT files, which will have to be
-corrected. You can check to see if your corrections are valid by running:
-
-.. code-block:: bash
-
-    python etc/scripts/check_thirdparty.py -d thirdparty
-
-Once the wheels are collected and the ABOUT files are generated and correct,
-upload them to thirdparty.aboutcode.org/pypi by placing the wheels and ABOUT
-files from the thirdparty directory to the pypi directory at
-https://github.com/aboutcode-org/thirdparty-packages
-
-
-Usage after project initialization
-----------------------------------
-
-Once the ``requirements.txt`` and ``requirements-dev.txt`` have been generated
-and the project dependencies and their ABOUT files have been uploaded to
-thirdparty.aboutcode.org/pypi, you can configure the project as needed, typically
-when you update dependencies or use a new checkout.
-
-If the virtual env for the project becomes polluted, or you would like to remove
-it, use the ``--clean`` option:
-
-.. code-block:: bash
-
-    ./configure --clean
-
-Then you can run ``./configure`` again to set up the project virtual environment.
-
-To set up the project for development use:
-
-.. code-block:: bash
-
-    ./configure --dev
-
-To update the project dependencies (adding, removing, updating packages, etc.),
-update the dependencies in ``setup.cfg``, then run:
-
-.. code-block:: bash
-
-    ./configure --clean # Remove existing virtual environment
-    source venv/bin/activate # Ensure virtual environment is activated
-    python etc/scripts/gen_requirements.py -s venv/lib/python<version>/site-packages/ # Regenerate requirements.txt
-    python etc/scripts/gen_requirements_dev.py -s venv/lib/python<version>/site-packages/ # Regenerate requirements-dev.txt
-    pip install -r etc/scripts/requirements.txt # Install dependencies needed by etc/scripts/bootstrap.py
-    python etc/scripts/fetch_thirdparty.py -r requirements.txt -r requirements-dev.txt # Collect dependency wheels and their ABOUT files
-
-Ensure that the generated ABOUT files are valid, then take the dependency wheels
-and ABOUT files and upload them to thirdparty.aboutcode.org/pypi.
diff --git a/docs/source/tutorials.rst b/docs/source/tutorials.rst
new file mode 100644
index 0000000..fd67722
--- /dev/null
+++ b/docs/source/tutorials.rst
@@ -0,0 +1,45 @@
+.. _tutorials:
+
+Tutorials
+=========
+
+Validate a list of PURLs with Python
+------------------------------------
+
+Create a file named ``purls.txt``:
+
+.. code-block:: text
+
+    pkg:nuget/FluentValidation
+    pkg:nuget/non-existent-foo-bar
+    pkg:pypi/django
+
+Run this script:
+
+.. code-block:: python
+
+    from pathlib import Path
+    from purl_validator import PurlValidator
+
+    validator = PurlValidator()
+
+    for line in Path("purls.txt").read_text().splitlines():
+        purl = line.strip()
+        if not purl:
+            continue
+        print(purl, validator.validate_purl(purl))
+
+
+Use PurlValidator in an SBOM check
+----------------------------------
+
+The basic workflow is:
+
+1. Extract PURLs from an SBOM.
+2. Convert each PURL to its base identity.
+3. Validate each base PURL with one PurlValidator library.
+4. Report unknown PURLs for review.
+
+An unknown PURL may be a typo, a wrong package type, or a package missing from
+the packaged reference data. Handle unknown PURLs according to the policy for
+your project.