Qualytics · RafaelOsiro · Jun 10, 2026 · Jun 10, 2026
diff --git a/docs/data-quality-checks/entity-resolution.md b/docs/data-quality-checks/entity-resolution.md
diff --git a/docs/data-quality-checks/entity-resolution/api.md b/docs/data-quality-checks/entity-resolution/api.md
@@ -0,0 +1,161 @@
+# :material-api:{ .middle style="color: var(--q-brick)" } Entity Resolution API
+
+The Entity Resolution check is created and managed through the standard Quality Checks API by setting `rule` to `entityResolution`. The check is multi-field: rather than listing fields under `fields`, you list one entry per evaluated field under `properties.target_fields` and pick the **distinction field** under `properties.distinct_field_name`. The `fields` array on the check itself is auto-populated from `target_fields` and can be sent as an empty list.
+
+!!! tip
+    For complete API documentation, including request and response schemas, visit the [API docs](https://demo.qualytics.io/api/docs){:target="_blank"}.
+
+## Endpoints
+
+| Method | Path | Purpose |
+|:---|:---|:---|
+| `POST` | `/api/quality-checks` | Create a new Entity Resolution check. |
+| `GET` | `/api/quality-checks/{id}` | Retrieve an Entity Resolution check by ID. |
+| `PUT` | `/api/quality-checks/{id}` | Update an existing Entity Resolution check. |
+| `DELETE` | `/api/quality-checks/{id}` | Delete (or archive) an Entity Resolution check. |
+
+**Permission**: Author (or above) on the target container's team for `POST`, `PUT`, and `DELETE`; Reporter (or above) for `GET`.
+
+## Payload Example
+
+Use `POST /api/quality-checks` to create a multi-field Entity Resolution check on `full_name` and `address` (both fuzzy), distinguished by `customer_id`:
+
+```json
+{
+    "description": "Customers with similar names and addresses must share a customer_id",
+    "rule": "entityResolution",
+    "fields": [],
+    "container_id": 145,
+    "filter": null,
+    "properties": {
+        "distinct_field_name": "customer_id",
+        "composite_match_threshold": 0.75,
+        "target_fields": [
+            {
+                "upickle_type": "StringTargetField",
+                "field_name": "full_name",
+                "match_type": "fuzzy",
+                "pair_substrings": true,
+                "pair_homophones": false,
+                "consider_term_frequency": false,
+                "weight": 1.0
+            },
+            {
+                "upickle_type": "StringTargetField",
+                "field_name": "address",
+                "match_type": "fuzzy",
+                "pair_substrings": false,
+                "pair_homophones": false,
+                "consider_term_frequency": false,
+                "weight": 0.8
+            }
+        ]
+    },
+    "tags": ["pii", "master-data"],
+    "additional_metadata": {"jira": "DATA-4101"},
+    "anomaly_message_field": null,
+    "template_id": null,
+    "status": "Active",
+    "owner_id": 7,
+    "default_anomaly_assignee_id": 12
+}
+```
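
The same payload can be assembled programmatically before it is sent. The following is a minimal Python sketch, not part of the Qualytics client: the helper names `string_target` and `build_entity_resolution_check` are hypothetical, and only the JSON keys and values come from the example above.

```python
import json


def string_target(field_name, *, weight=1.0, pair_substrings=False):
    """Hypothetical helper: one fuzzy StringTargetField entry."""
    return {
        "upickle_type": "StringTargetField",
        "field_name": field_name,
        "match_type": "fuzzy",
        "pair_substrings": pair_substrings,
        "pair_homophones": False,
        "consider_term_frequency": False,
        "weight": weight,
    }


def build_entity_resolution_check(description, container_id,
                                  distinct_field_name, threshold,
                                  target_fields):
    """Hypothetical helper: the minimal body documented for this rule."""
    return {
        "description": description,
        "rule": "entityResolution",
        "fields": [],  # auto-populated from target_fields by the platform
        "container_id": container_id,
        "filter": None,  # serialized as JSON null, meaning no filter
        "properties": {
            "distinct_field_name": distinct_field_name,
            "composite_match_threshold": threshold,
            "target_fields": target_fields,
        },
    }


payload = build_entity_resolution_check(
    description="Customers with similar names and addresses must share a customer_id",
    container_id=145,
    distinct_field_name="customer_id",
    threshold=0.75,
    target_fields=[
        string_target("full_name", weight=1.0, pair_substrings=True),
        string_target("address", weight=0.8),
    ],
)

# JSON text for the request body of POST /api/quality-checks.
body = json.dumps(payload, indent=4)
```

The resulting `body` can be sent with any HTTP client. The base URL and authentication scheme depend on the deployment and are deliberately left out of this sketch.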
+
+## Top-Level Field Notes
+
+| Field | Required | Notes |
+|:---|:---:|:---|
+| `description` | Yes | Free-text description shown in the UI. |
+| `rule` | Yes | Must be `"entityResolution"`. |
+| `fields` | Yes | Send `[]`. The list of evaluated fields is computed from `properties.target_fields`. |
+| `container_id` | Yes | ID of the container (table or file) the check runs against. |
+| `filter` | No | Spark SQL `WHERE` expression. Applied **before** entity resolution runs, so only rows that satisfy the filter are clustered. Send `null` for no filter. |
+| `properties.distinct_field_name` | Yes | Name of the field that must hold a single value within each resolved entity cluster. Accepted types: `Integral`, `Fractional`, `Boolean`, `String`, `Date`, `Timestamp`. |
+| `properties.composite_match_threshold` | Yes | Fractional value between `0.0` and `1.0`. Pairs whose weighted composite score is greater than or equal to this value are treated as matches. Default `0.7`. |
+| `properties.target_fields` | Yes | Non-empty array. Each entry configures one field with its `match_type`, `weight`, and (for strings) optional substring/homophone/term-frequency knobs. See **Target Field Notes** below. |
+| `tags` | No | List of tag names applied to the check for filtering and organization. |
+| `additional_metadata` | No | Free-form key-value pairs (typically links to catalog, tickets, governance records). |
+| `anomaly_message_field` | No | Name of a source-record field whose value should be used as the anomaly message instead of the system-generated one. **Not applicable to Entity Resolution:** because the rule emits only Shape Anomalies (which use a fixed message template), this field is silently ignored. Send `null`. |
+| `template_id` | No | ID of a Check Template to associate the check with. `null` if not using a template. |
+| `status` | No | `"Active"` (default) or `"Draft"`. Draft checks are not evaluated by Scans. |
+| `owner_id` | No | ID of the user who owns the check. Defaults to the user creating the check when omitted. |
+| `default_anomaly_assignee_id` | No | ID of the user automatically assigned to anomalies produced by the check. |
+
+!!! info "Coverage is not supported"
+    Entity Resolution does not accept a `coverage` value. The rule evaluates clusters as compliant or non-compliant; there is no fractional tolerance to set.
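
The rules in the table and the note above can be checked on the client before a request is sent. This is a hedged sketch only: `validate_check_payload` is a hypothetical helper, it covers just the constraints stated on this page, and the server-side validation may be stricter or worded differently.

```python
REQUIRED_TOP_LEVEL = ("description", "rule", "fields", "container_id")


def validate_check_payload(payload):
    """Return a list of problems; an empty list means the documented rules hold."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in payload:
            problems.append(f"missing required field: {key}")
    if payload.get("rule") != "entityResolution":
        problems.append('rule must be "entityResolution"')
    if "coverage" in payload:
        problems.append("coverage is not supported for Entity Resolution")

    props = payload.get("properties") or {}
    if not props.get("distinct_field_name"):
        problems.append("properties.distinct_field_name is required")

    threshold = props.get("composite_match_threshold")
    is_number = isinstance(threshold, (int, float)) and not isinstance(threshold, bool)
    if not is_number or not 0.0 <= threshold <= 1.0:
        problems.append("composite_match_threshold must be between 0.0 and 1.0")

    targets = props.get("target_fields")
    if not isinstance(targets, list) or not targets:
        problems.append("properties.target_fields must be a non-empty array")
    return problems


good = {
    "description": "demo",
    "rule": "entityResolution",
    "fields": [],
    "container_id": 145,
    "properties": {
        "distinct_field_name": "customer_id",
        "composite_match_threshold": 0.75,
        "target_fields": [{"upickle_type": "StringTargetField",
                           "field_name": "full_name"}],
    },
}
# Same payload, but with an unsupported coverage value and a threshold above 1.0.
bad = {**good, "coverage": 0.9,
       "properties": {**good["properties"], "composite_match_threshold": 1.5}}
```

Calling `validate_check_payload(good)` returns an empty list, while `validate_check_payload(bad)` reports both the unsupported `coverage` key and the out-of-range threshold.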
+
+## Target Field Notes
+
+Each entry in `target_fields` is one of three shapes, identified by its `upickle_type` discriminator: `"StringTargetField"`, `"NumericTargetField"`, or `"DateTimeTargetField"`. The platform validates that the declared `upickle_type` matches the actual data type of the field on the container.
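
Because the declared `upickle_type` must match the field's data type, a client can derive the discriminator from the type instead of hard-coding it. The sketch below assumes the container reports field types using the names listed for `distinct_field_name` (`String`, `Integral`, `Fractional`, `Date`, `Timestamp`); `upickle_type_for` is a hypothetical helper.

```python
# Mapping taken from the field_name notes for each target field shape.
UPICKLE_TYPE_BY_DATA_TYPE = {
    "String": "StringTargetField",
    "Integral": "NumericTargetField",
    "Fractional": "NumericTargetField",
    "Date": "DateTimeTargetField",
    "Timestamp": "DateTimeTargetField",
}


def upickle_type_for(data_type):
    """Return the target-field discriminator for a field data type."""
    try:
        return UPICKLE_TYPE_BY_DATA_TYPE[data_type]
    except KeyError:
        # For example Boolean: accepted as a distinction field, but this
        # page lists no target-field shape for it.
        raise ValueError(f"no target field shape for type {data_type!r}") from None


entry = {
    "upickle_type": upickle_type_for("Timestamp"),
    "field_name": "registered_at",
}
```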
+
+### String Target Field
+
+```json
+{
+    "upickle_type": "StringTargetField",
+    "field_name": "full_name",
+    "match_type": "fuzzy",
+    "pair_substrings": true,
+    "pair_homophones": false,
+    "consider_term_frequency": false,
+    "weight": 1.0
+}
+```
+
+| Field | Required | Notes |
+|:---|:---:|:---|
+| `upickle_type` | Yes | Must be `"StringTargetField"`. Identifies the shape so the platform can deserialize this entry. |
+| `field_name` | Yes | Name of the string field on the container. |
+| `match_type` | No | `"fuzzy"` (default) or `"exact"`. `exact` turns the field into a blocking pre-filter: pairs disagreeing on this field are never scored. |
+| `pair_substrings` | No | When `true`, a pair where one string contains the other scores `1.0` on this field. Default `false`. Applies only to `fuzzy`. |
+| `pair_homophones` | No | When `true`, a pair whose values sound alike (phonetic similarity) scores `1.0` on this field. Default `false`. Applies only to `fuzzy`. |
+| `consider_term_frequency` | No | When `true`, rare tokens carry more weight than common tokens. Default `false`. Applies only to `fuzzy`. |
+| `weight` | No | Non-negative number. Controls this field's contribution to the composite score. Default `1.0`. Ignored when `match_type` is `exact`. |
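
To make the interaction between `weight` and `composite_match_threshold` concrete, the sketch below combines per-field scores into one composite. This page does not give the exact formula, so the weight-normalized average used here is an assumption for illustration only, and `composite_score` and `is_match` are hypothetical names.

```python
def composite_score(field_scores):
    """Combine (score, weight) pairs from the non-exact target fields.

    ASSUMPTION: a weight-normalized average. The platform's actual
    formula is not stated on this page.
    """
    total_weight = sum(weight for _, weight in field_scores)
    if total_weight == 0:
        return 0.0
    return sum(score * weight for score, weight in field_scores) / total_weight


def is_match(field_scores, threshold=0.7):
    """A pair matches when its composite is >= the threshold (default 0.7)."""
    return composite_score(field_scores) >= threshold


# full_name scores 0.9 with weight 1.0; address scores 0.5 with weight 0.8.
pair = [(0.9, 1.0), (0.5, 0.8)]
composite = composite_score(pair)  # (0.9 * 1.0 + 0.5 * 0.8) / 1.8
```

Under this assumption the pair's composite is roughly `0.72`, so it would match at the default threshold of `0.7` but not at the `0.75` used in the payload example. Fields with `match_type` set to `"exact"` are left out of `field_scores` because they act as a pre-filter and their `weight` is ignored.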
+
+### Numeric Target Field
+
+```json
+{
+    "upickle_type": "NumericTargetField",
+    "field_name": "phone_number",
+    "match_type": "absolute",
+    "offset": 0.0,
+    "weight": 1.0
+}
+```
+
+| Field | Required | Notes |
+|:---|:---:|:---|
+| `upickle_type` | Yes | Must be `"NumericTargetField"`. Identifies the shape so the platform can deserialize this entry. |
+| `field_name` | Yes | Name of the numeric field (Integral or Fractional) on the container. |
+| `match_type` | No | `"absolute"` (default), `"relative"`, or `"exact"`. `"absolute"` compares with a fixed `offset`; `"relative"` compares with a percentage tolerance (e.g. `0.05` for 5%); `"exact"` turns the field into a blocking pre-filter. |
+| `offset` | No | Non-negative numeric tolerance. With `match_type: "absolute"`, the pair scores `1.0` if `|a − b| ≤ offset`, otherwise `0.0`. With `match_type: "relative"`, the value is interpreted as a fraction (e.g. `0.05` for 5%). Default `0.0`. |
+| `weight` | No | Non-negative number controlling contribution to the composite. Default `1.0`. Ignored when `match_type` is `exact`. |
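
The numeric comparison can be sketched as follows. The `"absolute"` branch follows the rule in the table. The `"relative"` branch is an assumption: the page says only that `offset` is interpreted as a fraction, so this sketch measures it against the larger of the two magnitudes. `numeric_pair_score` is a hypothetical name.

```python
def numeric_pair_score(a, b, match_type="absolute", offset=0.0):
    """Score one numeric field for a pair of records: 1.0 or 0.0."""
    difference = abs(a - b)
    if match_type == "absolute":
        # Documented rule: 1.0 if |a - b| <= offset, otherwise 0.0.
        return 1.0 if difference <= offset else 0.0
    if match_type == "relative":
        # ASSUMPTION: the fraction is taken of the larger magnitude.
        scale = max(abs(a), abs(b))
        return 1.0 if difference <= offset * scale else 0.0
    # "exact" is a blocking pre-filter and is never scored.
    raise ValueError(f"match_type {match_type!r} is not scored")


within_absolute = numeric_pair_score(100, 103, "absolute", offset=5)
outside_absolute = numeric_pair_score(100, 106, "absolute", offset=5)
within_relative = numeric_pair_score(100, 104, "relative", offset=0.05)
outside_relative = numeric_pair_score(100, 110, "relative", offset=0.05)
```

With the default `offset` of `0.0`, both branches reduce to an equality test on the two values.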
+
+### Datetime Target Field
+
+```json
+{
+    "upickle_type": "DateTimeTargetField",
+    "field_name": "registered_at",
+    "match_type": "offset",
+    "offset_seconds": 3600,
+    "weight": 1.0
+}
+```
+
+| Field | Required | Notes |
+|:---|:---:|:---|
+| `upickle_type` | Yes | Must be `"DateTimeTargetField"`. Identifies the shape so the platform can deserialize this entry. |
+| `field_name` | Yes | Name of the Date or Timestamp field on the container. |
+| `match_type` | No | `"offset"` (default), `"granularity"`, or `"exact"`. `"offset"` compares within a number of seconds; `"granularity"` compares whether both timestamps fall in the same bucket; `"exact"` turns the field into a blocking pre-filter. |
+| `offset_seconds` | No | Non-negative integer tolerance in seconds. Applies when `match_type` is `"offset"`: the pair scores `1.0` if the two timestamps are within `offset_seconds` of each other. Default `0`. |
+| `granularity` | No | Bucket applied before comparison. Applies when `match_type` is `"granularity"`. Accepted values: `"Day"`, `"Week"`, `"Month"`, `"Year"`. Omit (or send `null`) when `match_type` is not `"granularity"`. |
+| `weight` | No | Non-negative number controlling contribution to the composite. Default `1.0`. Ignored when `match_type` is `exact`. |
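
The two scored datetime modes can be sketched in the same way. Two details here are assumptions rather than documented behavior: a pair that fails the comparison scores `0.0` (mirroring the numeric rule), and the `"Week"` bucket uses ISO weeks that start on Monday. `datetime_pair_score` and `_bucket` are hypothetical names.

```python
from datetime import datetime


def _bucket(ts, granularity):
    """Reduce a timestamp to the bucket named by granularity."""
    if granularity == "Day":
        return (ts.year, ts.month, ts.day)
    if granularity == "Week":
        iso = ts.isocalendar()
        return (iso[0], iso[1])  # ASSUMPTION: ISO year and ISO week number
    if granularity == "Month":
        return (ts.year, ts.month)
    if granularity == "Year":
        return (ts.year,)
    raise ValueError(f"unsupported granularity {granularity!r}")


def datetime_pair_score(a, b, match_type="offset",
                        offset_seconds=0, granularity=None):
    """Score one datetime field for a pair of records: 1.0 or 0.0."""
    if match_type == "offset":
        close = abs((a - b).total_seconds()) <= offset_seconds
        return 1.0 if close else 0.0
    if match_type == "granularity":
        same = _bucket(a, granularity) == _bucket(b, granularity)
        return 1.0 if same else 0.0
    # "exact" is a blocking pre-filter and is never scored.
    raise ValueError(f"match_type {match_type!r} is not scored")


first = datetime(2026, 6, 10, 9, 0)
second = datetime(2026, 6, 10, 9, 45)
within_hour = datetime_pair_score(first, second, offset_seconds=3600)
within_half_hour = datetime_pair_score(first, second, offset_seconds=1800)
```

Here `within_hour` is `1.0` because the timestamps are 45 minutes apart, while `within_half_hour` is `0.0`.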
+
+## Related
+
+- [Introduction](introduction.md){:target="_blank"}: formal definition, target field types, field scope, and general/anomaly properties.
+- [How It Works](how-it-works.md){:target="_blank"}: full semantics, clustering behavior, threshold tuning, and source-records behavior.
+- [Examples](examples.md){:target="_blank"}: three production scenarios with sample data, source records, and resulting anomalies.
+- [FAQ](faq.md){:target="_blank"}: short answers to the most frequent questions.