Expand Up @@ -21,6 +21,8 @@ that explains how to use that feature.
- [`log`](./log): Control what the router logs and how it formats log messages.
- [`override_subgraph_urls`](./override_subgraph_urls): Route requests to different
subgraph URLs dynamically.
- [`persisted_documents`](./persisted_documents): Extract and resolve persisted
document IDs from file or Hive storage.
- [`override_labels`](./override_labels): Dynamically activate or deactivate
progressive override labels.
- [`query_planner`](./query_planner): Add safety limits and debugging for query
@@ -0,0 +1,176 @@
---
title: "persisted_documents"
---

The `persisted_documents` configuration controls how Hive Router reads persisted document IDs and
maps them to GraphQL operation text.

This is the same concept some tools call **persisted queries** or **trusted documents**.

For usage patterns and migration guidance, see the
[Persisted Documents guide](/docs/router/security/persisted-documents).

## Options

### `enabled`

Enables persisted document extraction and resolution.
Type is `boolean` and default is `false`.

### `require_id`

If `true`, requests must contain a resolvable persisted document ID. If Router cannot extract an
ID, it returns `PERSISTED_DOCUMENT_ID_REQUIRED`. Type is `boolean` and default is `false`.

### `log_missing_id`

Logs requests that do not provide a resolvable persisted document ID. Type is `boolean` and default
is `false`.

### `selectors`

Ordered list of selectors. Router applies them top-to-bottom and uses the first successful match.
Type is `array`.

If omitted, Router uses these built-in defaults:

1. `json_path: documentId`
2. `json_path: extensions.persistedQuery.sha256Hash`

When `enabled: true`, `selectors` cannot be an explicit empty list.
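
The first-match behaviour with the two default selectors can be sketched in a few lines of Python. This is an illustration only, not the router's implementation: `lookup_dot_path` and `extract_document_id` are hypothetical helper names, and the assumption is that a selector "succeeds" when its dot-path resolves to a non-empty string.

```python
from typing import Any, Optional

# The built-in defaults listed above, both json_path lookups.
DEFAULT_JSON_PATHS = ["documentId", "extensions.persistedQuery.sha256Hash"]


def lookup_dot_path(payload: Any, path: str) -> Optional[str]:
    """Follow a dot-separated path through nested dicts; return a string ID or None."""
    current = payload
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    # Assumption: only a non-empty string counts as a successful match.
    return current if isinstance(current, str) and current else None


def extract_document_id(payload: dict, json_paths=DEFAULT_JSON_PATHS) -> Optional[str]:
    """Apply the paths top-to-bottom and return the first ID found."""
    for path in json_paths:
        found = lookup_dot_path(payload, path)
        if found is not None:
            return found
    return None


relay_style = {"documentId": "abc123"}
apollo_style = {"extensions": {"persistedQuery": {"version": 1, "sha256Hash": "deadbeef"}}}
both = {**relay_style, **apollo_style}
plain_query = {"query": "{ __typename }"}
```

With these payloads, `both` resolves through `documentId` because that selector comes first, and `plain_query` yields no ID at all.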

#### `selectors[].type: json_path`

| Field | Type | Required | Notes |
| ------ | -------- | -------- | ------------------------------------------------------------------ |
| `path` | `string` | yes | Dot-path lookup in GraphQL request payload (for example `doc_id`). |

#### `selectors[].type: url_query_param`

| Field | Type | Required | Notes |
| ------ | -------- | -------- | ------------------------------------------------------- |
| `name` | `string` | yes | Name of the query parameter that holds the document ID. |

#### `selectors[].type: url_path_param`

| Field | Type | Required | Notes |
| ---------- | -------- | -------- | --------------------------------------------------------------------------- |
| `template` | `string` | yes | Relative template with exactly one `:id` segment (for example `/docs/:id`). |

Template rules:

- must start with `/`
- must contain exactly one `:id`
- supports `*` wildcard segments
- does not support `**`
- cannot contain query strings or fragments
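
To make the template rules concrete, here is a small Python sketch that checks a template against them and matches a request path. It is not the router's code: `validate_template` and `match_template` are hypothetical names, and the sketch assumes that `*` matches exactly one path segment and that matching happens on the path relative to the GraphQL endpoint.

```python
from typing import Optional


def validate_template(template: str) -> list[str]:
    """Return rule violations for a url_path_param template (empty list means valid)."""
    problems = []
    if not template.startswith("/"):
        problems.append("must start with '/'")
    if "?" in template or "#" in template:
        problems.append("cannot contain query strings or fragments")
    if "**" in template:
        problems.append("'**' is not supported")
    segments = [s for s in template.split("/") if s]
    if segments.count(":id") != 1:
        problems.append("must contain exactly one ':id' segment")
    return problems


def match_template(template: str, relative_path: str) -> Optional[str]:
    """Return the document ID captured by ':id', or None if the path does not match."""
    template_segments = [s for s in template.split("/") if s]
    path_segments = [s for s in relative_path.split("/") if s]
    if len(template_segments) != len(path_segments):
        return None
    document_id = None
    for expected, actual in zip(template_segments, path_segments):
        if expected == ":id":
            document_id = actual
        elif expected == "*":
            continue  # assumption: a wildcard accepts any single segment
        elif expected != actual:
            return None
    return document_id
```

For example, `match_template("/docs/:id", "/docs/123456")` captures `123456`, while `validate_template("/docs/**/:id")` reports the unsupported `**`.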

## `storage`

Selects where persisted document text is loaded from.
Type is `object` and it is required when `enabled: true`.

### `storage.type: file`

| Field | Type | Default | Required | Notes |
| ------- | --------- | ------- | -------- | -------------------------------- |
| `path` | `path` | - | yes | Manifest file path. |
| `watch` | `boolean` | `true` | no | Reload manifest on file changes. |

### `storage.type: hive`

| Field | Type | Default | Required | Notes |
| ---------------------- | ------------------------- | ------- | -------- | -------------------------------------------------------- |
| `endpoint` | `string \| string[]` | - | yes | Hive CDN endpoint(s). Can also use `HIVE_CDN_ENDPOINT`. |
| `key` | `string` | - | yes | Hive CDN key. Can also use `HIVE_CDN_KEY`. |
| `accept_invalid_certs` | `boolean` | `false` | no | Accept invalid TLS certificates. |
| `connect_timeout` | `duration` | `5s` | no | Connection timeout for CDN requests. |
| `request_timeout` | `duration` | `15s` | no | Full request timeout for CDN requests. |
| `retry_policy` | `object` | - | no | Retry policy for CDN fetches. |
| `cache_size` | `integer` | `10000` | no | In-memory persisted document cache size. |
| `circuit_breaker` | `object` | - | no | Circuit breaker configuration for CDN requests. |
| `negative_cache` | `false \| true \| object` | `true` | no | Cache non-2xx misses to reduce repeated failing lookups. |

You can also configure the Hive CDN connection with the `HIVE_CDN_ENDPOINT` and `HIVE_CDN_KEY`
environment variables.

```bash
HIVE_CDN_ENDPOINT="https://cdn.graphql-hive.com/..."
HIVE_CDN_KEY="your-cdn-key"
```

#### `storage.hive.retry_policy`

| Field | Type | Default | Notes |
| ------------- | --------- | ------- | ------------------------------------------------------ |
| `max_retries` | `integer` | `3` | Maximum number of retries, using exponential backoff. |

#### `storage.hive.circuit_breaker`

| Field | Type | Default | Notes |
| ------------------ | ---------- | ------- | ------------------------------------------------ |
| `error_threshold` | `number` | `0.5` | Error ratio to open breaker. |
| `volume_threshold` | `integer` | `5` | Minimum request volume before threshold applies. |
| `reset_timeout` | `duration` | `10s` | Time before half-open probe. |
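
How the three fields interact is easier to see in code. The following Python sketch shows one common way a circuit breaker with these parameters behaves. It is an assumption about the general pattern, not a description of the router's internals: the `CircuitBreaker` class, its counters, and the rule that a single successful probe closes the breaker are all hypothetical. Timestamps are passed in explicitly so the logic is easy to follow.

```python
from typing import Optional


class CircuitBreaker:
    """Minimal breaker using the defaults from the table above."""

    def __init__(self, error_threshold: float = 0.5, volume_threshold: int = 5,
                 reset_timeout: float = 10.0):
        self.error_threshold = error_threshold
        self.volume_threshold = volume_threshold
        self.reset_timeout = reset_timeout
        self.successes = 0
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the breaker is closed

    def allow_request(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Once reset_timeout has elapsed, let a probe through (half-open).
        return now - self.opened_at >= self.reset_timeout

    def record(self, ok: bool, now: float) -> None:
        if self.opened_at is not None:
            # Result of a half-open probe: close on success, re-open on failure.
            if ok:
                self.successes = self.failures = 0
                self.opened_at = None
            else:
                self.opened_at = now
            return
        if ok:
            self.successes += 1
        else:
            self.failures += 1
        total = self.successes + self.failures
        # The error ratio only matters once enough requests have been seen.
        if total >= self.volume_threshold and self.failures / total >= self.error_threshold:
            self.opened_at = now
```

A production breaker would normally use a sliding window and limit the number of concurrent probes; this sketch leaves both out for brevity.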

#### `storage.hive.negative_cache`

Supports three forms:

- `false` disables the negative cache
- `true` enables it with defaults
- an object enables it with custom settings

| Field | Type | Default | Notes |
| ----- | ---------- | ------- | ------------------------------ |
| `ttl` | `duration` | `5s` | Negative cache entry lifetime. |
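
As an illustration of what a negative cache with a TTL does, the Python sketch below remembers failed lookups for a fixed time so repeated requests for the same missing ID can be answered without another CDN call. The `NegativeCache` class is hypothetical and only mirrors the `ttl` default of `5s`; how the router stores and evicts entries is not specified here.

```python
class NegativeCache:
    """Remember document IDs that failed to resolve, for ttl_seconds each."""

    def __init__(self, ttl_seconds: float = 5.0):
        self.ttl_seconds = ttl_seconds
        self._expires_at: dict[str, float] = {}

    def remember_miss(self, document_id: str, now: float) -> None:
        self._expires_at[document_id] = now + self.ttl_seconds

    def is_known_miss(self, document_id: str, now: float) -> bool:
        expires = self._expires_at.get(document_id)
        if expires is None:
            return False
        if now >= expires:
            # Entry expired: forget it so the next lookup goes to storage again.
            del self._expires_at[document_id]
            return False
        return True
```

A short TTL keeps the window small in which a freshly published document could still be reported as missing.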

## Endpoint compatibility note

If any selector uses `url_path_param`, `http.graphql_endpoint` cannot be `"/"`.

Use a non-root endpoint such as `/graphql`.

## Examples

### File storage with default selectors

```yaml title="router.config.yaml"
persisted_documents:
  enabled: true
  require_id: true
  storage:
    type: file
    path: ./persisted-documents.json
```

### Custom selector order

```yaml title="router.config.yaml"
persisted_documents:
  enabled: true
  require_id: true
  selectors:
    - type: url_path_param
      template: /docs/:id
    - type: url_query_param
      name: id
    - type: json_path
      path: doc_id
  storage:
    type: file
    path: ./persisted-documents.json
```

### Hive storage with custom negative cache TTL

```yaml title="router.config.yaml"
persisted_documents:
  enabled: true
  require_id: true
  storage:
    type: hive
    endpoint: ${HIVE_CDN_ENDPOINT}
    key: ${HIVE_CDN_KEY}
    negative_cache:
      ttl: 10s
```
Expand Up @@ -5,6 +5,7 @@
"csrf",
"introspection",
"jwt-authentication",
- "operation-complexity"
+ "operation-complexity",
+ "persisted-documents"
]
}
Expand Up @@ -72,7 +72,12 @@ query {

There are a few ways to mitigate this risk, which are covered in this documentation.

{/* TODO: Persisted Operations */}
Another strong protection is **Persisted Documents** (also known as persisted operations or trusted
documents). With persisted documents, clients send a document ID instead of arbitrary query text,
and the router resolves only pre-registered operations.

This reduces parsing work and helps block unknown queries before execution. Learn more in
[Persisted Documents](/docs/router/security/persisted-documents).

## Reject operations based on the size / tokens

@@ -0,0 +1,112 @@
---
title: "Persisted Documents"
---

Persisted documents let clients run pre-registered GraphQL operations by ID, without sending the
full query text on every request.

This reduces payload size, limits arbitrary query execution, and helps you move to queryless
clients.

For the complete configuration reference, see the
[`persisted_documents` configuration](/docs/router/configuration/persisted_documents).

## Terminology

Different ecosystems use different names for the same pattern, including **Persisted Queries**,
**Trusted Documents**, or **Operation allowlisting**. In Hive Router docs, we use **Persisted Documents**
as the canonical term.

## Why use persisted documents

When persisted documents are enabled, clients send a document ID. Hive Router then loads the query
text from configured storage. This reduces request payloads, blocks unregistered operations when
`require_id: true`, and gives you a clear migration path from text queries to an allowlist model.

## How document ID selection works

Router applies selectors in order and uses the first matching ID.

If you do not set `selectors`, Router first checks `documentId` (request body for POST, query
parameter for GET), then `extensions.persistedQuery.sha256Hash` (Apollo format). You can customize
this order and use `json_path` for custom JSON fields, `url_query_param` for query strings such as
`?id=...`, or `url_path_param` for URL path segments such as `/graphql/:id`.

```yaml title="router.config.yaml"
persisted_documents:
  enabled: true
  require_id: true
  selectors:
    - type: url_path_param
      template: /:id # relative to the graphql endpoint, so effectively /graphql/:id
    - type: url_query_param
      name: id
  storage:
    type: file
    path: ./persisted-documents.json
```

## Storage

Hive Router supports two persisted document storage types.

### File storage

File storage reads a local manifest and resolves IDs from memory. It supports both simple key-value
manifests (`{ "id": "query" }`) and Apollo manifest format. File **watch mode** is enabled by
default (`watch: true`), so changes are reloaded automatically, which works well with workflows
like `relay-compiler --watch`.
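
The two manifest shapes can be normalized into the same in-memory lookup table. The Python sketch below shows one way to do that. It is illustrative only: `load_manifest` is a hypothetical helper, and the detection rule (checking the `format` field that Apollo's generated manifests carry) is an assumption about how the formats could be told apart, not a statement about how the router does it.

```python
import json


def load_manifest(text: str) -> dict[str, str]:
    """Return a mapping of document ID to operation text for either manifest shape."""
    data = json.loads(text)
    if isinstance(data, dict) and data.get("format") == "apollo-persisted-query-manifest":
        # Apollo manifests list operations as objects with an id and a body.
        return {op["id"]: op["body"] for op in data["operations"]}
    # Simple key-value manifest: { "id": "query" }
    return {str(doc_id): str(body) for doc_id, body in data.items()}


key_value_manifest = '{"abc123": "query Me { me { id } }"}'

apollo_manifest = json.dumps({
    "format": "apollo-persisted-query-manifest",
    "version": 1,
    "operations": [
        {"id": "deadbeef", "name": "Me", "type": "query", "body": "query Me { me { id } }"}
    ],
})
```

Both `load_manifest(key_value_manifest)` and `load_manifest(apollo_manifest)` produce a plain dictionary that a resolver can query by ID.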

### Hive CDN storage

Hive storage resolves documents from Hive CDN.

You can configure Hive credentials in `router.config.yaml` or with the `HIVE_CDN_ENDPOINT` and
`HIVE_CDN_KEY` environment variables.

```bash
HIVE_CDN_ENDPOINT="https://cdn.graphql-hive.com/artifacts/v1/<target_id>"
HIVE_CDN_KEY="<cdn access token>"
```

Router accepts either full app-qualified IDs (`appName~appVersion~documentId`) or a plain
`documentId`, in which case the app name and version are inferred from [client identification headers](/docs/router/configuration/telemetry#client_identification).

If you use Apollo Client [`clientAwareness`](https://www.apollographql.com/docs/react/api/link/apollo-link-client-awareness#configuring-with-apollo-client) or [Apollo Kotlin client awareness](https://www.apollographql.com/docs/kotlin/advanced/client-awareness), set the name and version headers to Apollo's default header names, then send only a plain document ID:

```yaml title="router.config.yaml"
# Client identification settings
telemetry:
  client_identification:
    name_header: apollographql-client-name
    version_header: apollographql-client-version
```

ID format is validated before requests are sent to the CDN, so malformed IDs fail fast with clear errors.
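
The two accepted ID forms can be pictured with a short Python sketch. It is a simplified model under stated assumptions: `resolve_hive_id` is a hypothetical function, the client name and version stand in for the values read from the client identification headers, and the exact validation rules and error messages of the router are not reproduced.

```python
from typing import Optional


def resolve_hive_id(raw_id: str, client_name: Optional[str] = None,
                    client_version: Optional[str] = None) -> tuple[str, str, str]:
    """Return (app name, app version, document ID) for a qualified or plain ID."""
    parts = raw_id.split("~")
    if len(parts) == 3 and all(parts):
        # Already app-qualified: appName~appVersion~documentId
        return (parts[0], parts[1], parts[2])
    if len(parts) == 1 and raw_id:
        # Plain ID: app name and version must come from the client headers.
        if not client_name or not client_version:
            raise ValueError("plain document ID requires both client name and version")
        return (client_name, client_version, raw_id)
    raise ValueError(f"malformed document ID: {raw_id!r}")
```

Under this model, `resolve_hive_id("shop~1.2.0~abc")` and `resolve_hive_id("abc", "shop", "1.2.0")` identify the same document, while an ID such as `shop~abc` is rejected before any network request is made.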

## Rejecting requests without a document ID

With `require_id: false` (default), regular GraphQL requests (with `query`) are still allowed.
With `require_id: true`, incoming requests must provide a persisted document ID.
During migration, `log_missing_id: true` helps you find requests that still arrive without an ID.
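
The interaction of the two flags can be summarized as a small decision function. This Python sketch is a reading of the behaviour described above, not the router's code: `handle_request` and its return strings are hypothetical, and it assumes that `log_missing_id` logs regardless of whether `require_id` is set.

```python
from typing import Callable, Optional


def handle_request(document_id: Optional[str], require_id: bool = False,
                   log_missing_id: bool = False,
                   log: Callable[[str], None] = print) -> str:
    """Describe what happens to a request, given the extracted ID (or None)."""
    if document_id is not None:
        return "resolve persisted document"
    if log_missing_id:
        log("request arrived without a persisted document ID")
    if require_id:
        return "PERSISTED_DOCUMENT_ID_REQUIRED"
    return "execute as regular GraphQL request"
```

A typical migration would start with `require_id=False, log_missing_id=True`, watch the logs until no more ID-less requests appear, and then switch `require_id` on.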

## Path selector and GraphQL endpoint compatibility

If you use `url_path_param`, do not use the root GraphQL endpoint (`http.graphql_endpoint: "/"`).

At the root endpoint, path matching is ambiguous (for example `/health` could be interpreted as a
document path). Router rejects that configuration on startup.
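
A startup check of this kind, together with the way a relative template combines with the endpoint, might look like the following Python sketch. The function names are hypothetical, and the selectors are modelled as plain dictionaries that mirror the YAML configuration.

```python
def effective_route(graphql_endpoint: str, template: str) -> str:
    """Templates are relative to the GraphQL endpoint, e.g. /graphql + /:id."""
    return graphql_endpoint.rstrip("/") + template


def check_endpoint_compat(selectors: list[dict], graphql_endpoint: str) -> None:
    """Raise if a url_path_param selector is combined with the root endpoint."""
    uses_path_selector = any(s.get("type") == "url_path_param" for s in selectors)
    if uses_path_selector and graphql_endpoint == "/":
        raise ValueError("url_path_param selectors require a non-root GraphQL endpoint")


selectors = [
    {"type": "url_path_param", "template": "/:id"},
    {"type": "url_query_param", "name": "id"},
]
check_endpoint_compat(selectors, "/graphql")  # passes: endpoint is not the root
```

With `/graphql` as the endpoint, `effective_route("/graphql", "/:id")` gives `/graphql/:id`, so a path such as `/health` can no longer be mistaken for a document ID.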

## Practical patterns

Apollo clients generally work with default selectors by sending
`extensions.persistedQuery.sha256Hash`. Relay-style clients typically send `documentId` and use a
key-value manifest. URL-driven systems can use `url_path_param` for path-based IDs (for example
CDN-like routes) or `url_query_param` for legacy query-string formats.

## Troubleshooting

- `PERSISTED_DOCUMENT_ID_REQUIRED`: no valid ID was extracted. Check selector order and where the request sends the ID.
- `PERSISTED_DOCUMENT_NOT_FOUND`: an ID was extracted but no matching document exists in storage. Verify that the ID exists in the manifest or CDN.
- Hive client identity errors: a plain document ID was provided without both client name and version. Set client identification headers or send an app-qualified ID.
@@ -0,0 +1,75 @@
---
title: Persisted Documents in Hive Router
description:
Persisted documents in Hive Router with configurable ID selectors, file and Hive CDN storage, and
allowlist enforcement options.
date: 2026-04-17
authors: [kamil]
---

import { Callout } from "@hive/design-system/hive-components/callout";

[Hive Router](/docs/router) includes support for **Persisted Documents**. Clients send a document ID, and the router resolves the GraphQL document from configured storage. This pattern is also commonly referred to as trusted documents, persisted operations, or persisted queries.

Regardless of the name, the idea behind Persisted Documents is to reduce request payload size, limit arbitrary operation execution, and support migration toward allowlisted traffic. This approach is especially useful for GraphQL APIs used by first-party web and mobile clients, where arbitrary third-party queries are not expected.

<Callout type="info">

Hive Router supports [App Deployments](/docs/schema-registry/app-deployments) out of the box.

When you run a schema check that detects breaking changes, Hive automatically identifies which active app deployments would be affected by those changes. This helps you understand the real-world impact of schema changes before deploying them.

</Callout>

## Accept Document IDs from Multiple Sources

Router can extract document IDs from multiple request locations using ordered selectors, and uses the first successful match.

| Selector | Use case |
| ----------------- | ------------------------------------------------------------ |
| `json_path` | Read IDs from custom JSON payload fields |
| `url_query_param` | Read IDs from query-string parameters |
| `url_path_param` | Read IDs from URL path segments (for example `/graphql/:id`) |

By default, Hive Router checks:

1. `documentId`
2. `extensions.persistedQuery.sha256Hash` (Apollo format)

In practice, this means Router first looks for a top-level `documentId` in the request (for example,
`POST {"documentId":"123456"}` or `GET /graphql?documentId=123456`). If it is missing, Router
falls back to Apollo's persisted query format in `extensions.persistedQuery.sha256Hash`.
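
From the client side, the three default request shapes can be built as follows. This Python sketch only constructs the payloads and URL; it does not send anything. `ROUTER_URL` is a placeholder for wherever your router is reachable, and the helper names are hypothetical.

```python
import json
from typing import Optional
from urllib.parse import urlencode

ROUTER_URL = "http://localhost:4000/graphql"  # placeholder, adjust to your deployment


def post_body(document_id: str, variables: Optional[dict] = None) -> str:
    """JSON body for POST with a top-level documentId."""
    body: dict = {"documentId": document_id}
    if variables:
        body["variables"] = variables
    return json.dumps(body)


def get_url(document_id: str) -> str:
    """URL for GET with documentId as a query parameter."""
    return f"{ROUTER_URL}?{urlencode({'documentId': document_id})}"


def apollo_post_body(sha256_hash: str) -> str:
    """JSON body in Apollo's persisted query format."""
    return json.dumps(
        {"extensions": {"persistedQuery": {"version": 1, "sha256Hash": sha256_hash}}}
    )
```

Note that none of these bodies contain a `query` field: the operation text lives only in the configured storage.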

You can customize selector order and extraction paths in `router.config.yaml`:

```yaml title="router.config.yaml"
persisted_documents:
  enabled: true
  selectors:
    - type: url_path_param
      template: /docs/:id # GET /graphql/docs/123456
    - type: url_query_param
      name: id # GET /graphql?id=123456
    - type: json_path
      path: doc_id # POST {"doc_id":"123456"}
  storage:
    type: file
    path: ./persisted-documents.json
```

Hive Router evaluates selectors from top to bottom and uses the first
successful match.

## Resolve Documents from File or Hive CDN

Hive Router can load persisted documents from two sources:

- **File storage** - loads manifest entries into memory and supports both key-value (Relay) and Apollo Persisted Queries manifest formats
- **Hive CDN storage** - resolves IDs from Hive CDN with configurable retries, timeouts, circuit breaker, and negative cache

File storage enables watch mode (`watch: true`) by default, so manifest updates can be reloaded automatically in local workflows. This is especially useful when working with Relay Compiler in watch mode.

---

- [Persisted Documents guide](/docs/router/security/persisted-documents)
- [`persisted_documents` configuration reference](/docs/router/configuration/persisted_documents)