From e57a291eff9e7ee04c8789d3fb98274854412987 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Tue, 15 Jul 2025 12:26:21 +0100 Subject: [PATCH 01/13] Add Important Labels subsection, with job and instance called out Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index 4999feb5d..1ad1625da 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -80,7 +80,18 @@ the underlying metric type and unit you work with. * **Metric collisions**: With growing adoption and metric changes over time, there are cases where lack of unit and type information in the metric name will cause certain series to collide (e.g. `process_cpu` for seconds and milliseconds). -## Labels +### Important Labels + +* `job` + * The `job` label is a primary key to differentiate metrics from eaach other. + * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation + +WARNING: When using `without`, be careful not to strip out the `job` label accidentally. 
+ +* `instance` + * The `instance` label will include the `ip:port` what was scraped, providing a crucial breadcrumb for debugging scrape time issues + +### General Labelling Advice Use labels to differentiate the characteristics of the thing that is being measured: From a40367a3df6e898a1df7bad47994d3babd6a4072 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Tue, 15 Jul 2025 13:34:11 +0100 Subject: [PATCH 02/13] Add a new section about labels Signed-off-by: Conall O'Brien --- docs/practices/rules.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/docs/practices/rules.md b/docs/practices/rules.md index f91fef5eb..eebe880ed 100644 --- a/docs/practices/rules.md +++ b/docs/practices/rules.md @@ -19,6 +19,8 @@ This page documents proper naming conventions and aggregation for recording rule Keeping the metric name unchanged makes it easy to know what a metric is and easy to find in the codebase. +IMPORTANT: `job` label acts as a primary key. It is **strongly** recommended that you use it to scope your PromQL expressions to the system you are monitoring. + To keep the operations clean, `_sum` is omitted if there are other operations, as `sum()`. Associative operations can be merged (for example `min_min` is the same as `min`). @@ -27,6 +29,18 @@ If there is no obvious operation to use, use `sum`. When taking a ratio by doing division, separate the metrics using `_per_` and call the operation `ratio`. +## Labels + +NOTE: Omitting a label in a PromQL expression is the functional equivalent of specifying `label=~".*"` + +* In both recording rules and alerting expressions, always specify a `job` label to prevent expression mismatches from occurring. + This is especially important in multi-tenant systems where the same metric names may be exported by different jobs or the + same job (e.g. `node_exporter`) in multiple, distinct deployments. + +* Always specify a `without` clause with the labels you are aggregating away. 
+This is to preserve all the other labels such as `job`, which will avoid +conflicts and give you more useful metrics and alerts. + ## Aggregation * When aggregating up ratios, aggregate up the numerator and denominator @@ -40,10 +54,6 @@ Instead keep the metric name without the `_count` or `_sum` suffix and replace the `rate` in the operation with `mean`. This represents the average observation size over that time period. -* Always specify a `without` clause with the labels you are aggregating away. -This is to preserve all the other labels such as `job`, which will avoid -conflicts and give you more useful metrics and alerts. - ## Examples _Note the indentation style with outdented operators on their own line between From ccf009af0898d2c2f3d16864c1a6a43cbd081a38 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Thu, 14 Aug 2025 12:08:39 +0100 Subject: [PATCH 03/13] Fix typo Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index 1ad1625da..d17b13b0f 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -83,7 +83,7 @@ of unit and type information in the metric name will cause certain series to col ### Important Labels * `job` - * The `job` label is a primary key to differentiate metrics from eaach other. + * The `job` label is a primary key to differentiate metrics from each other. * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation WARNING: When using `without`, be careful not to strip out the `job` label accidentally. 
From 0e87e09e4b65735a059e8bcd261a9bb8dbecce1a Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Thu, 14 Aug 2025 14:07:39 +0100 Subject: [PATCH 04/13] Update docs/practices/naming.md Co-authored-by: Ben Kochie Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index d17b13b0f..485f65bb4 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -80,7 +80,7 @@ the underlying metric type and unit you work with. * **Metric collisions**: With growing adoption and metric changes over time, there are cases where lack of unit and type information in the metric name will cause certain series to collide (e.g. `process_cpu` for seconds and milliseconds). -### Important Labels +### Labels * `job` * The `job` label is a primary key to differentiate metrics from each other. From a79aa9b2fb2029c171815b4128f3d763f0117e8c Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Fri, 15 Aug 2025 11:54:05 +0100 Subject: [PATCH 05/13] Update docs/practices/naming.md Co-authored-by: Ben Kochie Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index 485f65bb4..a337c640f 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -80,7 +80,7 @@ the underlying metric type and unit you work with. * **Metric collisions**: With growing adoption and metric changes over time, there are cases where lack of unit and type information in the metric name will cause certain series to collide (e.g. `process_cpu` for seconds and milliseconds). -### Labels +## Labels * `job` * The `job` label is a primary key to differentiate metrics from each other. 
From a0ebfa9beff85909523b35437ac527f787115edc Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Sun, 17 Aug 2025 20:42:46 +0100 Subject: [PATCH 06/13] Iterate on the job label description Iterate on the description of the job label, removing "primary key", given its association with SQL Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index a337c640f..9fe727172 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -83,7 +83,7 @@ of unit and type information in the metric name will cause certain series to col ## Labels * `job` - * The `job` label is a primary key to differentiate metrics from each other. + * The `job` label is one of the few ubiquitious labels, set at scrape time, and is used to identify metrics scraped from the same target/exporter. * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation WARNING: When using `without`, be careful not to strip out the `job` label accidentally. From 769452372a75a82d8d5c6fa82e450ba3fdc257f5 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Tue, 7 Oct 2025 16:59:07 +0100 Subject: [PATCH 07/13] Update docs/practices/naming.md Co-authored-by: Ben Kochie Signed-off-by: Conall O'Brien --- docs/practices/naming.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/practices/naming.md b/docs/practices/naming.md index 9fe727172..49a37f4c5 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -82,6 +82,10 @@ of unit and type information in the metric name will cause certain series to col ## Labels +Prometheus labels can come from both the target and from [relabeling in discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) as well as from the target itself. 
+ +By default Prometheus configures two primary discovery target labels. + * `job` * The `job` label is one of the few ubiquitious labels, set at scrape time, and is used to identify metrics scraped from the same target/exporter. * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation From 1cb09cdde98ec253bd91fe9a6df923291cd40339 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Sat, 25 Apr 2026 00:23:31 +0100 Subject: [PATCH 08/13] Split labels from naming.md into a new labels.md Signed-off-by: Conall O'Brien --- docs/practices/labels.md | 43 ++++++++++++++++++++++++++++++++++++++++ docs/practices/naming.md | 35 ++------------------------------ 2 files changed, 45 insertions(+), 33 deletions(-) create mode 100644 docs/practices/labels.md diff --git a/docs/practices/labels.md b/docs/practices/labels.md new file mode 100644 index 000000000..9ff5800c0 --- /dev/null +++ b/docs/practices/labels.md @@ -0,0 +1,43 @@ +--- +title: Label Best Practices +sort_rank: 1 +--- + +The label conventions presented in this document are not required +for using Prometheus, but can serve as both a style-guide and a collection of +best practices. Individual organizations may want to approach some of these +practices, e.g. naming conventions, differently. + +## Labels + +Prometheus labels can come from both the target and from +[relabeling in discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) as well as from the target itself. + +By default Prometheus configures two primary discovery target labels. + +* `job` + * The `job` label is one of the few ubiquitious labels, set at scrape time, and is + used to identify metrics scraped from the same target/exporter. + * If not specified in PromQL expressions, they will match unrelated metrics with the same name. 
This is especially true in a multi system or multi tenant installation + +WARNING: When using `without`, be careful not to strip out the `job` label accidentally. + +* `instance` + * The `instance` label will include the `ip:port` what was scraped, providing a + crucial breadcrumb for debugging scrape time issues + +### General Labelling Advice + +Use labels to differentiate the characteristics of the thing that is being measured: + + * `api_http_requests_total` - differentiate request types: `operation="create|update|delete"` + * `api_request_duration_seconds` - differentiate request stages: `stage="extract|transform|load"` + +Do not put the label names in the metric name, as this introduces redundancy +and will cause confusion if the respective labels are aggregated away. + +CAUTION: Remember that every unique combination of key-value label +pairs represents a new time series, which can dramatically increase the amount +of data stored. Do not use labels to store dimensions with high cardinality +(many different label values), such as user IDs, email addresses, or other +unbounded sets of values. diff --git a/docs/practices/naming.md b/docs/practices/naming.md index c4a71b6ce..a1dc0cc96 100644 --- a/docs/practices/naming.md +++ b/docs/practices/naming.md @@ -1,9 +1,9 @@ --- -title: Metric and label naming +title: Metric naming sort_rank: 1 --- -The metric and label conventions presented in this document are not required +The metric conventions presented in this document are not required for using Prometheus, but can serve as both a style-guide and a collection of best practices. Individual organizations may want to approach some of these practices, e.g. naming conventions, differently. @@ -80,37 +80,6 @@ the underlying metric type and unit you work with. * **Metric collisions**: With growing adoption and metric changes over time, there are cases where lack of unit and type information in the metric name will cause certain series to collide (e.g. 
`process_cpu` for seconds and milliseconds). -## Labels - -Prometheus labels can come from both the target and from [relabeling in discovery](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config) as well as from the target itself. - -By default Prometheus configures two primary discovery target labels. - -* `job` - * The `job` label is one of the few ubiquitious labels, set at scrape time, and is used to identify metrics scraped from the same target/exporter. - * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation - -WARNING: When using `without`, be careful not to strip out the `job` label accidentally. - -* `instance` - * The `instance` label will include the `ip:port` what was scraped, providing a crucial breadcrumb for debugging scrape time issues - -### General Labelling Advice - -Use labels to differentiate the characteristics of the thing that is being measured: - - * `api_http_requests_total` - differentiate request types: `operation="create|update|delete"` - * `api_request_duration_seconds` - differentiate request stages: `stage="extract|transform|load"` - -Do not put the label names in the metric name, as this introduces redundancy -and will cause confusion if the respective labels are aggregated away. - -CAUTION: Remember that every unique combination of key-value label -pairs represents a new time series, which can dramatically increase the amount -of data stored. Do not use labels to store dimensions with high cardinality -(many different label values), such as user IDs, email addresses, or other -unbounded sets of values. - ## Base Units Prometheus does not have any units hard coded. 
For better compatibility, base From 62bf3d7fa76bd492fe38c06570381059e646956c Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Sat, 25 Apr 2026 00:47:10 +0100 Subject: [PATCH 09/13] Manually apply suggested edits from @SuperQ Signed-off-by: Conall O'Brien --- docs/practices/labels.md | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/docs/practices/labels.md b/docs/practices/labels.md index 9ff5800c0..3b1ed4cd7 100644 --- a/docs/practices/labels.md +++ b/docs/practices/labels.md @@ -15,23 +15,22 @@ Prometheus labels can come from both the target and from By default Prometheus configures two primary discovery target labels. -* `job` - * The `job` label is one of the few ubiquitious labels, set at scrape time, and is - used to identify metrics scraped from the same target/exporter. - * If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation +- `job` + - The `job` label is one of the few ubiquitious labels, set at scrape time, and is + used to identify metrics scraped from the same target/exporter. + - If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation WARNING: When using `without`, be careful not to strip out the `job` label accidentally. 
-* `instance` - * The `instance` label will include the `ip:port` what was scraped, providing a - crucial breadcrumb for debugging scrape time issues +- `instance` + - The `instance` label will include the `ip:port` what was scraped ### General Labelling Advice Use labels to differentiate the characteristics of the thing that is being measured: - * `api_http_requests_total` - differentiate request types: `operation="create|update|delete"` - * `api_request_duration_seconds` - differentiate request stages: `stage="extract|transform|load"` +- `api_http_requests_total` - differentiate request types: `operation="create|update|delete"` +- `api_request_duration_seconds` - differentiate request stages: `stage="extract|transform|load"` Do not put the label names in the metric name, as this introduces redundancy and will cause confusion if the respective labels are aggregated away. From 675d41f070d8810c015abbd4e466a9c97933c748 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Sat, 25 Apr 2026 11:41:22 +0100 Subject: [PATCH 10/13] Set sort_rank to 2 Signed-off-by: Conall O'Brien --- docs/practices/labels.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/labels.md b/docs/practices/labels.md index 3b1ed4cd7..6916b610d 100644 --- a/docs/practices/labels.md +++ b/docs/practices/labels.md @@ -1,6 +1,6 @@ --- title: Label Best Practices -sort_rank: 1 +sort_rank: 2 --- The label conventions presented in this document are not required From 092cc8926579e6dacab2d0865df3b20501be0faa Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Sat, 25 Apr 2026 11:42:15 +0100 Subject: [PATCH 11/13] Rename title to just Labels Signed-off-by: Conall O'Brien --- docs/practices/labels.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/practices/labels.md b/docs/practices/labels.md index 6916b610d..62950b65b 100644 --- a/docs/practices/labels.md +++ b/docs/practices/labels.md @@ -1,5 +1,5 @@ --- -title: Label Best Practices +title: Labels 
sort_rank: 2 --- From 0a7611fe9b8d22f4979edce318d761140066396c Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Mon, 27 Apr 2026 17:25:37 +0100 Subject: [PATCH 12/13] Wordsmith the IMPORTANT block to describe job as a scoping label, not a primary key Signed-off-by: Conall O'Brien --- docs/practices/rules.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/practices/rules.md b/docs/practices/rules.md index 037446910..31affe1ce 100644 --- a/docs/practices/rules.md +++ b/docs/practices/rules.md @@ -19,7 +19,8 @@ This page documents proper naming conventions and aggregation for recording rule Keeping the metric name unchanged makes it easy to know what a metric is and easy to find in the codebase. -IMPORTANT: `job` label acts as a primary key. It is **strongly** recommended that you use it to scope your PromQL expressions to the system you are monitoring. +IMPORTANT: The `job` label is used to scope a PromQL expression to a specific service/exporter. It is **strongly** recommended that you +always set it, in order to scope your PromQL expressions to the system you are monitoring. To keep the operations clean, `_sum` is omitted if there are other operations, as `sum()`. Associative operations can be merged (for example `min_min` is the From 4da0a333145db583ba225b2e4172be9990471151 Mon Sep 17 00:00:00 2001 From: Conall O'Brien Date: Mon, 27 Apr 2026 17:27:44 +0100 Subject: [PATCH 13/13] Integrate PR review suggestion Signed-off-by: Conall O'Brien --- docs/practices/labels.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/docs/practices/labels.md b/docs/practices/labels.md index 62950b65b..5bf334bba 100644 --- a/docs/practices/labels.md +++ b/docs/practices/labels.md @@ -16,8 +16,7 @@ Prometheus labels can come from both the target and from By default Prometheus configures two primary discovery target labels.
- `job` - - The `job` label is one of the few ubiquitious labels, set at scrape time, and is - used to identify metrics scraped from the same target/exporter. + - The `job` label is a default target label set by the scrape configs and is used to identify metrics scraped from the same target/exporter. - If not specified in PromQL expressions, they will match unrelated metrics with the same name. This is especially true in a multi system or multi tenant installation WARNING: When using `without`, be careful not to strip out the `job` label accidentally.
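
Illustrative note on the `job` and `without` guidance in this series: the sketch below is a rough Python model of the behaviour described, not Prometheus code. The metric name, label values, and the helpers `select` and `sum_without` are all hypothetical; they only mimic equality matchers and `sum without (...)` closely enough to show why an unscoped selector picks up unrelated jobs and why aggregating away `job` merges them.

```python
from collections import defaultdict

# Hypothetical sample data: the same metric name exported by two unrelated jobs.
SERIES = [
    ({"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.1:8080"}, 5.0),
    ({"__name__": "http_requests_total", "job": "api", "instance": "10.0.0.2:8080"}, 7.0),
    ({"__name__": "http_requests_total", "job": "billing", "instance": "10.0.1.1:9090"}, 100.0),
]


def select(series, **matchers):
    """Mimic an equality-only selector: a label that is not listed matches any value."""
    return [
        (labels, value)
        for labels, value in series
        if all(labels.get(name) == want for name, want in matchers.items())
    ]


def sum_without(series, *drop):
    """Mimic `sum without (...)`: drop the listed labels (and the metric name), then sum per remaining label set."""
    totals = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted(
            (name, val) for name, val in labels.items()
            if name not in drop and name != "__name__"
        ))
        totals[key] += value
    return dict(totals)


# No job matcher: series from both jobs are selected.
unscoped = select(SERIES, __name__="http_requests_total")
# Scoped to one job: only the two "api" series remain.
scoped = select(SERIES, __name__="http_requests_total", job="api")

print(len(unscoped), len(scoped))
print(sum_without(scoped, "instance"))           # job label preserved
print(sum_without(unscoped, "instance", "job"))  # job stripped: unrelated jobs merged
```

In this model, dropping only `instance` keeps one result per `job`, while also dropping `job` collapses the `api` and `billing` series into a single total, which is the accident the WARNING above is about.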