| The otelcol charms deploy `node_exporter` as a singleton snap in a given machine | ||
| However, multiple principal charms may be co-located on the same machine. | ||
| This document shows how to correlate between node-exporter metrics and co-located charms. |
| The otelcol charms deploy `node_exporter` as a singleton snap in a given machine | |
| However, multiple principal charms may be co-located on the same machine. | |
| This document shows how to correlate between node-exporter metrics and co-located charms. | |
| The OpenTelemetry Collector (`otelcol`) charms deploy `node-exporter` as a singleton snap in a given machine. However, multiple principal charms may be co-located on the same machine. | |
| <Insert 1: What does this info mean / why is the default behavior confusing?> | |
| <Insert 2: Why does the user need to know this?> | |
| This document describes how to <Insert 3: Do what?> |
Explanation of my suggestions
Basic
- I'd recommend first fully naming OpenTelemetry Collector. It's also acceptable to call it `otelcol` (if you think that's obvious to your user base), but it should still be code-formatted in all instances ("The `otelcol` charms deploy..."). I guess you could also do Otelcol (capital O, no code formatting), but that looks more odd IMO.
- It's a hyphen (based on your repo), and this should be formatted consistently in all instances: `node-exporter`
Format
The original format doesn't make clear where the doc is going and why: why does the user want to "correlate between node-exporter metrics and co-located charms"? (i.e., why would a user find this doc and do this?)
This is really important though, not only to frame the rest of the guide well, but it also helps confirm to the user they're in the right location at all (e.g., even if the rest of the doc sucks, you would at least know you were in the right place / know if the doc did or didn't resolve your issue)
I added some placeholders, but the format I was going for is:
{current system behavior}
{why that behavior is confusing}
{why resolving this matters for users}
{what this document provides}
So a re-written version would be like (but change the wording as necessary or if I've misunderstood something!)
"
The OpenTelemetry Collector (otelcol) charms deploy node-exporter as a singleton snap in a given machine. Additionally, multiple principal charms may be co-located on the same machine.
When node-exporter metrics are forwarded by otelcol, they include labels that identify the machine where the metrics were collected. Since these labels are shared by all charms running on that machine, the metrics don't directly indicate which charm produced the specific metric.
To understand which charm is responsible for a specific metric, you need to correlate node-exporter metrics with the charms running on the same machine.
This document describes how to perform that correlation.
"
| This document shows how to correlate between node-exporter metrics and co-located charms. | ||
|
|
||
| ## Manually, via label inspection | ||
| A node-exporter metric such as `node_cpu_seconds_total`, is forwarded by otelcol with labels `juju_model`, `juju_model_uuid` and `instance`, all of which are common to otelcol itself and any co-located charms. The `juju_charm` and `juju_application` labels for node-exporter metrics would have otelcol information. |
| A node-exporter metric such as `node_cpu_seconds_total`, is forwarded by otelcol with labels `juju_model`, `juju_model_uuid` and `instance`, all of which are common to otelcol itself and any co-located charms. The `juju_charm` and `juju_application` labels for node-exporter metrics would have otelcol information. | |
| A `node-exporter` metric, such as `node_cpu_seconds_total`, is forwarded by `otelcol` with labels including: `juju_model`, `juju_model_uuid` and `instance`. These labels are common to `otelcol` itself and any co-located charms. | |
| The `juju_model` and `juju_model_uuid` labels identify the Juju model where the metric was collected. The `instance` label identifies the specific machine within that model where the metric was collected. |
Explanation
I found it confusing that 3 labels were introduced, but 2 of them basically disappeared and were "replaced" by two other ones. It's also easy to gloss over and not notice that `juju_charm` and `juju_application` were introduced, and that they aren't the same `juju_*` ones just mentioned with `instance`.
So note that I removed the reference to `juju_charm` and `juju_application`. Do these need to be here? IMO it seems clear from the example that has otelcol info.
| ## Manually, via label inspection | ||
| A node-exporter metric such as `node_cpu_seconds_total`, is forwarded by otelcol with labels `juju_model`, `juju_model_uuid` and `instance`, all of which are common to otelcol itself and any co-located charms. The `juju_charm` and `juju_application` labels for node-exporter metrics would have otelcol information. | ||
|
|
||
| Note the `instance` label. For example, in the following node-exporter metric, the instance is `juju-b2b564-0.lxd`: |
| Note the `instance` label. For example, in the following node-exporter metric, the instance is `juju-b2b564-0.lxd`: | |
| For example, in the following `node-exporter` metric: |
(This is tied to my next suggestion, but) I've made it more lightweight. IMO giving the specific ID is easier to understand if presented after the example instead, because there's no need to hold it in your head before seeing an example to attach it to.
| } | ||
| ``` | ||
|
|
||
| Now you can query for the application metrics you are interested in, filtering results with the label matcher `instance="juju-b2b564-0.lxd"`. |
| Now you can query for the application metrics you are interested in, filtering results with the label matcher `instance="juju-b2b564-0.lxd"`. | |
| The instance is `juju-b2b564-0.lxd`. Now, you can query for the application metrics you're interested in, filtering results with the label matcher `instance="juju-b2b564-0.lxd"`. |
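To make the correlation step concrete, here is a minimal sketch (not from the PR) of what filtering on the shared `instance` label amounts to. It assumes each series is represented as a plain dict of labels; the `app_requests_total` metric and the `app-a` / `app-b` application names are hypothetical and only there for illustration.

```python
from collections import defaultdict

# Hypothetical label sets. Only the instance value and the node-exporter
# metric name come from the doc; the other metrics and apps are made up.
series = [
    {"__name__": "node_cpu_seconds_total", "instance": "juju-b2b564-0.lxd", "juju_application": "otelcol"},
    {"__name__": "app_requests_total", "instance": "juju-b2b564-0.lxd", "juju_application": "app-a"},
    {"__name__": "app_requests_total", "instance": "juju-b2b564-1.lxd", "juju_application": "app-b"},
]


def applications_by_instance(series):
    """Group juju_application label values by the shared instance label."""
    grouped = defaultdict(set)
    for labels in series:
        grouped[labels["instance"]].add(labels["juju_application"])
    # Sort for a stable, readable result.
    return {instance: sorted(apps) for instance, apps in grouped.items()}


colocated = applications_by_instance(series)
print(colocated["juju-b2b564-0.lxd"])
```

Under these assumptions, the entry for `juju-b2b564-0.lxd` lists `otelcol` alongside the co-located application, which is the relationship the `instance` label matcher exposes.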
| ) | ||
| ``` | ||
|
|
||
| Let's break this down: |
| Let's break this down: | |
| Here's what's happening: |
Personal preference here, but I suggested the change just because "break down" is more complicated (it's called a "phrasal verb", which is more than one word combining into one verb meaning).
This is a doc side-car for: