Commit f7f7840
fix(otel): make service.instance.id unique per process (#4891)
All app replicas shared a hardcoded service.instance.id ("mothership-sim"),
so OTel metrics from every process collapsed into one Prometheus series.
Their independent cumulative counters then interleaved, producing phantom
counter resets that corrupt rate()/increase(). In staging, hosted-key cost
was inflated to ~$0.72 from a few cents, while no-`key` metrics (cost_charged,
throttled, queue_wait_*) were affected fleet-wide.
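To illustrate the failure mode described above, here is a minimal sketch of how two independent cumulative counters, once merged into one series, produce phantom resets. The `increase` helper is a simplified stand-in for Prometheus's reset handling (no extrapolation or time windows), and the sample values are made up for illustration, not taken from staging.

```python
def increase(samples):
    """Simplified Prometheus-style increase over a list of counter samples.

    A drop between consecutive samples is treated as a counter reset, so the
    new value is counted in full (as if the counter restarted from zero).
    """
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical cumulative counters from two replicas (illustrative values).
replica_a = [10.0, 11.0, 12.0]
replica_b = [3.0, 4.0, 5.0]

# With a shared service.instance.id, both replicas write to the same series,
# so their samples interleave: 10, 3, 11, 4, 12, 5.
collapsed = [value for pair in zip(replica_a, replica_b) for value in pair]

collapsed_increase = increase(collapsed)
per_replica_increase = increase(replica_a) + increase(replica_b)

print(collapsed_increase)    # 28.0
print(per_replica_increase)  # 4.0
```

Each replica really grew by 2, but every apparent drop in the merged series is read as a reset, so the collapsed series reports 28 instead of 4.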
Append the hostname (the container id under ECS, unique per task) so each
replica gets its own series and sum(rate(...)) / sum(increase(...)) aggregate
correctly. The mothership-sim prefix is kept so Jaeger's clock-skew adjuster
still separates Sim from Go.
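As a rough sketch of the approach described above (not the actual change, whose diff is not shown here), the instance id can be built from the fixed prefix plus the process's hostname. The function name, the `-` separator, and the use of Python are assumptions for illustration; only the "mothership-sim" prefix and the hostname suffix come from the commit message.

```python
import socket

# Prefix named in the commit message; kept so traces remain attributable to Sim.
SERVICE_INSTANCE_PREFIX = "mothership-sim"


def build_service_instance_id(hostname=None):
    """Return a per-process instance id of the form '<prefix>-<hostname>'.

    The '-' separator is an assumption; the commit only says the hostname
    is appended. Under ECS the hostname is the container id.
    """
    host = hostname if hostname is not None else socket.gethostname()
    return f"{SERVICE_INSTANCE_PREFIX}-{host}"


# Two replicas with different hostnames now get distinct series labels.
print(build_service_instance_id("task-a1b2c3"))  # mothership-sim-task-a1b2c3
print(build_service_instance_id("task-d4e5f6"))  # mothership-sim-task-d4e5f6
```

Because each replica's id is distinct, its cumulative counters stay in their own series, and sum(rate(...)) adds the per-replica rates rather than reading interleaved samples as resets.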
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent ce7ddd1 commit f7f7840
1 file changed
Lines changed: 7 additions & 4 deletions
Diff (line contents not captured):
Hunk 1 (old lines 2-7, new lines 2-8): 1 line added at new line 5.
Hunk 2 (old lines 259-268, new lines 260-271): old lines 262-265 removed, new lines 263-268 added.