Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
89 changes: 89 additions & 0 deletions .tasks/backlog/2026-02-17-coverage-100-percent.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
---
title: Achieve 100% test coverage on non-generated packages
status: backlog
created: 2026-02-17
updated: 2026-02-17
---

## Objective

Two non-generated packages are below 100% statement coverage. Add tests for
the uncovered branches to reach full coverage.

## Packages and Gaps

### 1. `internal/api/system` — 98.7% (target: 100%)

**Function:** `GetSystemHostname` in `system_hostname_get.go` — 94.1%

**Uncovered branch (lines 62-65):**

```go
displayHostname := result
if displayHostname == "" {
displayHostname = workerHostname
}
```

When `QuerySystemHostname` returns an empty string for the display hostname,
the handler falls back to `workerHostname`. No test currently exercises this
path.

**Test to add** in `system_hostname_get_public_test.go`:

- Mock `QuerySystemHostname` to return `("", "worker1", nil)` — empty result
string with a valid worker hostname.
- Assert response contains `{"results":[{"hostname":"worker1"}]}`.

### 2. `internal/job/client` — 99.6% (target: 100%)

**Function:** `publishAndCollect` in `client.go` — 95.7%

**Uncovered branch A (lines 261-266):** Unmarshal error in broadcast response.

```go
if err := json.Unmarshal(entry.Value(), &response); err != nil {
c.logger.Warn("failed to unmarshal broadcast response", ...)
continue
}
```

When a KV watcher entry contains invalid JSON, the response is skipped with a
warning log. No test exercises this.

**Test to add:** Write invalid JSON to the response KV key for a broadcast
job, then write a valid response. Assert only the valid response is collected
and the invalid one is silently skipped.

**Uncovered branch B (lines 270-272):** Empty hostname fallback.

```go
hostname := response.Hostname
if hostname == "" {
hostname = "unknown"
}
```

When a worker response has an empty `Hostname` field, it is keyed as
`"unknown"` in the results map. No test exercises this.

**Test to add:** Write a valid response to KV with an empty `Hostname` field.
Assert the collected map has a key `"unknown"`.

## Verification

```bash
go test -coverprofile=/tmp/cov.out ./internal/api/system/
go tool cover -func=/tmp/cov.out | grep -v 100.0%
# Should show only total line

go test -coverprofile=/tmp/cov.out ./internal/job/client/
go tool cover -func=/tmp/cov.out | grep -v 100.0%
# Should show only total line
```

## Notes

- Generated packages (`gen/`, `mocks/`) and `cmd/` are excluded from this
goal — only hand-written business logic packages are targeted.
- The `internal/api/network` package is already at 100%.
30 changes: 30 additions & 0 deletions .tasks/backlog/2026-02-18-cli-docs-audit.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
---
title: Audit CLI commands and docs for consistency
status: backlog
created: 2026-02-18
updated: 2026-02-18
---

## Objective

Audit all CLI commands and ensure documentation is up to date and consistent
across every command. After the label-based routing feature, several docs were
updated but there may be inconsistencies remaining.

## Scope

- Review every `cmd/client_*.go` file for flag descriptions, defaults, and help
text
- Cross-reference each CLI command against its corresponding doc in
`docs/docs/sidebar/usage/cli/client/`
- Verify flag tables, example output, and targeting examples are consistent
- Ensure `--target` flag description and examples use the new hierarchical label
syntax (`group:web.dev`) everywhere
- Check that OpenAPI spec parameter descriptions match CLI help text
- Verify the architecture docs accurately describe all supported target types

## Notes

- The label-based routing feature changed target syntax and added hierarchical
labels — make sure all references use the new format
- Check for any stale references to old flat label syntax (e.g., `role:web`)
231 changes: 231 additions & 0 deletions .tasks/done/2026-02-17-label-based-worker-routing.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,231 @@
---
title: Label-based worker routing with NATS subject wildcards
status: backlog
created: 2026-02-17
updated: 2026-02-17
---

## Objective

Extend worker targeting beyond `_any`, `_all`, and exact hostname. Admins need
to target groups of servers (e.g., all web servers, all prod machines, all
servers in rack-3). The solution should leverage NATS native subject wildcards
for zero-overhead routing.

## Problem

Today the `--target` flag accepts only:

- `_any` — load-balanced to one random worker
- `_all` — broadcast to every worker
- `server1` — direct to a specific hostname

There is no way to say "all web servers" or "every prod machine in us-east-1."

## Proposed Architecture: Labels as Subject Segments

### Core idea

Workers publish their identity through the subjects they subscribe to. Instead
of a flat `jobs.{type}.{hostname}`, add label segments that NATS wildcards can
match against.

### Subject format

```
jobs.{type}.host.{hostname} — direct to specific host
jobs.{type}.label.{key}.{value} — all hosts with that label
jobs.{type}._any — any worker (load-balanced)
jobs.{type}._all — broadcast all workers
```

Examples:

```
jobs.query.host.web-prod-01
jobs.query.label.role.web
jobs.query.label.env.prod
jobs.query.label.rack.us-east-1a
jobs.modify.label.role.db
```

### Worker config

```yaml
job:
worker:
hostname: web-prod-01 # optional, auto-detected if empty
labels: # NEW
role: web
env: prod
rack: us-east-1a
```

### Worker subscription behavior

On startup, a worker with the above config subscribes to:

```
# Existing patterns
jobs.*.host.web-prod-01 — direct messages
jobs.*._any — load-balanced (queue group)
jobs.*._all — broadcasts

# NEW: one subscription per label
jobs.*.label.role.web — all "role=web" jobs
jobs.*.label.env.prod — all "env=prod" jobs
jobs.*.label.rack.us-east-1a — all "rack=us-east-1a" jobs
```

Label subscriptions use **no queue group** so every matching worker gets the
message (broadcast semantics within the label group). If the admin wants
load-balanced label routing (send to one web server, not all), they could use
a queue group per label — but that's a future enhancement.

### Client targeting

The `--target` flag (and `target_hostname` query param) expands:

| Target value | Resolves to subject | Semantics |
| ----------------------- | --------------------------------- | -------------------|
| `_any` | `jobs.{type}._any` | Load-balanced |
| `_all` | `jobs.{type}._all` | Broadcast all |
| `web-prod-01` | `jobs.{type}.host.web-prod-01` | Direct to host |
| `role:web` | `jobs.{type}.label.role.web` | Broadcast to label |
| `env:prod` | `jobs.{type}.label.env.prod` | Broadcast to label |

The `key:value` syntax is unambiguous — hostnames cannot contain `:`.

### Why labels over hierarchical hostnames

Hierarchical hostnames (`prod.web.server1`) force a single taxonomy. A server
can only be in one hierarchy. Labels are multi-dimensional — a server can be
`role:web` AND `env:prod` AND `rack:us-east-1a` simultaneously. The admin
can target any dimension without restructuring naming conventions.

### Why not a registration/discovery service

NATS subject routing IS the discovery mechanism. Workers self-register by
subscribing to their label subjects. No external registry, no heartbeats, no
consistency problem. If a worker is running, its subscriptions are active. If
it dies, NATS removes the subscriptions. This is the simplest architecture
that could possibly work.

## Implementation Plan

### 1. Config changes

**File:** `internal/config/types.go`

```go
type JobWorker struct {
// ... existing fields ...
Labels map[string]string `mapstructure:"labels"` // NEW
}
```

### 2. Subject routing

**File:** `internal/job/subjects.go`

Add:

```go
func BuildLabelSubject(key, value string) string {
return fmt.Sprintf("jobs.*.label.%s.%s", key, value)
}

func BuildHostSubject(hostname string) string {
return fmt.Sprintf("jobs.*.host.%s", hostname)
}

func ParseTarget(target string) (subjectType, key, value string) {
// "_any" → ("_any", "", "")
// "_all" → ("_all", "", "")
// "role:web" → ("label", "role", "web")
// "server1" → ("host", "server1", "")
}
```

**Validation:** Label keys and values must be `[a-zA-Z0-9_-]+` (NATS subject
token safe). Reject dots, spaces, wildcards.

### 3. Worker subscriptions

**File:** `internal/job/worker/consumer.go`

Extend consumer creation to loop over `w.appConfig.Job.Worker.Labels` and
create a consumer + goroutine for each `label.{key}.{value}` pattern.

### 4. Stream subjects

**File:** Config / stream setup

Update JetStream stream subjects to include the new patterns:

```
jobs.query.>
jobs.modify.>
```

The `>` wildcard already covers any depth, so this should already work if the
stream is configured with `jobs.>` or similar. Verify the current
`StreamSubjects` config value.

### 5. Client-side targeting

**File:** `internal/job/client/query.go`, `modify.go`

Update `BuildQuerySubject` / `BuildModifySubject` calls to parse the target
and build the correct subject. The `publishAndCollect` method already handles
multi-response collection, so label targeting works like `_all`.

**File:** `cmd/` CLI files

Update `--target` flag help text and validation.

### 6. API changes

**File:** OpenAPI specs

Update `target_hostname` parameter description and validation to accept
`key:value` label syntax. Consider renaming to `target` in a future version.

### 7. Worker discovery

**File:** `internal/job/client/query.go` — `ListWorkers`

Extend worker discovery to optionally filter by label. When listing workers,
each worker's response should include its labels so the admin can see the
topology.

## Migration

- **Backwards compatible:** Existing configs with no `labels` key work
unchanged. The bare hostname targeting becomes `host.{hostname}` internally
but the `--target server1` syntax stays the same.
- **Stream subjects:** If currently `jobs.query.*` (single token wildcard),
must widen to `jobs.query.>` (multi-token). This is a one-time migration on
stream recreation.

## Future Enhancements (out of scope)

- **Multi-label targeting:** `role:web,env:prod` (AND semantics — requires
client-side intersection of results)
- **Load-balanced label routing:** `role:web:any` to pick one web server
(queue group per label)
- **Label wildcards:** `env:*` to target all environments
- **Admin CLI for label management:** `osapi admin workers list --label
role:web`
- **Dynamic label registration:** Workers can update labels at runtime via
NATS

## Notes

- NATS subject tokens cannot contain `.` so label values like `us-east-1` are
fine but `us.east.1` would break subject parsing. Use hyphens or underscores.
- Label-based subscriptions scale linearly with the number of unique labels
per worker. A worker with 5 labels creates 5 additional consumers. This is
well within NATS limits.
- Consider documenting recommended label taxonomies: `role`, `env`, `region`,
`rack`, `team`.
2 changes: 1 addition & 1 deletion cmd/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@ func init() {
clientCmd.PersistentFlags().
StringP("url", "", "http://0.0.0.0:8080", "URL the client will connect to")
clientCmd.PersistentFlags().
StringP("target", "T", "_any", "Target hostname (_any, _all, or specific hostname)")
StringP("target", "T", "_any", "Target: _any, _all, hostname, or label (group:web.dev)")

_ = viper.BindPFlag("api.client.url", clientCmd.PersistentFlags().Lookup("url"))
}
Loading
Loading