Implement sourceos-agent local runtime CLI and doctor commands #20

@mdheller

Context

The Mac node-commander incident exposed a platform gap: SourceOS needs a reusable local-agent control surface for Nix-governed services, launchd/systemd persistence, Podman runtime checks, auth isolation, observability, and quarantine.

Canonical spec: specs/local-agent-runtime.md in SourceOS-Linux/sourceos-spec.

Problem

A service landed as a root-owned, system-wide LaunchAgent that invoked a Nix store wrapper, which in turn called Podman. It depended on noninteractive registry auth, wrote logs to /tmp, and had unbounded launch behavior. The failure path produced thousands of runs and looked like hostile persistence.

Deliverables

Implement a sourceos-agent CLI with:

  • sourceos-agent preflight <name>
  • sourceos-agent doctor <name>
  • sourceos-agent status <name>
  • sourceos-agent logs <name>
  • sourceos-agent install <name>
  • sourceos-agent stage <name>
  • sourceos-agent start <name>
  • sourceos-agent stop <name>
  • sourceos-agent restart <name>
  • sourceos-agent quarantine <name>
  • sourceos-agent uninstall <name>

Also add sourceos doctor local-runtime.
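One possible shape for the command surface, as a minimal Python sketch rather than a settled design. The subcommand names come from the list above; `build_parser`, `main`, and the placeholder output are hypothetical, and the issue does not prescribe an implementation language.

```python
import argparse

# Subcommand names taken from the deliverables list in this issue.
SUBCOMMANDS = [
    "preflight", "doctor", "status", "logs", "install", "stage",
    "start", "stop", "restart", "quarantine", "uninstall",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser where every subcommand takes one agent name."""
    parser = argparse.ArgumentParser(prog="sourceos-agent")
    sub = parser.add_subparsers(dest="command", required=True)
    for command in SUBCOMMANDS:
        p = sub.add_parser(command)
        p.add_argument("name", help="agent name, e.g. node-commander")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    # Hypothetical placeholder: real handlers would be dispatched here.
    print(f"{args.command}: {args.name} (not implemented)")
    return 0
```

Calling `main(["preflight", "node-commander"])` parses the arguments and prints the placeholder line; each handler would replace that print as it is implemented.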

Required checks

The CLI must report:

  • launchd/systemd backend state
  • disabled override or stale service labels
  • plist/unit lint and permissions
  • Nix generation/source when available
  • Podman binary/version
  • Podman machine existence/running/socket state
  • local image presence
  • container state including Stopping
  • auth mode and host credential-helper risk
  • log paths and last exit reason
  • suspicious run/restart counts
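As a rough illustration of how two of these checks might report, here is a Python sketch. `CheckResult`, `check_podman_binary`, `check_run_count`, and the threshold of 100 runs are all assumptions made for the example; the spec may define a different report shape and different limits. The `which` and `run` parameters are injectable so the check can be exercised without Podman installed.

```python
import shutil
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str


def check_podman_binary(which=shutil.which, run=subprocess.run) -> CheckResult:
    """Report whether podman is on PATH and answers --version."""
    path = which("podman")
    if path is None:
        return CheckResult("podman-binary", False, "podman not found on PATH")
    proc = run([path, "--version"], capture_output=True, text=True)
    if proc.returncode != 0:
        detail = f"{path} --version exited {proc.returncode}"
        return CheckResult("podman-binary", False, detail)
    return CheckResult("podman-binary", True, f"{path}: {proc.stdout.strip()}")


def check_run_count(runs: int, threshold: int = 100) -> CheckResult:
    """Flag a run/restart count above a threshold (100 is an assumed value)."""
    ok = runs <= threshold
    verdict = "within" if ok else "exceeds"
    return CheckResult("run-count", ok, f"{runs} runs {verdict} threshold {threshold}")
```

Returning a structured result per check, rather than printing directly, would let `doctor`, `preflight`, and `quarantine` share the same checks and render them differently.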

Acceptance criteria

  • A stopped/refusing Podman machine produces one clear preflight failure and does not install active persistence.
  • A host Docker config with Google credential helpers does not poison the local runtime when empty-authfile is declared.
  • A local image can run with --authfile sourceos-empty-auth.json.
  • KeepAlive=true, /tmp logs, direct registry runtime images, and raw podman run persistence are flagged.
  • sourceos-agent quarantine node-commander captures service definition, logs, podman state, image metadata, and redacted auth config.
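The flagging and redaction criteria above could be sketched roughly as follows, in Python. `lint_launchd_plist` and `redact_auth_config` are hypothetical names. The plist keys (`KeepAlive`, `StandardOutPath`, `StandardErrorPath`, `ProgramArguments`) are standard launchd keys, and `auths` and `credHelpers` are the usual top-level keys of a container auth file. The lint only catches `podman run` when it appears directly in `ProgramArguments`; a wrapper script that calls Podman, as in the incident, would need a deeper inspection.

```python
import plistlib
from typing import Any, Dict, List


def lint_launchd_plist(data: bytes) -> List[str]:
    """Return findings for risky launchd settings named in this issue."""
    plist = plistlib.loads(data)
    findings = []
    if plist.get("KeepAlive") is True:
        findings.append("KeepAlive=true: unconditional relaunch")
    for key in ("StandardOutPath", "StandardErrorPath"):
        path = str(plist.get(key, ""))
        if path.startswith(("/tmp/", "/private/tmp/")):
            findings.append(f"{key} writes logs under /tmp: {path}")
    args = [str(a) for a in plist.get("ProgramArguments", [])]
    for i, arg in enumerate(args[:-1]):
        # Match the binary by basename so Nix store paths are covered too.
        if arg.rsplit("/", 1)[-1] == "podman" and args[i + 1] == "run":
            findings.append("raw podman run persistence in ProgramArguments")
            break
    return findings


def redact_auth_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Copy an auth config, replacing each registry's credentials."""
    redacted = dict(config)
    auths = config.get("auths")
    if isinstance(auths, dict):
        redacted["auths"] = {registry: "<redacted>" for registry in auths}
    # credHelpers holds helper names, not secrets, so it is kept as evidence.
    return redacted
```

Keeping the registry names and helper names while dropping the credential payloads preserves what a quarantine bundle needs to show credential-helper risk without leaking secrets.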
