Skip to content

[pull] main from Dicklesworthstone:main#154

Open
pull[bot] wants to merge 766 commits intojoyshmitz:mainfrom
Dicklesworthstone:main
Open

[pull] main from Dicklesworthstone:main#154
pull[bot] wants to merge 766 commits intojoyshmitz:mainfrom
Dicklesworthstone:main

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Mar 14, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

@pull pull Bot locked and limited conversation to collaborators Mar 14, 2026
@pull pull Bot added ⤵️ pull merge-conflict Resolve conflicts manually labels Mar 14, 2026
Dicklesworthstone and others added 27 commits April 27, 2026 17:18
…ternal anchor links

Three large methodology documents at docs/ root were the canonical
write-ups of the Flywheel approach:

- COMPLETE_X_POSTS_ABOUT_PLANNING_AND_BEADS.md
- THE_FLYWHEEL_APPROACH_TO_PLANNING_AND_BEADS_CREATION.md
- THE_FLYWHEEL_CORE_LOOP.md

They share a methodology theme (multi-agent planning + beads-based
task graphs) and they cross-link each other. Group them under the
new docs/methodology/ subdirectory so docs/ root stays focused on
operational/reference docs and the methodology cluster is
self-contained for readers who want to drill into it.

Internal-link fixes in the same commit so the moves don't break
navigation:

- THE_FLYWHEEL_APPROACH_TO_PLANNING_AND_BEADS_CREATION.md: the link
  to THE_FLYWHEEL_CORE_LOOP.md was using absolute path
  `/data/projects/THE_FLYWHEEL_CORE_LOOP.md` (which would have only
  worked on the original author's machine and never resolved on
  GitHub or any clone). Rewritten to relative `./THE_FLYWHEEL_CORE_LOOP.md`
  which now resolves correctly to the sibling under docs/methodology/.
- THE_FLYWHEEL_CORE_LOOP.md§64-67: four "Tool Boundary" / "Five Terms
  You Need" / "The Core Idea" / "Common Failure Modes" anchor links
  were also using absolute `/data/projects/...` URLs. Rewritten to
  bare `#anchor` fragments, which is the correct way to link to a
  same-document section regardless of file location.

Co-Authored-By: Claude <noreply@anthropic.com>
Replace the `ACFS_REF=<ref>` environment-variable form in
build_fix_suggestion with a self-describing installer command:

  curl -fsSL https://raw.githubusercontent.com/.../REF/install.sh \
    | bash -s -- ... --ref REF

Why: the ACFS_REF env-var form forced the user to copy a `KEY=VAL
bash -s --` incantation that hides the pinned ref behind an env
prefix. The new form makes the ref visible in the URL and as an
explicit installer flag, making it easier to read, audit, and reuse
the suggested command.

Hardening in `_acfs_doctor_normalize_ref`:
- Cap ref length at 120 chars
- Reject leading `-` (would be parsed as a flag by the installer)
- Reject `@`, `.`, `..`, leading `.`, trailing `.`
- Reject `//`, `/.`, and any `*.lock` (matches git refname rules)

E2E + bats test fixtures updated to assert the new `--ref` form, the
absence of `ACFS_REF=`, and that bad refs (e.g. `-bad-ref`) fall back
to the unpinned install URL with no `--ref` argument.

Regenerated scripts/generated/internal_checksums.sh for the new
doctor.sh content.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…only

Follow-up to the doctor.sh `--ref` refactor: the smoke test, agents
helper, and Agent Mail warn path were still recommending the legacy
`--yes --only-phase N` selector for re-installs. The phase-number
indirection has long been a source of churn — when phases shuffle,
every "Fix:" line in error messages goes stale and users get told
to re-run the wrong subset.

install.sh
- New helper `acfs_smoke_install_fix_command <module-ids...>`
  builds a self-describing installer command using the same logic
  as the doctor.sh build_fix_suggestion: it walks
  ACFS_COMMIT_SHA_FULL → ACFS_REF_INPUT to pick the install URL
  (raw.githubusercontent at the pinned ref, or
  https://agent-flywheel.com/install for unpinned/main), appends
  `--yes --force-reinstall --only <module-id>` for each requested
  module, and adds `--ref <ref>` when a pin is active. Output is
  shell-quoted so the suggestion can be copy-pasted as-is.
- Five smoke-test fix lines updated to call the new helper with
  stable module-id selectors instead of phase numbers:
  * Languages   → lang.bun lang.uv lang.rust lang.go
                  (was --only-phase 5)
  * Agents      → agents.claude agents.codex agents.gemini
                  (was --only-phase 6)
  * NTM         → stack.ntm                (was --only-phase 8)
  * Onboard     → acfs.onboard             (was --only-phase 9)
  * Agent Mail  → stack.mcp_agent_mail     (was --only-phase 8)

scripts/lib/agents.sh
- _agent_check_bun's "Install bun first" warn updated from
  `--only-phase 5` to `--yes --force-reinstall --only lang.bun`
  so the bun-specific suggestion does not pull all of phase 5.

scripts/lib/smoke_test.sh
- _check_ntm warn updated from `--only-phase 8` to
  `--yes --force-reinstall --only stack.ntm` for the same reason.

scripts/generated/internal_checksums.sh
- Regenerated for scripts/lib/agents.sh.

tests/unit/lib/test_update.bats
- New test "installer recovery suggestions use stable module
  selectors" greps install.sh / agents.sh / smoke_test.sh for any
  remaining `--only-phase` selectors in user-facing fix lines and
  fails if any are found, then asserts the expected
  `acfs_smoke_install_fix_command` invocations and direct
  `--force-reinstall --only` lines are present.
- New test "install.sh: smoke fix command preserves pinned ref"
  evals the helper out of install.sh and exercises three states:
  (1) ACFS_COMMIT_SHA_FULL set → URL pinned to that SHA, --ref
      includes the SHA;
  (2) only ACFS_REF_INPUT set to a non-main ref → URL pinned to
      that ref, --ref includes that ref;
  (3) ACFS_REF_INPUT="main" with no SHA → falls back to the
      unpinned install URL with no --ref argument.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The base.filesystem.3 verify check in acfs.manifest.yaml previously
inlined its own getent / awk / passwd-parse logic to resolve
TARGET_HOME from inside the doctor's bash -c subprocess. That
duplicated three pieces of logic that already exist as installer
helpers, and the inline awk parser silently swallowed any
non-absolute or "/"-only home value (subtly different from the
installer's stricter validation).

Switch the verify block to call three generated helpers that the
doctor now injects into the subprocess heredoc:

  acfs_generated_getent_passwd_entry "$user"
    Looks up a passwd entry via $(acfs_generated_system_binary_path
    getent), with a sandbox-safe /etc/passwd fallback when getent
    is not on PATH (common in minimal install environments).

  acfs_generated_passwd_home_from_entry "$entry"
    Splits a passwd line on `:` and returns field 6 only when it
    is absolute and not "/". Strips a trailing slash. Replaces the
    inline awk pipe.

  acfs_generated_resolve_current_user
    Returns the current user via id -un, falling back to whoami.
    Used to safely test whether $HOME belongs to TARGET_USER before
    inheriting it (the previous logic compared $USER and $HOME
    directly, which got the wrong answer when sudo had carried a
    stale env block from the parent shell — see acfs#268 follow-up).

scripts/generated/install_filesystem.sh
- Inline the three helper definitions plus a shared
  acfs_generated_system_binary_path() into the install_filesystem
  child shell heredoc so the verify block can call them in any
  bash -c subprocess that doctor spawns. Mirrors the install-side
  helper bundle; no separate sourcing needed.

scripts/generated/doctor_checks.sh
- Regenerate the embedded MANIFEST_CHECKS line for base.filesystem.3
  with the new helper-based check body.

scripts/generated/manifest_index.sh
- Regenerate ACFS_MANIFEST_SHA256 to match the new
  acfs.manifest.yaml content, satisfying the manifest-drift check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runtime wrapped_cmd builders in scripts/lib/cli_tools.sh,
agents.sh, languages.sh, and cloud_db.sh were updated to prepend
`cd "$HOME" || exit 1` to the inner command so target-user
subprocesses always start in the resolved TARGET_HOME instead of
inheriting whatever cwd the parent shell happened to be in. The
test assertions were still pinned to the pre-`cd` shape and now
fail to match.

Update the four `grep -F` lines in
"runtime helpers normalize wrapped commands" to include
`cd \"\$HOME\" || exit 1; ` between `set -o pipefail; ` and
`$cmd`, matching what the source files actually emit.

Pure test-fixture sync; no behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously, asking for `--only mod --skip mod` (or the equivalent
`--skip-vault --only tools.vault` form) silently produced a
zero-module install plan. That is confusing: the user explicitly
asked for the module, so the safer behavior is to fail loudly and
tell them which selector excluded it, rather than ship an empty plan
that looks "successful" in --print-plan mode.

Behavior change in scripts/lib/install_helpers.sh:
- After the dependency closure runs, walk ONLY_MODULES and check it
  against the skip_set. If any directly requested module is in the
  skip set, log the originating skip reason (e.g. "skip flag
  --skip-vault") and return 1.
- The check fires for both `--only mod --skip mod` and combined
  --only / --skip-* shorthand selectors, even with --no-deps. Direct
  selector contradictions should never be masked.
- --only-phase remains permissive: skipping an individual module
  inside a selected phase is still allowed (this is how users opt
  out of one optional component within a phase).

Tests:
- scripts/lib/test_install_helpers.sh: three new pure-bash tests
  covering the contradiction error, the --no-deps variant, and the
  --only-phase escape hatch.
- tests/unit/lib/test_install_helpers.bats: matching bats tests so
  the contradiction check is exercised by the same harness CI runs.
- tests/vm/selection_e2e.sh: tighten the two existing combo tests
  so they now require a non-zero exit *and* a message mentioning
  selection/only/skip, instead of accepting any empty plan.
- scripts/generated/internal_checksums.sh: refresh checksum for
  scripts/lib/install_helpers.sh after the body change.

Test infrastructure:
- tests/unit/test_helper.bash: add a tiny `fail()` shim when
  bats-assert/bats-support are unavailable, so pure-bats tests can
  call `fail "msg"` without sourcing the full helper bundle.
- tests/unit/lib/test_install_helpers.bats: introduce a
  `use_spy_sudo` helper that swaps in a spy `sudo` and overrides
  `_acfs_system_binary_path`, so future tests around privilege
  elevation can assert on argv without touching the real /usr/bin
  path. Also tighten the "Did not call bash -c" assertion to require
  an absolute path prefix (`/bash -c`) so the test will fail if a
  caller ever falls back to a bare `bash` lookup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`optional_feature` is the helper used throughout the new-project
bootstrap path to attempt advisory side effects (initializing
beads, registering with cass, seeding skill inventories) without
having those failures abort the bootstrap. The helper's previous
shape was:

    optional_feature() {
        local feature_name="$1"
        shift
        local command="$*"        # ← all remaining args joined by IFS
        ...
        if eval "$command" 2>/dev/null; then ...
    }

Two failure modes:

1. **Shell metacharacters in arguments are interpreted by eval.**
   Any caller that interpolates a path or label into the command
   args would have semicolons, backticks, `$()`, redirections, etc.
   inside those args expanded by the second pass through `eval`.
   For a system that's running on operator-controlled but
   sometimes-AI-provided inputs (project names, repo paths,
   skill IDs), that's a command-injection sink waiting to fire.
   A test fixture was already in the file for the assertion that
   `printf '%s' "hello; touch $marker"` MUST NOT create the marker
   — but the eval-based shape would happily run the chained
   `touch`.

2. **Empty command after `shift` ran into `eval ""`.** If a caller
   accidentally passed only the feature name with no command,
   `command="$*"` was the empty string and `eval ""` is a silent
   success — so the helper would log "Optional feature succeeded:
   empty-command" and the bootstrap would mark a no-op as done.

Refactor:

    optional_feature() {
        local feature_name="$1"
        shift
        if [[ $# -eq 0 ]]; then
            log_warn "Optional feature skipped: $feature_name"
            return 0
        fi
        if "$@" 2>/dev/null; then ...
    }

  - `"$@"` invokes the remaining args as a literal argv, with no
    shell parse pass. Metacharacters in args become part of the
    argument string, not new shell tokens. A semicolon in an arg
    is a literal semicolon — exactly what `printf '%s'` was being
    asked to print in the regression test.
  - The empty-command branch logs "skipped" and returns 0, which
    matches the helper's "optional, never escalate" contract but
    no longer falsely reports success.

Tests:

  - `"optional_feature does not evaluate shell metacharacters"`:
    runs `optional_feature "..." printf '%s' "hello; touch $marker"`
    and asserts both that the call succeeds AND that `$marker`
    was NOT created. With the previous eval shape, `eval` would
    have parsed the joined `printf '%s' hello; touch <path>`
    string, run `printf` and then `touch`, and the marker would
    exist. Under the new shape, `printf` receives the literal
    string `"hello; touch <path>"` as its sole positional arg
    and prints it verbatim. The marker stays absent.
  - `"optional_feature treats missing command as skipped"`: passes
    only the feature name, asserts the call succeeds AND the
    output does NOT contain "Optional feature succeeded" — the
    new "skipped" log path runs instead.

The helper's external behavior for genuine usage is unchanged:
callers that pass real argv lists (`optional_feature "init beads"
br init` etc.) still produce the same return code and logging
shape they did before. Only the eval surface goes away.
Dicklesworthstone and others added 30 commits May 6, 2026 17:51
…E + tests/README

`tests/vm/test_factory_install_qemu.sh` (new) — local-VM equivalent of
the disposable-VPS release gate. Where the existing
`test_factory_install_ubuntu.sh` runs against a real ssh-able host
(Docker container, VPS, or a freshly-provisioned bare-metal box),
this wrapper boots an official Ubuntu cloud image inside QEMU/KVM,
waits for root SSH on a forwarded port, then hands off to the same
factory-host harness. Real kernel, systemd PID 1, real sshd, real
cloud-init, real user services, real reboot-capable semantics — all
without needing a fresh VPS for every iteration.

Configurable via environment variables:

- `ACFS_QEMU_UBUNTU_VERSION` (default 25.10)
- `ACFS_QEMU_MODE`            (default vibe)
- `ACFS_REF`                  (default main)
- `ACFS_QEMU_EXPECT_FINAL_UBUNTU_VERSION`
- `ACFS_QEMU_INSTALL_TIMEOUT_SECONDS` (default 14400 / 4h)

Image download is verified against Canonical's published SHA256SUMS
+ GPG signature before boot (so a hijacked mirror can't substitute
a backdoored image). Uses cloud-init's NoCloud datasource via a
local seed iso to provision the SSH key and basic user account.
KVM is required; the script bails out with a clear message if
`/dev/kvm` is missing or unreadable, rather than silently falling
back to slow software emulation.

README.md updates the existing factory-install section with a
"Local QEMU Factory E2E" subsection that documents the new wrapper,
the apt prerequisites, and the recommended invocation. tests/README.md
gets the same pointer in the per-test inventory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fresh QEMU testing exposed a regression that Docker had not caught: the Rust MCP Agent Mail installer could successfully install the binaries but fail during its own MCP config or user-service readiness path on a factory Ubuntu image. That made ACFS fail phase 8 before the ACFS-managed service setup had a chance to normalize the endpoint.\n\nPass AM_INSTALL_SKIP_MCP_SETUP=1 through every ACFS path that invokes the verified MCP Agent Mail installer: the manifest-generated installer, the stack library, the self-update path, and the legacy install.sh fallback. ACFS still verifies the upstream installer checksum and installs the same binaries, but ACFS now remains the single owner of service configuration and readiness checks.\n\nAdd unit coverage for both update_stack and install_mcp_agent_mail so future changes prove that the upstream setup is skipped and ACFS waits on its own managed service. Regenerate manifest artifacts and internal checksums after the manifest/update changes.\n\nVerification:\n- bash -n install.sh scripts/lib/stack.sh scripts/lib/update.sh scripts/generated/install_stack.sh tests/unit/lib/test_update.bats\n- shellcheck install.sh scripts/lib/stack.sh scripts/lib/update.sh scripts/generated/install_stack.sh tests/unit/lib/test_update.bats\n- shellcheck install.sh scripts/**/*.sh tests/unit/lib/test_update.bats\n- cd packages/manifest && bun run generate:diff\n- ./scripts/check-manifest-drift.sh\n- bash scripts/lib/test_security.sh\n- bats tests/unit/lib/test_update.bats --filter 'MCP Agent Mail'\n- bats tests/unit/lib/test_update.bats --filter 'stack MCP Agent Mail installer'\n- ./tests/vm/test_install_ubuntu.sh --ubuntu 25.10 --mode vibe\n- ubs acfs.manifest.yaml install.sh scripts/generated/install_stack.sh scripts/generated/internal_checksums.sh scripts/generated/manifest_index.sh scripts/lib/stack.sh scripts/lib/update.sh tests/unit/lib/test_update.bats
The QEMU factory canary completed the installer but failed its post-install command surface check because acfs was globally available while onboard and acfs-update depended on shell-profile PATH setup. That is too fragile for a fresh VPS: interactive zsh logins usually work, but non-zsh login shells, sudo probes, remote command checks, and support debugging should all find the core ACFS entrypoints without relying on profile sourcing.\n\nInstall global /usr/local/bin links for onboard and acfs-update during finalize, mirroring the existing global acfs command behavior. The links point at the target ACFS install tree and refuse to replace existing non-symlink commands. The generated ACFS installer path now does the same when passwordless sudo is available, and acfs-update self-sync repairs the global links after deployed files are refreshed.\n\nAdd unit coverage for update-side global command link repair and regenerate manifest artifacts/internal checksums.\n\nVerification:\n- bash -n install.sh scripts/lib/update.sh scripts/generated/install_acfs.sh tests/unit/lib/test_update.bats\n- shellcheck install.sh scripts/lib/update.sh scripts/generated/install_acfs.sh tests/unit/lib/test_update.bats\n- shellcheck install.sh scripts/**/*.sh tests/unit/lib/test_update.bats\n- cd packages/manifest && bun run generate:diff\n- ./scripts/check-manifest-drift.sh\n- bash scripts/lib/test_security.sh\n- bats tests/unit/lib/test_update.bats --filter 'global core command link sync'\n- git diff --check\n- ubs acfs.manifest.yaml install.sh scripts/generated/install_acfs.sh scripts/generated/internal_checksums.sh scripts/generated/manifest_index.sh scripts/lib/update.sh tests/unit/lib/test_update.bats
The QEMU factory canary caught the real MCP Agent Mail regression, then exposed a verifier bug of its own: run_target_step used sudo -i with bash -lc. Sudo's login-shell path can reconstruct the command through the target user's login shell before bash receives it, so shell variables inside verifier snippets were expanded too early. The symptom was a false post.path_core failure even after the installer had created the expected commands.\n\nRun target-user verifier snippets with sudo -u plus explicit HOME and PATH instead of sudo -i. This keeps the test realistic for the installed command surface while avoiding a quoting/reparse artifact that a real user would not encounter as written. The separate zsh startup check still exercises the interactive shell path.\n\nVerification:\n- bash -n tests/vm/test_factory_install_ubuntu.sh\n- shellcheck tests/vm/test_factory_install_ubuntu.sh\n- git diff --check\n- ubs tests/vm/test_factory_install_ubuntu.sh
The QEMU factory canary exposed that the production one-liner finalize path installed the ACFS nightly update service and timer templates under ~/.acfs/scripts/templates, but never copied them into the target user's systemd unit directory or enabled the timer. The generated manifest path already had this behavior, so installs that still depend on the legacy finalize deployment surface could pass Docker while leaving a fresh VPS without the nightly timer enabled.

Add configure_acfs_nightly_timer to install the service/timer into ~/.config/systemd/user, reload user systemd, enable and start acfs-nightly-update.timer, and verify it is enabled. The helper degrades to a warning only when user systemd is unavailable, preserving container canaries while making real systemd hosts fail on actual timer setup errors.

Add a Bats regression assertion so legacy finalize cannot regress back to template-only deployment.
Default acfs doctor now treats MCP Agent Mail as an install-health pass once the target CLI, HTTP liveness endpoint, and user systemd service are healthy. The deeper am doctor mailbox/archive probe remains available under --deep, but normal installer verification no longer reports false warnings for unrelated mailbox hygiene.

The factory-host harness now runs acfs doctor --json and requires zero failures and zero warnings, so QEMU and real-host canaries stop accepting noisy verifier output. The regression test proves that an unhealthy mailbox doctor result is ignored in the default install-health path after the service proof succeeds.

The factory workflow now supports a scheduled QEMU/KVM backend, a manual backend selector, and repository_dispatch handoff from an external disposable-VPS provisioner. Documentation now states Docker is only a smoke layer, while QEMU/real-host factory E2E is the authoritative first-run VPS proof.
Detected by check-manifest-drift.sh.
Regenerated installer and web generated artifacts via `bun run generate`
to sync ACFS_MANIFEST_SHA256 and internal checksums with source files.
Fresh-eyes review found that the real-host factory workflow still decided whether to run using only the fallback ACFS_FACTORY_SSH_TARGET secret. That contradicted the new repository_dispatch design, where an external provisioner should pass a one-time fresh host through client_payload.ssh_target and avoid storing stale long-lived host addresses in repository secrets.

The resolver now computes an effective SSH target from client_payload.ssh_target first and then ACFS_FACTORY_SSH_TARGET. It only skips real-host runs when that effective target is empty or the required private key is missing. A focused Bats regression guards the dispatch payload path so this cannot silently regress.

Verification: bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; shellcheck tests/unit/lib/test_update.bats; go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; git diff --check; ./scripts/check-manifest-drift.sh
A fresh-eyes review found that the factory workflow advertised scheduled QEMU/KVM canaries while running on the plain ubuntu-latest label. GitHub-hosted nested virtualization is not a contractual standard-runner capability, so that setup could fail before the installer ran and make a runner limitation look like an ACFS regression.

The workflow now lets callers choose the runner through the manual input, repository_dispatch payload, or ACFS_FACTORY_RUNNER repository variable, adds a KVM permission setup step, and fails at an explicit /dev/kvm preflight before invoking the installer. README and test docs now explain that GitHub QEMU requires a KVM-capable larger or self-hosted runner, while real-host remains the portable provider sentinel.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A third fresh-eyes pass found that the reusable factory workflow still marked real-host SSH secrets as required even after QEMU became a first-class backend. That made workflow_call + backend=qemu impossible without passing irrelevant real-host credentials, which would turn the reusable QEMU path into configuration friction and contradict the documented KVM runner mode.

The workflow_call secrets are now optional, with the existing runtime resolver still enforcing that real-host runs need an effective SSH target and private key before executing. The docs now distinguish QEMU reusable calls from real-host secret requirements, and the focused Bats guard asserts the optional-secret contract alongside the dispatch ssh_target and KVM preflight wiring.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A fourth fresh-eyes pass found that the runner fallback chain was self-defeating: workflow_dispatch and workflow_call set runner to ubuntu-latest by default, so vars.ACFS_FACTORY_RUNNER could not select the KVM-capable runner even though the docs advertised that path. That made the recommended repository-variable setup ineffective for manual and reusable QEMU canaries.

The runner input now defaults to blank for manual and reusable invocations, allowing the existing runs-on expression to fall through to client_payload.runner, ACFS_FACTORY_RUNNER, and finally ubuntu-latest. Docs now say to leave the runner input blank when the repository variable should apply, and the focused Bats guard asserts the blank default and explanatory text.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A fifth fresh-eyes pass found a false-green risk in the factory workflow. When backend=real-host was requested without an effective SSH target or private key, the resolver set enabled=false and skipped the test. That could leave the workflow green even though no disposable VPS was exercised.

The resolver now treats missing real-host credentials as a configuration error and exits before any test step. QEMU remains usable without SSH secrets. Docs now state that real-host must fail rather than report a skipped canary, and the Bats guard rejects the old 'Skipping real-host factory E2E' path.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A sixth fresh-eyes pass found a persistent-workspace failure mode in the factory workflow. The QEMU backend passed a fixed tests/artifacts/qemu-factory-e2e path, while the QEMU wrapper intentionally refuses to overwrite an existing acfs-factory.qcow2 disk. On KVM-capable self-hosted runners that reuse workspaces, the next scheduled/manual canary could fail before testing ACFS because of stale artifacts from a previous run.

The workflow now uses run-specific artifact directories derived from GITHUB_RUN_ID and GITHUB_RUN_ATTEMPT for both QEMU and real-host paths. This preserves the QEMU wrapper's stale-disk safety check while preventing repeated workflow runs from colliding with prior artifacts. Docs and the focused Bats guard now capture that contract.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A seventh fresh-eyes pass found that the workflow wrote run-specific artifact directories but still uploaded the entire tests/artifacts tree. On persistent KVM/self-hosted runners that could package stale QEMU disks, caches, and logs from previous canary attempts, obscuring the current run and wasting artifact storage.

Upload-artifact now targets only the QEMU and real-host directories for the current GITHUB_RUN_ID/GITHUB_RUN_ATTEMPT and warns rather than becoming the primary failure when an early environment check stops before artifacts are created. Docs and the focused Bats guard now assert that the upload path stays run-scoped.

Verification: go run github.com/rhysd/actionlint/cmd/actionlint@v1.7.7 .github/workflows/installer-factory-e2e.yml; ruby -e 'require "yaml"; YAML.load_file(".github/workflows/installer-factory-e2e.yml"); puts "workflow yaml ok"'; shellcheck tests/unit/lib/test_update.bats; bats --filter 'factory workflow repository_dispatch ssh_target enables disposable real-host runs|doctor.sh: default Agent Mail stack check ignores mailbox doctor failures after service proof|legacy finalize installs and enables nightly user timer' tests/unit/lib/test_update.bats; git diff --check; ./scripts/check-manifest-drift.sh
A fresh-eyes pass on the failure-reporting path found two user-visible regressions that matched the recent stack failure report. report_failure ignored persisted failed_step/failed_error when CURRENT_STEP and LAST_ERROR were empty, so phase failures could render as 'unknown step' and 'Unknown error' even though state.json had better context. It also called get_suggested_fix through an || chain, but that helper prints generic guidance before returning nonzero, so unknown failures rendered the generic troubleshooting block twice.

report_failure now hydrates missing phase, step, and error fields from ACFS_STATE_FILE when jq can read it, including the human phase name from ACFS_PHASE_NAMES. Suggested-fix selection now checks for a known pattern first and calls get_suggested_fix exactly once for the chosen input, preserving generic guidance without duplicating it.

Regression coverage exercises both cases: unknown failures render the generic fix once, and persisted stack/MCP Agent Mail checksum context replaces unknown report fields. Verification: shellcheck scripts/lib/report.sh tests/unit/lib/test_error_tracking.bats; bats --filter 'report_failure renders generic unknown fix once|report_failure fills missing context from state file' tests/unit/lib/test_error_tracking.bats; bats tests/unit/lib/test_error_tracking.bats tests/unit/lib/test_state.bats; shellcheck install.sh scripts/**/*.sh; ./scripts/check-manifest-drift.sh; ubs scripts/lib/report.sh tests/unit/lib/test_error_tracking.bats; git diff --check.
A fresh-eyes reread of the new failure-reporting tests found that they proved the terminal output but did not prove the machine-readable JSON failure entry. That left room for a future regression where the installer screen stays helpful while the log consumed by tooling falls back to vague unknown-step context.

The regression tests now assert the JSON suggested_fix is not duplicated for unknown failures and that the persisted stack/MCP Agent Mail checksum context is also written to the JSON log.

Verification: shellcheck tests/unit/lib/test_error_tracking.bats scripts/lib/report.sh; bats --filter 'report_failure renders generic unknown fix once|report_failure fills missing context from state file' tests/unit/lib/test_error_tracking.bats; bats tests/unit/lib/test_error_tracking.bats tests/unit/lib/test_state.bats; ./scripts/check-manifest-drift.sh; ubs scripts/lib/report.sh tests/unit/lib/test_error_tracking.bats; shellcheck install.sh scripts/**/*.sh; git diff --check.
A wider fresh-eyes audit found that get_suggested_fix did not satisfy its own output contract under the installer's set -euo pipefail mode. When no error pattern matched, the command substitution around get_error_pattern returned 1 before generic guidance could be used by direct callers; after tolerating that lookup, the helper still returned 1 after printing the generic text, so formatted callers could still abort before continuing.

The helper now treats no-match as a normal fallback path, while is_known_error remains the status API for callers that need match/no-match semantics. format_error_with_fix also tolerates the no-match pattern lookup explicitly. Regression tests cover both direct suggestions and formatted error output continuing after unmatched errors under errexit.

Regenerated scripts/generated/internal_checksums.sh with the manifest generator so bootstrap integrity checks match the changed error library.

Verification: bash -c 'set -euo pipefail; source scripts/lib/errors.sh; get_suggested_fix "totally unmatched failure"; printf "after\n"'; shellcheck scripts/lib/errors.sh tests/unit/lib/test_error_tracking.bats scripts/lib/report.sh; bats --filter 'errors: unmatched suggestions survive errexit|errors: formatted unmatched errors survive errexit|report_failure renders generic unknown fix once|report_failure fills missing context from state file' tests/unit/lib/test_error_tracking.bats; bats tests/unit/lib/test_error_tracking.bats tests/unit/lib/test_state.bats; shellcheck install.sh scripts/**/*.sh; ./scripts/check-manifest-drift.sh; ubs scripts/lib/errors.sh tests/unit/lib/test_error_tracking.bats scripts/generated/internal_checksums.sh; git diff --check.
A random fresh-eyes pass through the wizard and Bash utility paths found that scripts/lib/progress.sh could abort an installer shell if progress_update was called before progress_init. The library's globals default ACFS_PROGRESS_ENABLED=true while ACFS_PROGRESS_TOTAL=0, so the first update attempted current * 100 / total and failed with a division-by-zero arithmetic error under set -euo pipefail.

progress_init now treats non-positive or nonnumeric totals as disabled progress state, progress_update turns an uninitialized or invalid total into a no-op instead of performing arithmetic, and _progress_bar clamps invalid inputs and over-complete percentages. This preserves the normal initialized installer path while making the helper safe when a caller is misordered or a category has no progress total.

The unit harness now includes an errexit regression proving progress_update before initialization continues to the next command.

Verification: bash -c 'set -euo pipefail; source scripts/lib/progress.sh; progress_update before-init >/tmp/acfs-progress-probe.out 2>/tmp/acfs-progress-probe.err; printf after'; bash tests/unit/test_progress_bar.sh; shellcheck scripts/lib/progress.sh tests/unit/test_progress_bar.sh; shellcheck install.sh scripts/**/*.sh; ./scripts/check-manifest-drift.sh; ubs scripts/lib/progress.sh tests/unit/test_progress_bar.sh; git diff --check.
A wider review of recent factory canary work found that test_factory_install_qemu.sh generated root_ssh_key directly inside the run artifact directory. The GitHub workflow now uploads only current run directories, but that still meant the QEMU backend could package a live guest SSH private key in upload-artifact output. The same layout also made local ignored test artifacts an easy source of accidental credential detritus inside the repository checkout.

The QEMU wrapper now separates generated private SSH key material into ACFS_QEMU_KEY_DIR, defaulting to RUNNER_TEMP or a mktemp directory outside the repo. It refuses any --key-dir inside the repository or artifact tree before QEMU dependency checks run, keeps the artifact directory for disks/logs/factory output only, and documents the corrected boundary in README.md and tests/README.md.

Regression coverage in test_update.bats asserts that the key path no longer points under ARTIFACTS_DIR and directly exercises the early failure for an unsafe repo-local key directory.

Verification: bash -n tests/vm/test_factory_install_qemu.sh; shellcheck tests/vm/test_factory_install_qemu.sh tests/unit/lib/test_update.bats; shellcheck install.sh scripts/**/*.sh tests/vm/test_factory_install_qemu.sh tests/unit/lib/test_update.bats; bats --filter 'qemu factory keeps generated SSH private key outside uploaded artifacts|factory workflow repository_dispatch ssh_target enables disposable real-host runs' tests/unit/lib/test_update.bats; bats tests/unit/lib/test_update.bats; ./scripts/check-manifest-drift.sh; git diff --check; ubs README.md tests/README.md tests/unit/lib/test_update.bats tests/vm/test_factory_install_qemu.sh.
…nostic artifacts before local upload

Two coordinated hardening changes to keep accidentally-captured
secrets out of factory diagnostics and support bundles.

1) New `redact_file` regex for "Generated password" lines

`scripts/lib/support.sh::redact_file` already covered Bearer tokens,
JWTs, and inline URL credentials, but not the post-install message
ACFS emits when it auto-creates the `ubuntu` user:

    WARN: Generated password for 'ubuntu': <plaintext-password>

These lines end up in installer logs that get bundled into support
archives and (until this commit) shipped verbatim with the password
intact.

Add a sed expression matching `Generated password for '?<name>'?:
<value>` (4+ chars of value, terminated by whitespace or angle
bracket) and replace the value with `<REDACTED:password>`. Quoted
and unquoted username forms both match. Matches the surrounding
redaction style (single sed pass with `-E` extended regexes).

Companion test `tests/test_support_redaction.sh` adds:

    redact_and_read "generated_password_log.txt" \
        "WARN: Generated password for 'ubuntu': <random>"
    -> "Generated password for 'ubuntu': <REDACTED:password>"

asserting both that the redaction marker appears AND that the
plaintext value is gone (no false-positive on the marker alone).

2) Factory harness redacts diagnostic artifacts before local upload

`tests/vm/test_factory_install_ubuntu.sh` previously copied raw
factory logs (REMOTE_LOG, REMOTE_JSONL, INSTALL_LOG,
IDEMPOTENCY_LOG, /var/log/acfs/, /home/ubuntu/.acfs/logs/) into
the artifact bundle that GitHub Actions uploads. With the new
support.sh password-redaction in place, the same redaction has to
run on this artifact set before upload, otherwise the support
archive still leaks them.

Add `FACTORY_REDACTED_ARTIFACT_DIR=""` plus a new
`redact_factory_artifacts()` step:

* Stages the four log files plus the two log directories under
  `${ACFS_FACTORY_REMOTE_DIR}/redacted-artifacts-<epoch>/{factory,
  var-log-acfs, home-ubuntu-acfs-logs}/`.
* Sources `/home/ubuntu/.acfs/scripts/lib/support.sh` (failing the
  step with `fail "artifacts.redaction"` if it's missing) so the
  redaction logic comes from the same library that production
  support bundles use - any future regex added to `redact_file`
  is automatically picked up.
* Calls `redact_bundle "$stage"` with `REDACT=true REDACTION_COUNT=0
  VERBOSE=false` so the staging tree is rewritten in-place.
* Tarballs the redacted staging dir for upload. The `--exclude`
  argument was deliberately removed so the previous "redact then
  re-include the unredacted source" footgun can't recur (asserted
  by the new bats test below).
* Records the staging path in `FACTORY_REDACTED_ARTIFACT_DIR` and
  emits `pass "artifacts.redaction"` so the harness's structured
  log shows the redaction step ran.

`tests/unit/lib/test_update.bats` adds
`@test "factory harness redacts diagnostic artifacts before local
upload"` which greps the harness for:

* the new `redact_factory_artifacts()` definition,
* the support-lib path `/home/ubuntu/.acfs/scripts/lib/support.sh`,
* the redacted-staging tarball line,
* `redact_local_factory_artifacts()` and its `redact_bundle` call,
* absence of the dangerous `--exclude="$archive"` form that would
  re-include the unredacted source,
* presence of the `/home/ubuntu/.acfs/logs` directory in the staged
  tree.

`README.md` and `tests/README.md` both gain one-sentence prose
updates so operators reading the workflow / harness documentation
know that factory diagnostics are redacted before local upload, and
which artifact paths are covered.

Net effect: the support-bundle and factory-harness paths now share
a single source of truth for what counts as a secret, and the
post-install "Generated password" message no longer survives into
any uploaded artifact.
…ry require_command checks

Three small follow-ons to the password / factory-redaction commit
that just landed:

1) `scripts/lib/support.sh::redact_bundle` find expression now
   includes `*.jsonl` alongside `*.json` / `*.log` / `*.txt` /
   `*.yaml` / `*.yml` / `*.sh` / `.zshrc` / etc. Without `.jsonl`
   on the list, line-delimited JSON event logs (e.g. the factory
   harness's structured event stream and the SSO test runner's
   per-event output) were copied verbatim into support bundles
   even when they contained `Generated password` lines. Adding
   the suffix means the same `redact_file` regex set runs against
   them.

2) `tests/test_support_redaction.sh` extends the bundle-level
   redaction test with a `.jsonl` case:
   - Stages
     `{"message":"Generated password for 'ubuntu': <random>"}` in
     `bundle/events.jsonl`.
   - Asserts the redacted file contains `<REDACTED:password>` AND
     does NOT contain the raw password value.
   - Bumps the bundle-level `REDACTION_COUNT` floor from `>= 2`
     to `>= 3` so the new file is required to participate, not just
     pass through silently.

3) `tests/vm/test_factory_install_ubuntu.sh::main` adds four new
   `require_command` calls (`cp`, `find`, `md5sum`, `sed`) to its
   pre-flight binary check. The redaction step added in the
   companion commit shells out to those tools; failing fast at
   pre-flight gives a clear "missing cp/find/md5sum/sed" diagnostic
   instead of a confusing "command not found" mid-redaction.

Net: support bundles redact JSONL event streams, the test asserts
they do, and the factory harness self-checks for the binaries it
now depends on.
Support bundles are documented as safe to share after redaction, but the redactor only handled single-line token/password forms. A PEM/OpenSSH/PGP private-key block is multiline text, so it survived unchanged if it appeared in an install log, config snippet, or other collected diagnostic file.

Add a multiline redaction pass that collapses any BEGIN/END PRIVATE KEY block to <REDACTED:private_key> before the existing sed-based token rules run. The helper also fails closed for truncated logs by dropping everything after a private-key header through EOF. The bundle manifest now advertises private_key as a redaction pattern and README documents the behavior.

Tests now cover OpenSSH, PGP, truncated RSA private-key blocks, bundle traversal of a private key in a log file, and support-bundle archive extraction to ensure neither the raw header nor payload survives. The SUDO_USER support-bundle integration test was also isolated from the host /var/lib/acfs state by giving it an explicit mock system-state file instead of a fake getent function, which the hardened resolver correctly ignores.

Verification:

- bash -n scripts/lib/support.sh tests/test_support_redaction.sh tests/vm/test_support_bundle.sh

- shellcheck install.sh scripts/**/*.sh tests/vm/test_support_bundle.sh tests/test_support_redaction.sh tests/unit/lib/test_update.bats

- bash tests/test_support_redaction.sh

- bash tests/vm/test_support_bundle.sh

- bats tests/unit/lib/test_update.bats

- bash tests/unit/test_changelog_export_status.sh

- ./scripts/check-manifest-drift.sh

- git diff --check

- ubs README.md scripts/lib/support.sh tests/test_support_redaction.sh tests/vm/test_support_bundle.sh
…ref, and skip-flag set

Substantial rework of `report_build_resume_command` in
`scripts/lib/report.sh` and a companion 60-line bats test in
`tests/unit/lib/test_error_tracking.bats`.

Background: when the installer fails partway, the failure report
prints a "to resume, run:" command. The previous helper printed a
generic `curl -fsSL https://acfs.sh | bash -s -- --resume` line that
was wrong in three documented scenarios:

1. The user installed from a non-`main` ref (a tagged release, a
   custom fork branch, or a commit SHA pinned via
   `ACFS_COMMIT_SHA_FULL`). Resuming against `main` would race ahead
   of (or behind) the failed install's expected layout.
2. The user supplied skip flags on the original invocation
   (`--skip-postgres`, `--skip-vault`, `--skip-cloud`,
   `--skip-preflight`, `--skip-ubuntu-upgrade`). Resuming without
   them would re-attempt steps the operator deliberately declined.
3. The user was running off a local checkout (`SCRIPT_DIR` is set)
   but the printed command pulled fresh source via curl, defeating
   the local-checkout intent and silently switching the operator to
   a different code path.

The new helper:

* Delegates to `generate_resume_hint` first if that helper exists
  (lets a downstream installer override the synthesized command).
* Detects whether the local `curl` accepts `--proto '=https'` /
  `--proto-redir '=https'` and uses the strict form when supported,
  matching the rest of the installer's curl-pipe-bash discipline.
* Builds the `resume_args` array from the live environment:
  preserves `MODE` when not the default `vibe`, propagates
  `YES_MODE`, `STRICT_MODE`, and each of the five skip flags.
* Pins `--ref` to whichever of `ACFS_COMMIT_SHA_FULL`,
  `ACFS_REF_INPUT`, or `ACFS_REF` is set and != `main`. Tracks
  `resume_ref_pinned_from_commit` so the corresponding
  `--checksums-ref` is also pinned when the operator explicitly
  supplied one (via `ACFS_CHECKSUMS_REF_EXPLICIT=true`) or
  defensively when the commit pin implies it.
* When `SCRIPT_DIR` is set, the printed command becomes
  `bash <SCRIPT_DIR>/install.sh ...` (with shell-safe quoting via
  `printf -v ... %q`); the curl-pipe form is reserved for callers
  invoked via curl.
* When falling back to curl, prefers the explicit pinned-commit URL
  on `raw.githubusercontent.com` over `https://acfs.sh` so the
  resume request's content-addressed identity matches the original
  install.
* All argv assembly goes through `printf -v <var> %q "$value"` so
  arbitrary skip-flag values, refs with shell metacharacters, and
  URLs with `&` / `?` survive the resume printout intact.

`tests/unit/lib/test_error_tracking.bats` adds 60 lines: bats cases
that exercise each of the new branches (skip flags, pinned ref via
commit SHA, pinned ref via input, explicit checksums-ref, local
SCRIPT_DIR vs curl path). Each test asserts the exact resume command
string the helper now produces, so a future refactor can't silently
drift the printed line.

No behavior change for the happy path (no pinned ref, no skip flags,
no SCRIPT_DIR, vibe mode): output is unchanged.
…me hints carry --strict reliably

Three coordinated edits closing a regression in the resume-hint
synthesizer.

Background: `install.sh` parses `--strict` and exports
`ACFS_STRICT_MODE=true` (consumed by the tools.sh checksum-strictness
path). The resume-hint generator (`generate_resume_hint` in install.sh
plus `report_build_resume_command` in scripts/lib/report.sh) keys the
`--strict` flag off the `STRICT_MODE` *shell variable* on the
in-memory state — but the parser never set that variable, so the
synthesized resume command silently dropped `--strict` whenever the
operator originally invoked the installer with that flag. The
re-resumed install would then run in lenient mode and accept
whatever checksums the new manifest happened to ship.

Fix:

* `install.sh`:
  - Initialize `STRICT_MODE=false` alongside the other mode flags.
  - In `parse_args`'s `--strict` branch, set `STRICT_MODE=true` in
    addition to exporting `ACFS_STRICT_MODE=true`. Now both the
    resume hint and downstream tools see the same authoritative
    state.
  - `generate_resume_hint` accepts either form
    (`STRICT_MODE=true || ACFS_STRICT_MODE=true`) so a sub-shell
    that inherits only the env var also gets `--strict` echoed
    back into the resume command.

* `scripts/lib/report.sh::report_build_resume_command` mirrors the
  same OR check so the report-side synthesis path matches the
  install-side path.

* `tests/unit/test_resume_hint.sh` adds `test_acfs_strict_mode`
  (sets ACFS_STRICT_MODE=true on a clean state, asserts the
  generated resume command includes `--strict`) and bumps the
  setup to default both shell vars to false so the test runs
  deterministically against either source.

* `tests/unit/lib/test_error_tracking.bats` switches its existing
  fixture from `STRICT_MODE=true` to `ACFS_STRICT_MODE=true` to
  pin the env-var path that production callers actually take.

Net effect: an installer invoked with `--strict` always emits a
resume command that includes `--strict`, regardless of which of the
two state surfaces (shell var vs env var) the resume-hint helper
happens to consult.
The resume-report path now persists a precise, pinned resume command into state.json, but acfs continue still flattened failed installs down to a generic 'rerun with --resume' instruction. That created a second stale recovery surface: a user checking status after a failed install could miss the exact --ref/--checksums-ref/--yes command generated by the installer.

Change scripts/lib/continue.sh to read .resume_hint from the selected state file and print it on the failed-status path. The value is displayed only, never evaluated, and the output uses printf so stored command text is not treated as echo flags or escapes.

Extend tests/unit/test_changelog_export_status.sh with an installed-layout regression that writes a failed stack state containing a pinned raw GitHub resume command, runs acfs continue --status, and asserts the persisted command is shown instead of the generic fallback wording.

Verification:

- bash -n scripts/lib/continue.sh tests/unit/test_changelog_export_status.sh

- shellcheck install.sh scripts/**/*.sh tests/unit/test_changelog_export_status.sh

- bash tests/unit/test_changelog_export_status.sh (287/287)

- bats tests/unit/lib/test_update.bats (347/347)

- ./scripts/check-manifest-drift.sh

- git diff --check

- ubs scripts/lib/continue.sh tests/unit/test_changelog_export_status.sh (exit 0; scanner reported no recognizable languages for these shell files)
Make the status-check wizard page explicit that only acfs doctor is required before continuing. The doctor command now has a required-specific checkbox label, while auth command cards use optional login labels so users do not infer that Codex, Gemini, or cloud logins block progress.

Extend CommandCard with caller-supplied checkbox/completed labels instead of forcing every checklist item to read as a required command completion. This keeps existing defaults for the rest of the wizard while letting optional tasks present accurate copy.

Strengthen the status-check Playwright regression to verify optional login checkboxes are visible, remain unchecked, and the final continue button unlocks after only the doctor checkbox is marked complete.
Fresh-eyes follow-up on the status-check auth guidance. The beginner guide still told users to check every auth command as completed, which contradicted the visible optional-login framing and could recreate the same user confusion inside the expanded help text.

Replace that sentence with explicit optional-note language and extend the existing status-check Playwright regression to assert the visible optional-note copy before proving that only the doctor checkbox unlocks the final step.
Fresh-eyes pass on the wizard account step after the optional-login fixes. The page let users continue without checking recommended or optional services, but each service card still labeled its checkbox as Authenticated. That was inaccurate before installation and made non-essential account items feel like hard auth gates.

Rename the service checkbox labels to reflect account-signup tracking: essential services use Signed up, while recommended and optional services use Optional signup. Tighten the page and beginner-guide copy so the essential tier is the only do-now work and later tiers are explicit notes.

Add a Step 7 Playwright regression that expands the recommended tier, verifies optional signup tracking remains unchecked, and continues to pre-flight successfully.
Fresh-eyes pass on the launch-onboarding page after the accounts and status-check optional-auth fixes. The final page still said users needed to authenticate AI coding assistants broadly, even though the intended beginner path is to start with Claude Code and defer Codex or Gemini until those accounts are actually needed.

Change the final auth panel to say users should authenticate the AI tools they plan to use, start with Claude Code, and leave Codex/Gemini for later if they are not using those accounts. Add a Step 13 Playwright regression that asserts the new copy and rejects the old blanket authenticate-them phrasing.
…a editor

Reshapes the wizard's auth-status step around how operators actually
use these tools on a fresh VPS. Three connected fixes:

1. GitHub auth surfaces as a "developer tools" recommendation,
   not a buried optional.

   `getAuthServices` ordering on the status-check page was
   `[access, agent, cloud]`, which pushed `gh auth login` into the
   "access" group alongside Tailscale. Operators routinely missed it
   and then couldn't push their first commit. The new ordering is
   `[devtools, agent, access, cloud]` and GitHub is flipped to
   `installedByAcfs: true` in `lib/services.ts` (it isn't merely
   "needs git config" — ACFS now installs `gh`). The intro copy on
   the page is rewritten so the "Recommended now" line reads
   "GitHub CLI and Claude Code" instead of just "Claude Code", and
   Tailscale moves to the "Optional now" line because most agent
   work doesn't require an inbound mesh.

2. Tier-aware checkbox labels.

   `CommandCard` previously hard-coded "Optional: I logged in to
   this tool" / "Optional login completed" for every auth step,
   which contradicted the new "Recommended" framing for `gh` and
   `claude`. Two helper functions (`getAuthCheckboxLabel`,
   `getAuthCompletedLabel`) now branch on `service.tier` —
   essential services render "Recommended: I logged in to this
   tool" / "Recommended login completed", everything else keeps
   the old "Optional" copy.

3. Persist secrets via $EDITOR instead of `export`.

   The previous Gemini and Cloudflare post-install commands were
   `export GEMINI_API_KEY="your-gemini-api-key"` and
   `export CLOUDFLARE_API_TOKEN="your-token-here"`. That's wrong on
   a headless VPS in two ways:
     - The `export` only lasts the current shell. After SSH
       reconnects the operator has to redo it, which the wizard
       never re-prompts for.
     - The placeholder strings are obvious copy-paste hazards;
       operators have left literal `your-gemini-api-key` in their
       env and then debugged "auth fails" for 20 minutes.
   The new commands open the canonical persistent location in
   $EDITOR (falling back to nano):
     - Gemini: `mkdir -p ~/.gemini && ${EDITOR:-nano} ~/.gemini/.env`
     - Cloudflare: `${EDITOR:-nano} ~/.zshrc`
   The on-page description still tells the operator which variable
   to add. This pattern matches what Codex and Anthropic CLIs
   already do via their config files, so the wizard is consistent.

Plus E2E coverage:

   `apps/web/e2e/wizard-flow.spec.ts` adds a test asserting:
     - The "Developer Tools" section is visible.
     - `gh auth login` is rendered as a command card.
     - The page copy now mentions "GitHub CLI and Claude Code".
     - The card carries the "Recommended: I logged in to this tool"
       label.
     - The continue button stays disabled until flywheel-doctor is
       checked, so the recommendation is visibly non-blocking.

   The existing launch-onboarding test is extended to assert the
   new `mkdir -p ~/.gemini` command shows up and the old
   `your-gemini-api-key` placeholder is gone.

No behavior change for tools that aren't auth-status-page-rendered.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

⤵️ pull merge-conflict Resolve conflicts manually

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant