Nightly is a host-native autonomous coding agent that runs inside the
coding CLI you already use — Claude Code, Codex, opencode, Cursor,
Antigravity, or Gemini CLI — and turns it into a self-directed,
drainable session. You stop coding at 5pm, fire /nightly, and wake up
to a stack of draft PRs in review-ready shape: each on its own
worktree, each with its own commit history, each tied to a morning
briefing.html that tells you what landed, what's blocked, and what
needs your eyes.
Scope: Nightly opens draft PRs and writes a morning briefing. A human still reviews and merges — the briefing is the review surface, and the refusal policy is what keeps the surface safe. Refused categories: destructive git, production state, external communication, network egress to unknown hosts, scope creep, and bypassing tests or type checks.
The overnight loop is the intended shape, but the same machinery fits any block of step-away time: a long lunch, a meeting marathon, a flight. Anywhere your editor would otherwise sit idle.
17:00        you stop coding → /nightly
             the host's Stop hook keeps the
             session alive across turn boundaries
17:00–07:00  Nightly walks the priority cascade:
             in-flight plans → unblocked approvals → accepted RFCs →
             ranked GitHub issues → PR rescue → ideation
             each pick lands on its own isolated git worktree
             refused operations are recorded, not silently retried
07:00        you wake up → open .nightly/runs/<id>/briefing.html
             review draft PRs · merge what you like ·
             reject what you don't · the briefing
             is the review surface
The cascade never invents work it can't justify. When the backlog is
empty, the proposer suite scans for autofixable lint debt, TODO/FIXME
audits, and Any type holes; only the smallest of these auto-promote
to a draft PR (single file, < 80 LOC, category in {lint_debt, dep_upgrade}). Everything else lands as a proposal for you to
approve in the morning.
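The auto-promotion rule above can be read as a three-part predicate. The following is a minimal sketch under stated assumptions: the `Candidate` type and `clears_autonomy_bar` function are hypothetical names for illustration, not Nightly's actual API; only the three conditions come from the text.

```python
from dataclasses import dataclass

# Categories that may clear the bar, as listed in the text above.
AUTO_PROMOTE_CATEGORIES = frozenset({"lint_debt", "dep_upgrade"})
MAX_AUTO_LOC = 80  # the bar is strictly "< 80 LOC"


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for a proposer result; not Nightly's real type."""

    category: str
    files_touched: int
    loc_changed: int


def clears_autonomy_bar(c: Candidate) -> bool:
    """Return True only when all three conditions hold at once."""
    return (
        c.files_touched == 1
        and c.loc_changed < MAX_AUTO_LOC
        and c.category in AUTO_PROMOTE_CATEGORIES
    )
```

Anything that fails one of the three checks would stay a proposal awaiting approval rather than becoming a draft PR.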
The mechanics — cascade ordering, plan lifecycle, worktree isolation, refusal policy, keep-alive plumbing — are documented in How it works below.
The recommended path is two steps: install the binary once, then drop
into each repo with /nightly-init from inside the host.
# 1. install the `nightly` binary + bootstrap uv if missing
curl -fsSL https://raw.githubusercontent.com/ulmentflam/nightly/main/install.sh | bash
# 2. install the host skill globally (default host = claude)
nightly init --scope user
That writes a nightly shim to ~/.local/bin/nightly and installs the
main /nightly skill plus four companions (/nightly-init,
/nightly-conclude, /nightly-update, /nightly-bug) under the host's
user-scope skill directory (e.g. ~/.claude/skills/,
~/.codex/skills/, ~/.gemini/commands/). From then on, in any repo:
> /nightly-init
/nightly-init shells out to nightly init against the current
directory: creates .nightly/, writes config.yml, installs the
project-scope skill files, merges the Stop-hook entry, and seeds the
autonomy contract into AGENTS.md / CLAUDE.md. Idempotent — safe to
re-run.
For other hosts, pass --host:
nightly init --host codex --scope user
nightly init --host opencode --scope user
nightly init --host cursor --scope user
nightly init --host antigravity --scope user
nightly init --host gemini --scope user # vanilla Gemini CLI
The installer is idempotent; re-run it to update. Override defaults
with NIGHTLY_HOME (clone target, default ~/.local/share/nightly),
NIGHTLY_VERSION (branch / tag / SHA, default main), NIGHTLY_BIN
(shim location, default ~/.local/bin), or NIGHTLY_REPO (git URL —
for forks).
brew install ulmentflam/tap/nightly
Tagged releases land on the tap automatically; brew upgrade nightly
pulls the latest. Released versions live under
github.com/ulmentflam/nightly/releases.
After install, run nightly init --scope user as above.
git clone git@github.com:ulmentflam/nightly.git
cd nightly
make install # uv sync --all-packages
make check # ruff + Pyrefly + pytest
uv run nightly --help # or `source .venv/bin/activate && nightly --help`
For cron, CI, or genuinely unattended overnight runs where no host session is open, skip the slash command and drive the cascade directly:
cd <some-repo>
nightly init # one-time per repo
nightly start # create a session
nightly task add-retry -d "Add retry budget to auth client"
nightly run --concurrency 2 --max-tasks 5 # multi-task headless dispatch
nightly brief # render briefing.html + vault
nightly vault open # open the knowledge-graph dashboard
/nightly-init writes these into every host alongside the main skill,
so the overnight loop, the wind-down, and the bug-bundle are all one
keystroke from inside the CLI:
| Command | Purpose |
|---|---|
| /nightly | Start (or continue) a Nightly session; walks the cascade. |
| /nightly-init | Bootstrap Nightly in the current repo; runs nightly init. |
| /nightly-conclude | Wind down the running session; human-only off-ramp. |
| /nightly-update | Pull the latest Nightly release; refresh skills + hooks. |
| /nightly-bug | Bundle run state into a debug report (file as issue). |
Because everything Nightly knows lives on disk under .nightly/, the
host is interchangeable. Suspend a Claude Code run, resume it in Codex
the next evening, render the briefing from opencode — same tasks, same
plans, same vault. The three primary hosts support full headless
dispatch; the three secondary hosts ship the launcher and the
morning briefing, with their headless story deferred to a remote
queue.
| Host | Tier | Skill installed at | Sub-agent dispatch | OS sandbox |
|---|---|---|---|---|
| Claude Code | primary | .claude/skills/nightly/SKILL.md | Task tool + MCP | none (in-proc) |
| Codex CLI | primary | .codex/skills/nightly/SKILL.md | MCP / codex exec | Seatbelt + Landlock |
| opencode | primary | .opencode/agents/nightly/SKILL.md | POST /session/:id/fork + SSE | none |
| Cursor | secondary | .cursor/commands/nightly.md | Background Agents (cloud VM) | cloud VM (Background) |
| Antigravity | secondary | .gemini/antigravity/agents/.../SKILL.md | Agent Manager + brain/<GUID>/ | none |
| Gemini CLI | secondary | .gemini/commands/nightly.toml | Headless gemini --prompt | none |
Install per host with nightly init --host <name>. Switch scopes with
--scope user (global) vs the default --scope project. Subscription
auth propagates from the host's cached creds (~/.claude/,
~/.codex/, ~/.local/share/opencode/, ~/.gemini/, …) — Nightly
never asks for an API token. ANTHROPIC_API_KEY / OPENAI_API_KEY /
GEMINI_API_KEY etc. are env-var fallbacks for sandboxed CI.
antigravity and gemini are distinct hosts sharing the .gemini/
namespace: Antigravity writes managed-agent files under
.gemini/antigravity/agents/ (desktop IDE); vanilla Gemini CLI writes
custom-command TOML under .gemini/commands/. Both register an
AfterAgent Stop-style hook against .gemini/settings.json — the
merge is idempotent if you co-install them.
Coding CLIs end their session the moment the model finishes its first response. The overnight loop only loops because Nightly registers a host-level Stop-equivalent hook that catches that boundary and re-injects a "continue" prompt. Five of the six hosts get a real hook; opencode is soft (rule-text only — the model is asked to never stop):
| Host | Hook |
|---|---|
| Claude Code | Stop |
| Codex CLI | Stop |
| Cursor 1.7+ | stop |
| Antigravity | AfterAgent |
| Gemini CLI | AfterAgent |
| opencode | (soft / rule-text) |
The hook checks a SESSION_ACTIVE marker on disk and re-injects a
"continue on X" prompt at every turn boundary. The marker has a 4-hour
TTL; nightly session start refreshes it. The human off-ramps
(nightly conclude, nightly stop, Ctrl-C) take precedence and end
the session cleanly.
Context hygiene (v0.0.12). Each Stop-hook firing also estimates the
session's current context size from the Claude Code transcript and logs
ctx=<N> in keepalive.log; nightly status surfaces it as a
"context: ~NK tokens" line. When the estimate exceeds the soft budget
(default 256K; context.budget_tokens in .nightly/config.yml), the
injected prompt gains a "context diet" block nudging the agent to lean
on the session digest, background heavy work to specialists, and avoid
dumping long output inline. nightly init also installs a second hook —
SessionStart(compact) — that fires after any compaction (auto or
manual /compact) and re-injects the session digest as
additionalContext so key Nightly state survives the compaction. The
digest lives at .nightly/runs/<id>/digest.md and is refreshed every
keepalive turn (configurable via context.digest_every_turns).
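As a rough illustration of the soft-budget behaviour, the sketch below adds a diet block to the injected prompt only when the estimate exceeds the budget. The function names, the prompt wording, and the reading of "256K" as 256,000 tokens are all assumptions made for the example.

```python
DEFAULT_BUDGET_TOKENS = 256_000  # assumption: "256K" read as 256,000

# Hypothetical wording; the real block's text is not given in this README.
CONTEXT_DIET = (
    "Context diet: lean on the session digest, hand heavy work to "
    "specialists in the background, and avoid dumping long output inline."
)


def build_continue_prompt(
    task: str,
    ctx_tokens: int,
    budget_tokens: int = DEFAULT_BUDGET_TOKENS,
) -> str:
    """Compose the re-injected prompt; add the diet block only over budget."""
    parts = [f"continue on {task}"]
    if ctx_tokens > budget_tokens:
        parts.append(CONTEXT_DIET)
    return "\n\n".join(parts)


def status_line(ctx_tokens: int) -> str:
    """Format an estimate in the "context: ~NK tokens" shape."""
    return f"context: ~{round(ctx_tokens / 1000)}K tokens"
```

Because the budget is soft, the sketch only changes the prompt text; nothing is blocked or truncated when the estimate is over.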
- Priority cascade: picks the next task automatically by walking a fixed precedence (resume in-flight plans → unblocked-by-approval plans → accepted RFCs in .planning/rfcs/ → highest-ranked open GitHub issue (via gh) → PR rescue (new review feedback on open Nightly PRs) → ideation (proposer suite) → terminal nothing).
- Proposer suite: when the backlog is empty, scans for TODO/FIXME audits, autofixable lint debt (ruff), and Any type holes; writes ranked draft issues to <run>/proposed/issues/ for human review.
- Cascade PR-awareness: the cascade skips RFC checkbox items whose text appears in an open Nightly PR's title or body, so the agent doesn't re-pick work that's already awaiting review. Both signals are best-effort substring matches with a bias toward false negatives.
- Per-task worktrees: every task lives in its own git worktree forked from a base branch, so concurrent dispatches cannot stomp on each other.
- Specialists: four sub-agent roles (implementer, tester, reviewer, researcher) with their own context windows, dispatched through each host's native primitive.
- Headless parallelism: nightly run drives the cascade in cron or CI by spawning claude -p / codex exec / opencode run directly. Opt-in --concurrency N parallelism via asyncio.gather plus worktrees.
- Worktree readiness: before any task dispatch, nightly worktree doctor probes the repo's pre-commit infrastructure. missing_python_dep and missing_pre_commit_hook are auto-remediated; other failures surface as a worktree_remediation proposal so a broken worktree can't silently waste a session turn.
- Stacked-PR prevention: RFC 004 §C prevents accidental PR chains by forcing each new worktree to branch from main. A task can opt into a stacked geometry by declaring depends_on_pr: <N> in its plan frontmatter; the driver then bases the worktree on PR #N's branch and instructs the agent to begin the PR body with Depends on #<N>. The morning briefing renders an all-declared chain with a teal "declared dependency chain" panel and any accidental geometry with the existing rose "stacked PR geometry" panel (RFC 001 §B2) so reviewers can distinguish the two at a glance.
- Six-category refusal policy: destructive git, production state, external comms / publishing, network egress to unknown domains, scope creep, bypassing test or type safety. Refused operations are recorded retro (non-blocking) for review in the morning briefing.
- Autonomy bar: proposals are auto-promoted only when single file, < 80 LOC, and category in {lint_debt, dep_upgrade}. Everything else waits for human approval.
- Cooperative drain: nightly conclude writes a marker the loop honours at the next batch boundary. Never SIGKILL. Half-finished work parks as status: parked on a dedicated branch.
- Hybrid briefing: Python owns the deterministic structural skeleton (hero counts, task pills, approvals list); the agent owns three narrative slots (briefing.md, per-task notes.md, lessons.md) that survive context compaction. This is the file you open at 07:00.
- Knowledge graph (vault): nightly brief also builds a navigable knowledge graph under .nightly/vault/, in which every run, task, lesson, and PR becomes a node and parent / spawned / derived_from edges form a DAG. Open the dashboard with nightly vault open; it runs in any browser, no server needed (sql.js + wasm are vendored). The dashboard surfaces cross-run patterns the briefing alone doesn't.
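The PR-awareness check in the list above is described as a best-effort substring match. A minimal sketch of that idea follows; the `OpenPR` type, the function names, and the case-insensitive comparison are assumptions for illustration rather than the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenPR:
    """Hypothetical minimal view of an open Nightly PR."""

    title: str
    body: str


def already_in_review(item_text: str, open_prs: list[OpenPR]) -> bool:
    """Best-effort check: does the checkbox text appear in a PR title or body?

    Case folding is an assumption. A reworded PR title will not match, so
    the check errs toward false negatives (occasionally re-picking work).
    """
    needle = item_text.strip().casefold()
    if not needle:
        return False
    return any(
        needle in pr.title.casefold() or needle in pr.body.casefold()
        for pr in open_prs
    )


def pickable_items(items: list[str], open_prs: list[OpenPR]) -> list[str]:
    """Keep only the unchecked RFC items not already covered by an open PR."""
    return [item for item in items if not already_in_review(item, open_prs)]
```

A plain substring test keeps the check cheap and predictable, at the cost of missing PRs whose wording has drifted from the RFC item.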
nightly --help lists everything; this is the operator-facing subset.
| Group | Command | Purpose |
|---|---|---|
| Setup | nightly init [--host <h>] [--scope project\|user] | Bootstrap .nightly/ + install the host launcher. |
| | nightly status | Show repo state, installed hosts, current run. |
| | nightly uninstall [--host <h>] [--scope ...] | Remove the host launcher. |
| | nightly doctor | Repair a drifted install (scaffold, config, rules, skills). |
| | nightly update [--version <ref>] [--dry-run] | Self-upgrade Nightly's source + refresh installed hosts. |
| | nightly version | Print the installed Nightly version. |
| | nightly info | One-liner description + where to start. |
| Run lifecycle | nightly start ["<seed task>"] | Create a new run; optionally seed tasks/0001-<slug>/. |
| | nightly task <slug> [-d "<desc>"] | Add a task to the current run. |
| | nightly conclude | Mark the current run as concluding (non-blocking drain). |
| | nightly stop | Immediate hard-stop request; the session ends at the next turn boundary. |
| | nightly session start / session stop | Arm / disarm the Stop-hook keep-alive marker. |
| | nightly brief [--run <id>] | Render <run>/briefing.html. |
| Cascade | nightly next | Walk the priority cascade; print the next pick + rationale. |
| | nightly triage [--top N] | List ranked open GitHub issues (best-effort, needs gh). |
| | nightly plans | Every plan across runs with status. |
| | nightly specialist <role> | Print the system prompt for one of the 4 roles. |
| | nightly keepalive [--name <s>] | Show think-harder strategies when the cascade goes empty. |
| Ideation | nightly propose [--top N] | Dry-run the proposer suite; list candidates. |
| | nightly ideate | Run proposers; write draft issues to disk. |
| Headless | nightly headless <prompt> [--host <h>] [--cwd <p>] [--timeout S] | Single-shot host CLI invocation. |
| | nightly run [--host <h>] [-n N] [-j K] [--timeout-per-task S] | Drive the cascade in headless mode; opt-in parallel. |
| PRs & CI | nightly feedback [--branch <b>] [--apply] | Show PR review feedback; --apply lands it on the matching plan. |
| | nightly rescue | Preview the next PR-rescue candidate (Nightly-authored, new feedback). |
| | nightly ci | Print CI status across open Nightly PRs. |
| | nightly verify | Detect & run the repo's linters / formatters / type checkers. |
| | nightly bug | Bundle run state into a debug report; optionally file an issue. |
make help covers the dev-loop side: install, sync, lint, fmt,
type, test, check, install-hooks, pre-commit, brief,
planning, clean, nuke.
Everything Nightly writes lives in one place:
.nightly/
├── config.yml # refusal policy, hosts, branch prefix, budgets
├── plans/ # per-task plans (currently empty; reserved)
├── runs/
│ ├── CURRENT # pointer at the active run id
│ └── <run-id>/
│ ├── tasks/<NNNN>-<slug>/
│ │ ├── plan.md # YAML frontmatter: status, slug, created
│ │ ├── proposal.md # PR/proposal body
│ │ ├── uncertainty.md
│ │ ├── notes.md # per-task narrative slot
│ │ └── diff.patch
│ ├── proposed/
│ │ ├── approvals/ # refused-operation records
│ │ ├── planning/ # draft RFCs / ADRs
│ │ └── issues/ # ideation candidates
│ ├── briefing.md # session narrative slot
│ ├── lessons.md # lessons-learned slot
│ ├── briefing.html # rendered morning briefing
│ ├── digest.md # compact key-state digest (re-injected after compaction)
│ ├── keepalive.log # turn-boundary heartbeats (ctx= field added v0.0.12)
│ ├── keepalive.context # latest context-size estimate in tokens
│ └── CONCLUDE # sentinel — drains on next loop iteration
├── atlas/ # repo wiki (scaffolded; rolling refresh deferred)
└── memory/ # cross-session memory (scaffolded; reserved)
Human-authored design intent (RFCs, ADRs, the brainstorm itself) lives
in .planning/. Nightly reads it on every cold start (it's a context
source alongside AGENTS.md / CLAUDE.md) but never writes to it
on its own — with one explicit exception (RFC 005). When the operator
invokes /nightly interactively with a feature seed, the host skill
calls nightly seed-rfc "<title>" to stub a new accepted RFC under
.planning/rfcs/ carrying author: nightly-seed in the frontmatter.
The cascade then picks the stub's unchecked items via the standard
accepted_rfc slot — same shape as hand-authored RFCs 001–008,
distinguishable on retro audit by the author field. The
seed-stubbed pathway is opt-in (the skill decides based on seed
shape) and never fires for one-line bugfix seeds, which keep using
the nightly start <seed> single-task pathway.
1. resume_in_flight — plans with status: in_progress
2. unblocked_approval — plans with status: blocked: approval + approval_granted
3. accepted_rfc — RFCs in .planning/rfcs/ with unchecked tasks
4. github_issue — highest-ranked open issue via `gh`
5. pr_rescue — Nightly-authored open PR has new feedback (reviews,
bot comments, or failed CI) since last reconcile
6. ideate — proposer suite, top auto-PR-eligible result
7. nothing — terminal; write narrative + brief + exit
Run nightly next at any time to see what the cascade would pick.
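The fixed precedence above amounts to "first non-empty slot wins". The sketch below shows that shape only; the slot names come from the list, while `next_pick` and the probe callables are hypothetical stand-ins for whatever each slot actually inspects.

```python
from typing import Callable, Optional

# A probe returns a description of the pick, or None if its slot is empty.
SlotProbe = Callable[[], Optional[str]]

CASCADE_ORDER = (
    "resume_in_flight",
    "unblocked_approval",
    "accepted_rfc",
    "github_issue",
    "pr_rescue",
    "ideate",
)


def next_pick(probes: dict[str, SlotProbe]) -> tuple[str, Optional[str]]:
    """Walk the slots in fixed precedence and return the first non-empty one.

    Falls through to the terminal ("nothing", None) when every slot is empty.
    """
    for slot in CASCADE_ORDER:
        probe = probes.get(slot)
        pick = probe() if probe else None
        if pick is not None:
            return slot, pick
    return "nothing", None
```

Because the order is a constant tuple, a lower slot can never pre-empt a higher one, which is what makes the pick reproducible between runs.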
ready → in_progress → dispatching → done
↘ parked
↘ blocked: approval
dispatching is a transient sentinel the driver uses to claim a plan
during multi-task parallel dispatch — the cascade explicitly skips it.
nightly run walks the cascade. For each pick (up to --concurrency N in parallel) it:
- Claims the plan via status: dispatching.
- Calls nightly worktree create <slug> to place an isolated git worktree at the config-aware, iCloud-safe location.
- Spawns the host's headless CLI (claude -p --output-format json, etc.) with the task prompt + working directory set to the worktree.
- Reconciles: if the agent updated the plan to done / parked, respects that. Otherwise infers the outcome from the headless exit code.
- Loops until the cascade returns nothing, --max-tasks is hit, or <run>/CONCLUDE appears on disk.
Single-process by contract: two concurrent nightly run invocations
against the same repo can race on plan-status updates.
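Bounded parallel dispatch of this kind is commonly written with asyncio.gather behind a semaphore. The sketch below assumes that pattern; `dispatch_all` and `run_task` are hypothetical, and mapping a non-zero exit code to parked is an assumption about the fallback inference, not documented behaviour.

```python
import asyncio
from typing import Awaitable, Callable


async def dispatch_all(
    slugs: list[str],
    run_task: Callable[[str], Awaitable[int]],
    concurrency: int = 1,
) -> dict[str, str]:
    """Run one coroutine per task, at most `concurrency` at a time.

    `run_task` stands in for "create the worktree and spawn the host's
    headless CLI"; it returns a process-style exit code.
    """
    gate = asyncio.Semaphore(concurrency)

    async def one(slug: str) -> tuple[str, str]:
        async with gate:
            code = await run_task(slug)
        # Fallback reconciliation (assumed mapping): infer from the exit code.
        return slug, "done" if code == 0 else "parked"

    results = await asyncio.gather(*(one(slug) for slug in slugs))
    return dict(results)
```

The semaphore caps in-flight tasks inside one process; it does nothing to protect against a second process, which is why the single-process contract still matters.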
.
├── Makefile # dev-loop entrypoints
├── README.md
├── LICENSE # MIT
├── pyproject.toml # uv workspace + tool configs
├── packages/
│ ├── nightly-core/ # loop · cascade · drain · briefing · proposers · headless
│ ├── nightly-host-claude/ # primary host
│ ├── nightly-host-codex/ # primary host
│ ├── nightly-host-opencode/ # primary host
│ ├── nightly-host-cursor/ # secondary host
│ ├── nightly-host-antigravity/ # secondary host (Antigravity IDE)
│ └── nightly-host-gemini/ # secondary host (vanilla Gemini CLI)
├── .planning/ # human-authored design
│ ├── brainstorm.html # the full design doc (open with `make brief`)
│ ├── decisions/
│ └── rfcs/ # RFCs 001–008 accepted; 009 planned
└── .nightly/ # agent runtime state (gitignored by default)
Each host package implements NightlyHostIntegration from
nightly_core.contract — five methods cover the launcher lifecycle
(install / uninstall / is_installed / session_id /
auth_status) and three more cover runtime (run_headless for the
primaries; dispatch_sub_agent / request_approval reserved for
future work). Adding another host is one new package + one entry in
_HOST_LOADERS.
make install # uv sync --all-packages (creates .venv)
make install-hooks # arm the pre-commit hook (ruff + pyrefly on every commit)
make check # ruff (lint) + Pyrefly (types) + pytest
make test # just pytest
make lint # ruff check
make type # Pyrefly type-check
make fmt # ruff format (write)
make pre-commit # run every pre-commit hook against every file
make brief # open .planning/brainstorm.html in the browser
make clean # remove ruff / pyrefly / pytest caches
make nuke # clean + drop the venv
The dev loop is Python 3.12+ · uv · ruff · Pyrefly · pytest. Tests cover all six hosts plus the core (run lifecycle, cascade, proposers, autonomy bar, headless, worktree, driver, CLI). The full check suite runs in ~3 seconds.
make install-hooks arms a pre-commit hook
that runs ruff check and pyrefly check on every git commit. Tests
are deliberately off the hot path — they live in make check and CI.
To bypass for an in-progress WIP commit (emergencies only — CI still
enforces the merge gate): git commit --no-verify.
- Scaffold a new package under packages/nightly-host-<name>/ following the pattern of nightly-host-codex/.
- Implement <Name>HostIntegration(NightlyHostIntegration): install / uninstall / is_installed / session_id / auth_status are required; run_headless is required for non-interactive use; dispatch_sub_agent / request_approval may stay NotImplementedError until you wire them.
- Ship a skill.md with host-specific dispatch and sandbox notes.
- Register the loader in nightly_core.cli._register_host_loaders.
- Add tests under packages/nightly-host-<name>/tests/.
- Subclass Proposer (nightly_core.proposers.base) with a unique id and a propose(root) implementation returning Iterable[Proposal].
- Choose a category from ProposerCategory; only lint_debt and dep_upgrade clear the autonomy bar by default.
- Add it to default_proposers() in proposers/registry.py.
- Add tests.
The full design — architectural decisions, prior-art research, the
refusal-policy rationale, host-comparison matrices, references — lives
in .planning/brainstorm.html, with
incremental decisions captured as RFCs 001–008 in
.planning/rfcs/. After cloning, open the
brainstorm with:
make brief
The brainstorm covers the architecture, state machine, host-comparison
matrix, refusal policy, and prior art (Devin · OpenHands · SWE-agent ·
Sweep · AutoCodeRover · Copilot · Factory · Replit · Amp · Cosine and
others) with inline references throughout. RFC 009 (synthesis-driven
ideate — codebase-wide proposals across cleaning/refactoring/
housekeeping/convenience/capability) is accepted and awaiting
implementation. RFC 010 (host-respawn supervisor) is drafted and
planned — v0.0.10 ships the underlying hook fix (the
stop_hook_active misread that caused sessions to surrender after
one force-continue is resolved; the hook now rides forced-
continuation chains indefinitely) and writes the RESPAWN_REQUESTED
marker preemptively during chains so an involuntary host-cap kill or
crash still leaves a resume breadcrumb, surfaced at the next
nightly session start. RFC 011 (interactive context compaction) is
shipped in v0.0.12 — see the "Context hygiene" note above and
.planning/rfcs/011-interactive-context-compaction.md.
MIT.