- bun (v1.2 or newer)
- Rust (stable, via rustup)
- Anchor (v0.31 or newer)
- Solana CLI
- GNU make
- Configure your git hooks:
git config core.hooksPath .githooks
- Install TypeScript dependencies:
bun install
- Build everything:
make
Build all packages (TypeScript and Anchor):
make build
Build TypeScript only:
make build-ts
Build Anchor program only:
make build-anchor
Build a specific TypeScript package:
make packages/<package-name>
Run all lint checks:
make lint
TypeScript only:
make lint-ts
Rust only:
make lint-anchor
Auto-format all files:
make format
Run all tests:
make test
TypeScript tests only:
make test-ts
Anchor tests only:
make test-anchor
Remove build artifacts:
make clean
- Squads v4 does NOT expire un-executed proposals. A fully approved proposal stays executable indefinitely unless cancelled. Multisig config changes invalidate un-approved proposals only; see "Vault-stale quirk" below.
- Duplicate-proposal guard. The guard in bin/program-deploy, bin/program-rollback, bin/program-verify, bin/program-close, and bin/program-initial-deploy aborts hard if any open vault proposal already targets the program upgrade authority. There is no override flag. Operators resolve the existing proposal (execute or cancel) via the Squads UI before retrying.
- Vault-stale quirk. Changing multisig membership or threshold invalidates only un-approved proposals. An already-approved vault proposal remains executable across config changes: vault_transaction_execute has no staleness check. The duplicate-proposal guard catches this as a backstop: it walks the full lifetime range and flags any stale-Approved proposal targeting the program. Operator hygiene is still to proposal_cancel before retiring members with in-flight upgrade approvals; surfacing the issue at decommission is cheaper than surfacing it at the next release.
- Time lock semantics. Per-multisig, expressed in seconds; gates the Approved → Executable transition. Mainnet default: 86400 (24h). Devnet default: 0. Maximum: 7,776,000 (90 days).
bin/program-squads-approve and bin/program-squads-execute are
thin operator-CLI wrappers around the @sqds/multisig
proposalApprove and vaultTransactionExecute SDK calls. They are
the terminal equivalents of "Approve" and "Execute" in the Squads
web UI — useful when an operator prefers the terminal, when a
deployment pipeline scripts approval explicitly, or when the Squads
UI is unavailable.
bin/program-squads-approve <cluster> <multisig> <tx-index>
bin/program-squads-execute <cluster> <multisig> <tx-index>
Both read OPERATOR_PAYER_KEYPAIR from the environment and accept
an optional --rpc-url <url> to override the cluster's default RPC
endpoint. Both emit a single JSON line on stdout describing the
transaction signature so the calling pipeline can capture and log it.
The operator-facing release flow (bin/program-deploy,
bin/program-rollback, bin/program-verify) deliberately stops
after submitting the proposal and prompts the operator to approve
and execute out-of-band. That prompt can be satisfied by:
- The Squads web UI — the default for human-driven releases on mainnet. Each member opens the UI, casts a vote, then any member clicks Execute once the threshold and time lock allow.
- These bin scripts: each member runs bin/program-squads-approve <cluster> <multisig> <tx-index> to record their vote, then any member (or a fee-paying automation keypair) runs bin/program-squads-execute <cluster> <multisig> <tx-index>.
The two paths are interchangeable for any individual proposal; one member can vote via the UI while another votes via the CLI. Once the threshold count of approvals is reached (across any combination of sources) and the time lock has elapsed, the proposal is executable by any signer.
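The executability rule above (threshold reached across any mix of UI and CLI votes, then time lock elapsed) can be pictured as a pure predicate. This is an illustrative sketch only: the ProposalSnapshot shape and both function names are hypothetical, not part of @sqds/multisig or the bin scripts, and it assumes the time lock is counted from the moment the proposal reached Approved.

```typescript
// Hypothetical snapshot of a proposal; the real on-chain account has more fields.
interface ProposalSnapshot {
  approvals: number; // Approve votes recorded so far, from any source
  threshold: number; // multisig threshold (M of N)
  approvedAtUnix: number | null; // assumed: unix time the threshold was reached
  timeLockSeconds: number; // per-multisig time lock, in seconds
}

// Earliest unix time at which execution is allowed, or null if not yet approved.
function earliestExecuteUnix(p: ProposalSnapshot): number | null {
  if (p.approvals < p.threshold || p.approvedAtUnix === null) return null;
  return p.approvedAtUnix + p.timeLockSeconds;
}

// True once the threshold is met and the time lock has elapsed.
function isExecutable(p: ProposalSnapshot, nowUnix: number): boolean {
  const earliest = earliestExecuteUnix(p);
  return earliest !== null && nowUnix >= earliest;
}
```

With a time lock of 0 (the devnet default) the proposal is executable as soon as the threshold is met; with 86400 it becomes executable one day later.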
bin/program-squads-execute always sends the execute transaction
as a versioned-v0 message because the SDK may return
address-lookup-table accounts that the Squads program references at
execute time; legacy transactions cannot carry ALT references.
Membership is enforced on-chain by the Squads program. The bin
scripts do not pre-check the supplied keypair against the multisig's
member list; passing a non-member keypair to
bin/program-squads-approve produces a clear SDK error from the
on-chain rejection.
bin/program-squads-cancel <cluster> <multisig> <tx-index> casts a
Cancel vote on a vault proposal as $OPERATOR_PAYER_KEYPAIR. Squads
v4 counts Cancel votes against the same threshold as Approve votes:
once a threshold of members has voted Cancel, the proposal
transitions to the Cancelled state and is no longer executable.
The duplicate-proposal guard in bin/program-deploy,
bin/program-rollback, bin/program-verify, bin/program-close,
and bin/program-initial-deploy treats Draft / Active / Approved
proposals targeting the program as blocking — there is no override
flag. Cancelling a stuck proposal is how the operator clears that
guard. Use it when:
- A proposal was approved before a multisig config change (membership rotation, threshold change) staled it, and Squads v4's "vault-stale quirk" keeps it executable — leaving the duplicate-proposal guard to flag it at the next release attempt against this program. Cancel clears that flag.
- The team decided to abandon a proposed upgrade (e.g. composed against a stale .so or with the wrong tag) before a quorum signed off, but at least one approval has already been recorded.
- A devnet rehearsal needs to retire stranded proposals between iterations.
Cancel takes the same --rpc-url <url> override as approve and
execute, and emits a single JSON line with the transaction signature
on stdout.
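The blocking rule that cancellation clears can be pictured as a filter over proposal summaries. A minimal sketch, assuming a hypothetical ProposalSummary shape and hypothetical helper names; only the Draft / Active / Approved blocking set comes from the text above, the other status names are illustrative, and the real guard reads proposal accounts on-chain.

```typescript
// Status names beyond Draft / Active / Approved / Cancelled are illustrative.
type ProposalStatus =
  | "Draft"
  | "Active"
  | "Approved"
  | "Rejected"
  | "Executed"
  | "Cancelled";

// Hypothetical summary of one vault proposal.
interface ProposalSummary {
  index: number;
  status: ProposalStatus;
  targetsProgram: boolean; // true if the proposal touches the program in question
}

// The statuses the text describes as blocking.
const BLOCKING: ReadonlySet<ProposalStatus> = new Set<ProposalStatus>([
  "Draft",
  "Active",
  "Approved",
]);

// Return every proposal that would make the guard abort.
function findBlockingProposals(all: ProposalSummary[]): ProposalSummary[] {
  return all.filter((p) => p.targetsProgram && BLOCKING.has(p.status));
}

// Abort hard, with no override, if anything blocking is found.
function assertNoOpenProposal(all: ProposalSummary[]): void {
  const blocking = findBlockingProposals(all);
  if (blocking.length > 0) {
    const list = blocking.map((p) => `#${p.index} (${p.status})`).join(", ");
    throw new Error(`open proposal(s) target the program: ${list}`);
  }
}
```

Once a threshold of Cancel votes moves a proposal to Cancelled, it drops out of the blocking set and the next release attempt can proceed.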
scripts/src/squads.config.ts ships with placeholder values for
multisig, members, threshold, vaultIndex, verifyMode, and
timeLock on both clusters. The placeholders satisfy
assertValidConfig at module load (so tsc and make test-unit pass
out of the box), but every value must be replaced with real,
operator-supplied configuration before invoking bin/bootstrap-squads
or any downstream release script.
The provisioning sequence per cluster is:
- Replace the members array with the real signing keys for that cluster and set threshold to the agreed M-of-N. Set vaultIndex to 0 unless multiple vaults are intentional. Set verifyMode and timeLock per cluster policy (the defaults in this file are reasonable starting points: devnet batched/0, mainnet separate/86400).
- Leave multisig as the placeholder for the first run; the value is unknown until the multisig exists. Run bin/bootstrap-squads <cluster>; the script prints MULTISIG_ADDRESS=… once the on-chain create lands.
- Replace the cluster's multisig field with the printed address and commit the change. Downstream scripts (bin/program-deploy, bin/program-rollback, bin/program-verify) read this value as the upgrade authority.
bin/bootstrap-squads re-validates timeLock against MAX_TIME_LOCK
at runtime, so a value the schema would accept but the Squads program
rejects (e.g., a timeLock exceeding 90 days set after the file was
last imported) fails before the create transaction is signed.
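A runtime range check of this kind might look like the sketch below. The constant name MAX_TIME_LOCK and its 90-day value come from the text; the function name and error wording are hypothetical and are not taken from bin/bootstrap-squads.

```typescript
// 90 days in seconds (90 * 86400), the maximum stated above.
const MAX_TIME_LOCK = 7_776_000;

// Hypothetical helper: reject a timeLock the Squads program would refuse.
function assertValidTimeLock(timeLock: number): void {
  if (!Number.isInteger(timeLock) || timeLock < 0 || timeLock > MAX_TIME_LOCK) {
    throw new RangeError(
      `timeLock must be an integer in [0, ${MAX_TIME_LOCK}] seconds, got ${timeLock}`,
    );
  }
}
```

Running the check immediately before signing means a bad value costs nothing on-chain.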
The first deploy of the Flex program to a new cluster (devnet rehearsal,
then mainnet) is run as bin/program-initial-deploy <cluster>. The
script enforces a four-phase split between deploying the program,
handing upgrade authority to Squads, proving the new authority chain
works, and destroying the program keypair secret. Behaviour is
identical between devnet and mainnet; only the configured Squads
multisig address and vault index differ.
solana program deploy --upgrade-authority <vault-pda> would assign
upgrade authority to the Squads vault PDA in the same transaction that
creates the program. This is permanently rejected by the script. A
mis-derived vault PDA (wrong multisig address, wrong vault index,
typo in a copy-pasted value, or stale data from an aborted
bootstrap-squads run) becomes the new authority instantly and
irrecoverably: no one can sign as the bogus PDA, and the program is
bricked. Splitting the procedure into a deploy-then-handoff sequence
gives the operator a recovery window in which the upgrade authority is
the operator's own payer; the handoff is gated by a strong
confirmation prompt that requires the operator to retype the vault PDA
before the set-upgrade-authority invocation runs.
The script invokes solana program deploy with the operator's payer
keypair (OPERATOR_PAYER_KEYPAIR environment variable) as both the
fee payer and the implicit upgrade authority. It then fetches the
deployed program bytes via the cluster RPC, sha256s them, and asserts
equality with the local target/deploy/flex.so. A mismatch aborts.
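The hash-and-compare step can be sketched with Node's built-in crypto module. The function names are hypothetical, and the sketch assumes both byte arrays have already been normalised to the same length; how the script treats any padding in the on-chain ProgramData account is not stated here.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded sha256 of a byte array.
function sha256Hex(bytes: Uint8Array): string {
  return createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical helper: abort unless the local and deployed bytes hash the same.
function assertSameSha(localBytes: Uint8Array, deployedBytes: Uint8Array): string {
  const local = sha256Hex(localBytes);
  const deployed = sha256Hex(deployedBytes);
  if (local !== deployed) {
    throw new Error(`sha256 mismatch: local=${local} deployed=${deployed}`);
  }
  return local;
}
```

Reporting both digests in the error gives the operator something concrete to compare against the build log.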
The vault PDA is re-derived locally from
scripts/src/squads.config.ts (multisig + vault index) via the
getVaultPda helper in scripts/src/squads.ts — not trusted from
any pasted value.
The script prints the derived PDA on a dedicated line and pauses for
a confirmation prompt that requires the operator to retype the PDA
exactly. Only on a match does the script invoke
solana program set-upgrade-authority --new-upgrade-authority <pda>.
After the invocation the script reads ProgramData from the cluster
and asserts that the on-chain upgrade_authority equals the derived
PDA; mismatch aborts.
The script writes an upgrade buffer with the same .so bytes via
solana program write-buffer, transfers buffer authority to the vault
PDA, composes a one-instruction upgrade proposal via the
buildUpgradeIx helper in scripts/src/bpf-loader-ix.ts and the
createUpgradeProposal helper in scripts/src/squads.ts, and submits
the proposal-creation transaction signed by the operator payer.
The proposal PDA, transaction PDA, buffer address, and Squads UI URL
are printed. The operator (with a quorum of co-signers) approves and
executes the proposal manually in the Squads UI. The script does not
poll Squads for approval state — that is a bin/program-deploy
responsibility. After the operator confirms execution, the script
polls the deployed program data sha until it matches the local .so
sha. Timeout aborts with the proposal PDA captured so the operator
can investigate.
A successful Phase 3 proves that:
- The Squads vault PDA holds upgrade authority.
- A multisig quorum can sign and execute an upgrade end-to-end.
- The on-chain bytes after a Squads-mediated upgrade match the locally built .so.
After Phase 3 succeeds, the script logs the witnessed key-destruction
procedure to stderr and prompts the operator to confirm they have
read it. The script does NOT destroy the keypair file; the lifecycle
of target/deploy/flex-keypair.json (relocation between Phases 1–3,
destruction after Phase 3) is the operator's responsibility.
The procedure is:
- Identify the keypair file at target/deploy/flex-keypair.json (or wherever the operator relocated it). Confirm the file is the active program keypair, not an unrelated wallet file.
- Conduct the destruction with a second team member present as witness.
- Cryptographically wipe the file. Choose the command matching the host operating system:
# Linux
shred -uvz target/deploy/flex-keypair.json
# macOS
rm -P target/deploy/flex-keypair.json
- Verify the file no longer exists:
test ! -e target/deploy/flex-keypair.json && echo "destroyed"
- Record in the operational runbook the date and time (UTC), the witness's name, the destruction method (shred or rm -P), the program ID, and the cluster of the initial deploy.
The committed pubkey under keypairs/flex-program.pub and the
declare_id! in programs/flex/src/lib.rs continue to reference the
destroyed keypair's public half; only the secret half is destroyed.
After Phase 4, Squads is the sole holder of upgrade authority, and
program-ID retirement becomes permanent — the program identity can
no longer be re-created from the original secret.
Devnet is the proving ground. Custody, procedure, and key shape on devnet match mainnet — no shortcuts. The operator-payer keypair, the three multisig members, the time-lock value, the verify mode, and the GPG signing key are all expected to be the same kind of long-lived, real-custody artifact on devnet as they are on mainnet. Throwaway airdrop-funded file keypairs are explicitly the wrong custody shape for devnet rehearsal: a rehearsal that skips the custody step does not actually rehearse the release.
After the initial deploy has succeeded and the Squads vault PDA holds
upgrade authority for both clusters, every subsequent release runs
through bin/program-deploy <cluster>. The script is the single entry
point; do not invoke solana program upgrade, solana-verify, or any
of the helper subcommands directly during a release.
OPERATOR_PAYER_KEYPAIR=/path/to/payer.json \
MAINNET_RPC_URL=https://my-paid-rpc.example.com/... \
FLEX_RELEASE_GPG_KEY=<key-id-or-fingerprint> \
bin/program-deploy mainnet
FLEX_RELEASE_GPG_KEY is required: signArtifact refuses to fall
back to gpg's default identity, because falling back would silently
sign release artifacts with whatever key the operator's gpg keyring
elects, producing an artifact whose signer is ambient rather than
committed. Use the key fingerprint or any unambiguous user-id that
gpg --local-user accepts. If the key is not available the
publish-release step (step 15) aborts before any artifact is
attached to the GitHub Release.
Available flags:
- --verify-mode=batched|separate: overrides the per-cluster default from scripts/src/squads.config.ts. Read the next subsection before changing this.
- --payer <signer>: any Solana signer URL (filesystem path, usb://ledger?key=…, or env-sourced). Defaults to $OPERATOR_PAYER_KEYPAIR.
- --priority-fee <microlamports>: overrides the sampled fee.
- --rpc-url <url>: overrides the per-cluster default.
- --allow-downgrade: skips the monotonic-version guard. Reserved for bin/program-rollback; do not pass it from bin/program-deploy in normal operation.
- --tag <tag>: release tag for the GitHub Release artifact set; defaults to the latest annotated git tag in the working tree.
The script runs sixteen ordered phases, marked in
scripts/src/deploy.ts by // ---- Step <N>: <name> ---- block
comments (Step 10 through Step 160 in increments of 10). The
operator-visible lifecycle is:
- Flag parsing and config resolution: prints the cluster, RPC URL, program ID, multisig, vault PDA, verify mode, time-lock, and downgrade flag. State file path is announced (see below).
- Prerequisite check: solana, solana-verify, gpg, and either gh or GITHUB_TOKEN must be available.
- Build: solana-verify build produces target/deploy/flex.so.
- Local sha256: recorded in the state file.
- Compare-to-deployed: if the deployed program already matches the local .so, the script exits cleanly with no proposal creation.
- Duplicate-proposal guard: hard abort if any vault proposal targeting the program is open (Draft / Active / Approved). There is no override flag; resolve the existing proposal in the Squads UI before retrying.
- Monotonic-version guard: reads the deployed FLEX_VERSION string and asserts the local Cargo.toml version is strictly greater. Skipped only when --allow-downgrade is supplied.
- Priority-fee sampling: getRecentPrioritizationFees is queried against the program, multisig, and vault PDA; the script uses the p75 of returned fees. Override with --priority-fee.
- Write buffer: solana program write-buffer with the operator payer as the buffer authority. The buffer address is parsed from stdout.
- Transfer buffer authority: solana program set-buffer-authority reassigns the buffer to the Squads vault PDA.
- Assert buffer authority: the script re-reads the buffer account and verifies the authority field equals the vault PDA.
- Size guard: in batched mode, projects the execute-tx wire size and aborts if it exceeds 1100 bytes. See the next subsection.
- Compose proposal: builds the upgrade (and, in batched mode, verify-init) instructions, wraps them via createUpgradeProposal, and prints the proposal PDA, transaction PDA, Squads UI URL, and the earliest legal execute time (now + multisig timeLock). The script then pauses: the operator must submit the proposal-creation transaction (signed by the proposer) and approve + execute the proposal in the Squads UI before typing submitted at the prompt.
- Polling: after the operator confirms, the script polls the on-chain program data sha until it matches the local .so sha. The Squads tx signature is not a success signal; the post-execution sha match is. Backoff is exponential with full jitter, capped at 60 s, with a 30-minute overall deadline.
- Publish release: a GitHub Release is created with the .so, IDL JSON, sha256 file, state file, and release metadata, each accompanied by a detached GPG signature.
- OtterSec submission: batched mode only. Separate mode leaves verify-init to bin/program-verify <cluster>.
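The polling schedule in the lifecycle above (exponential backoff with full jitter, 60 s cap, 30-minute deadline) can be sketched as follows. The cap and deadline come from the text; the 1-second base delay, the function names, and the injected check / sleep callbacks are assumptions made for illustration and are not taken from scripts/src/deploy.ts.

```typescript
const BASE_DELAY_MS = 1_000; // assumption: the starting delay is not stated
const MAX_DELAY_MS = 60_000; // 60 s cap
const DEADLINE_MS = 30 * 60_000; // 30-minute overall deadline

// Full jitter: pick uniformly in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Poll `check` until it reports a match or the deadline passes.
async function pollUntil(
  check: () => Promise<boolean>,
  sleep: (ms: number) => Promise<void>,
  now: () => number = Date.now,
): Promise<boolean> {
  const start = now();
  for (let attempt = 0; now() - start < DEADLINE_MS; attempt++) {
    if (await check()) return true; // e.g. deployed sha equals local sha
    await sleep(backoffDelayMs(attempt));
  }
  return false; // caller treats this as a timeout
}
```

Full jitter spreads retries across the whole window rather than clustering them at the exponential boundary, which keeps load on the RPC endpoint even.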
The Solana hard transaction size limit is 1232 bytes. The Squads
execute wrapper adds signature and account-meta overhead beyond the
inner instructions, so bin/program-deploy enforces an 1100-byte
guard on the projected execute-tx size (132 bytes of headroom).
In batched mode the proposal carries three instructions: upgrade,
close_buffer, and verify-init. When verify-init's account list
pushes the projected size over 1100 bytes, the script aborts with:
Projected execute-tx size: <N> bytes (limit: 1100). Re-run with --verify-mode=separate.
There is no silent fallback. The operator re-runs bin/program-deploy <cluster> --verify-mode=separate. In separate mode the first
proposal carries only upgrade + close_buffer; after the
post-execution sha match, the OtterSec submission and the second
proposal carrying verify-init are handled by
bin/program-verify <cluster>.
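The guard's arithmetic and abort message can be sketched as below. The two byte limits and the message text come from this section; the constant and function names are hypothetical, and the projected size is taken as an input because the projection method itself is not described here.

```typescript
const TX_HARD_LIMIT_BYTES = 1232; // Solana transaction size limit
const EXECUTE_TX_GUARD_BYTES = 1100; // guard enforced on the projected execute tx
const HEADROOM_BYTES = TX_HARD_LIMIT_BYTES - EXECUTE_TX_GUARD_BYTES; // 132

type VerifyMode = "batched" | "separate";

// Hypothetical helper: abort in batched mode when the projection exceeds the guard.
function checkExecuteTxSize(projectedBytes: number, mode: VerifyMode): void {
  if (mode !== "batched") return; // the guard is described for batched mode only
  if (projectedBytes > EXECUTE_TX_GUARD_BYTES) {
    throw new Error(
      `Projected execute-tx size: ${projectedBytes} bytes ` +
        `(limit: ${EXECUTE_TX_GUARD_BYTES}). Re-run with --verify-mode=separate.`,
    );
  }
}
```

A projection of exactly 1100 bytes passes; 1101 aborts with the message shown above.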
The default sampler queries getRecentPrioritizationFees against the
program, multisig, and vault PDA (the accounts the upgrade tx
write-locks) and uses the 75th percentile of returned fees. p50 is
too low under congestion (the upgrade lands intermittently); p99
over-pays without improving landing time.
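Taking the 75th percentile of a fee sample can be sketched as below. The nearest-rank definition is an assumption; the text does not say which percentile method the sampler uses, and the function name is hypothetical.

```typescript
// Nearest-rank percentile over a list of fee samples (micro-lamports per CU).
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no fee samples returned");
  const sorted = [...values].sort((a, b) => a - b);
  // 1-based nearest rank, clamped so p = 0 still selects the smallest sample.
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  const picked = sorted[rank - 1];
  if (picked === undefined) throw new Error("percentile out of range");
  return picked;
}

// Example: p75 of four samples is the third-smallest value.
const sampledFee = percentile([400, 100, 300, 200], 75); // 300
```

An empty sample is treated as an error rather than a zero fee, so the caller has to fall back to --priority-fee explicitly.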
Override manually via --priority-fee <microlamports> when:
- Recent fee samples are stale (the cluster has just resumed after an outage and the cache has not yet warmed up).
- The operator wants to land the buffer write deterministically during a known congestion event.
Symptoms that suggest raising the fee on retry: write-buffer
visibly dropping transactions in its log, or the Squads execute
transaction landing only after multiple submissions in the Squads
UI.
scripts/src/cluster.config.ts supplies the default URL per cluster.
--rpc-url <url> overrides per invocation.
- Mainnet requires a paid RPC provider with QUIC support. The write-buffer phase submits 250+ transactions, and the public RPC silently drops them under congestion; running bin/program-deploy mainnet against api.mainnet-beta.solana.com is not supported. Set MAINNET_RPC_URL in the environment, or pass --rpc-url.
- Devnet accepts the public RPC at api.devnet.solana.com. Paid providers still help under congestion but are not required.
Every state transition is written to
target/program-deploy/<cluster>-<timestamp>.state.json. This file
is the forensic evidence the operator inspects when recovery is
necessary. Do not delete state files until the corresponding GitHub
Release has been verified end-to-end.
Failure modes and the matching recovery procedure:
Symptom: the script exited at step 13 (compose_proposal) and a
proposal is sitting in the Squads UI awaiting approvals, or the
operator typed something other than submitted at the prompt.
Recovery:
- Inspect proposal_pda and squads_url from the state file.
- In the Squads UI, either gather the remaining approvals and execute, or cancel the proposal.
- If executed: re-run bin/program-deploy <cluster> from scratch. The deploy gate at step 5 will detect that the program already matches the local .so and exit cleanly. The post-execution verify-init / OtterSec submission can be completed via bin/program-verify, and the GitHub Release can be published by re-running with --tag <tag>. The publish-release step is idempotent on the GitHub side if the assets are unchanged; if it reports an existing release, attach the missing artifacts manually via gh release upload.
- If cancelled: re-run bin/program-deploy <cluster> cleanly. The duplicate-proposal guard at step 6 will block until the cancellation is fully on-chain.
Symptom: post-execution sha matched (step 14 succeeded) but the OtterSec submission at step 16 returned an error, leaving the program upgraded without an associated verify record.
Recovery:
- Inspect execution_slot and the state file to confirm the upgrade actually landed.
- Run bin/program-verify <cluster>. It re-runs the OtterSec submission against the same program ID and vault PDA.
- If OtterSec submission keeps failing, contact OtterSec; the on-chain upgrade is unaffected.
Symptom: step 9 (write-buffer) aborted partway through. The
solana program write-buffer output may report a buffer address but
the buffer is not fully written.
Recovery:
- Identify the partially-written buffer via the state file (buffer_address, if recorded) or solana program show --buffers --buffer-authority <operator-pubkey>.
- Reclaim the buffer rent: solana program close <buffer-address> --recipient <operator-pubkey>.
- Re-run bin/program-deploy <cluster>. Steps 1–8 will replay safely; step 9 will create a fresh buffer.
Symptom: bin/program-deploy ran past step 10 (transfer buffer authority → vault PDA) but the upgrade proposal was never executed
— it was cancelled, abandoned, or composed with the wrong bytes and
manually retired. The buffer remains on-chain, owned by the
upgradeable loader, with the vault PDA as its authority. The
operator's solana program close cannot reclaim it because the
operator is no longer the buffer authority.
Recovery:
- Identify the orphaned buffer via the state file (buffer_address) or solana program show --buffers --buffer-authority <vault-pda>.
- Run bin/program-close-buffer <cluster> <buffer-pubkey>. The script composes a Squads vault proposal carrying a single bpf_loader_upgradeable::Close instruction (the 3-account buffer-close form) against the named buffer, with the vault PDA as both authority and rent recipient. The default recipient is the vault rather than an operator wallet so the reclaimed rent stays under multisig custody; drain it back to the payer (or any treasury) afterward via bin/program-vault-drain.
- Approve and execute the proposal in the Squads UI or via bin/program-squads-approve + bin/program-squads-execute.
Pass --recipient <pubkey> if the rent should land somewhere other
than the vault (rare; the default is correct for most cleanups).
This is the only sanctioned path for closing a vault-authority
buffer — there is no operator-direct equivalent of solana program close against a buffer the operator does not own.
Symptom: step 14 timed out without the deployed sha matching the
local .so sha. The proposal may have executed but the deployed
bytes differ from what was built locally.
Recovery:
- Inspect the state file's proposal_pda and local_sha.
- Re-read the deployed sha against the program ID via bun scripts/src/deploy.ts compare-deployed <cluster> <program-id> <expected-sha>. If the comparison shows a different sha than the locally built one, the proposal upgraded the program to bytes other than what was built; this is the most serious recovery scenario.
- Verify the local build is deterministic by re-running solana-verify build and recomputing the sha. If the reproducible-build sha differs, an external input changed (Rust toolchain, dependency lock, etc.); reproduce the original build from the tagged commit.
- If the deployed bytes are objectively wrong (a third party approved a stale buffer), the correct response is a rollback proposal via bin/program-rollback <cluster> and a security review of the multisig membership.
bin/program-rollback <cluster> <tag> --yes-i-want-to-downgrade
reverts the deployed Flex program to the .so published by a prior
GitHub Release. Rollback is the exception, not the norm; the
default response to a broken release is a forward-fix.
Prefer a forward-fix whenever the bug is correctable with code changes:
- Author the fix on main.
- bin/release patch (or minor / major as appropriate) cuts a new tag whose version is higher than the broken one.
- bin/program-deploy <cluster> deploys the new tag. The monotonic-version guard accepts the upgrade because the new FLEX_VERSION is strictly greater than the on-chain value.
A forward-fix preserves a clean, monotonic on-chain version history and is the auditable, recoverable path for almost every defect.
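The strictly-greater comparison that lets a forward-fix through can be sketched as below. This is an assumption-laden illustration: it treats FLEX_VERSION as a plain MAJOR.MINOR.PATCH string with no pre-release suffix, and the function names are hypothetical rather than taken from scripts/src/deploy.ts.

```typescript
// Parse "0.3.1" (optionally prefixed with "v") into numeric components.
function parseVersion(v: string): [number, number, number] {
  const m = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(v.trim());
  if (!m) throw new Error(`unparseable version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// True only when `local` is strictly newer than `deployed`.
function isStrictlyGreater(local: string, deployed: string): boolean {
  const [aMajor, aMinor, aPatch] = parseVersion(local);
  const [bMajor, bMinor, bPatch] = parseVersion(deployed);
  if (aMajor !== bMajor) return aMajor > bMajor;
  if (aMinor !== bMinor) return aMinor > bMinor;
  return aPatch > bPatch;
}

// Hypothetical guard: equal or older versions abort unless a downgrade is allowed.
function assertMonotonic(local: string, deployed: string, allowDowngrade = false): void {
  if (allowDowngrade) return;
  if (!isStrictlyGreater(local, deployed)) {
    throw new Error(`local version ${local} is not greater than deployed ${deployed}`);
  }
}
```

Comparing components numerically matters: a string comparison would rank 0.9.9 above 0.10.0.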
bin/program-rollback is reserved for situations a forward-fix
cannot address quickly enough:
- A live incident where the current release is actively unsafe and there is no fix ready to ship in the time available.
- A regression discovered only after deploy that requires reverting to the last-known-good .so while the underlying issue is investigated.
- Recovery scenarios documented under "Post-execution sha mismatch" above, where the deployed bytes do not match what was built and the prior release's published .so is the trusted reference.
If a forward-fix is feasible within the incident's time budget, take it. Rollback is intentionally inconvenient.
bin/program-deploy exposes an --allow-downgrade flag that
skips the monotonic-version guard. That flag is not part of the
normal release flow, and bin/program-deploy's help text does not
present it as a normal-flow option. bin/program-rollback is the
only sanctioned caller; do not invoke bin/program-deploy --allow-downgrade directly.
Single-sourcing keeps every downgrade routed through the same gauntlet of banner, operator confirmation, artifact fetch, and sha-verification controls. A downgrade that bypasses this script bypasses those controls.
The on-chain FLEX_VERSION is embedded in the program by
programs/flex/Cargo.toml at build time. When a rollback lands,
that value moves backward: the version reported by the program
account becomes the version associated with the rollback tag,
which is older than the version that was previously deployed.
Consequences:
- A naive reader of the on-chain version cannot reconstruct history. The rollback tag's version equals some earlier release's version; the fact that the program was once at a higher version is not visible from program data alone.
- Reconstructing the full deploy history requires cross-referencing the GitHub Release timeline (tag creation order) with the rollback tag's state.json artifact (which captures the rollback's execution slot) and the prior forward tag's state.json.
- Always preserve the rollback's state file alongside the prior forward tag's state file in the incident record. Together they document "what version was on-chain before, what version is on-chain now, and when the transition happened."
The script will refuse to fetch the .so, refuse to invoke
bin/program-deploy, and refuse to make any network call without
the explicit --yes-i-want-to-downgrade flag. The flag is
intentionally verbose; muscle memory and abbreviated aliases should
not be able to trigger a downgrade.
Once confirmed, the script:
- Prints a prominent downgrade banner naming the cluster, the rollback tag, and the fact that the monotonic-version guard is being skipped.
- Fetches flex.so and flex.so.sha256 from the named GitHub Release via scripts/src/rollback-fetch.ts (which delegates the network calls to fetchArtifact from scripts/src/github-release.ts).
- Computes the sha256 of the downloaded bytes and compares it to the published value in flex.so.sha256. A mismatch is a hard abort with both shas reported in the error; there is no retry path and no override flag.
- Re-prints the downgrade banner immediately before invoking bin/program-deploy <cluster> --allow-downgrade --so-path <verified-path> --tag <tag>.
- Logs the downgrade prominently at every step via logger.warning (the @faremeter/logs warning channel), so the operator's terminal scrollback unambiguously records that a downgrade, not a normal release, was performed.
--so-path <path> substitutes step (2) above with a local file the
operator has already staged. The script still computes the sha256 of
the supplied bytes and logs it as part of the audit trail, but no
cross-check against a published flex.so.sha256 is possible — the
operator is asserting these are the right bytes (a stronger version
of the --yes-i-want-to-downgrade gate).
This path is intended for:
- Replays from a mirror or operator backup when the GitHub Release has been deleted, made private, or is otherwise unreachable.
- Rollbacks against releases that predate the publish-release step in bin/program-deploy (no flex.so.sha256 was ever published).
- Devnet dress rehearsals where no GitHub Release was published because the rehearsal program ID is throwaway.
Mainnet rollbacks for live incidents should still take the default
fetch path so the published sha256 cross-check runs. Reach for
--so-path only when fetching is not an option.
bin/program-verify <cluster> --commit <rev> is the operator-facing
entry point for the OtterSec verification step of a Flex program
release. It composes and submits a standalone Squads proposal carrying
only the otter_verify::Initialize instruction (signed by the
operator payer just like the upgrade proposal in bin/program-deploy),
polls for the resulting otter_verify PDA once the multisig members
approve and execute it, and then POSTs the verification job to
verify.osec.io. The script is intentionally separate from
bin/program-deploy so the verify step can be re-run independently
when it fails for transient reasons.
--commit <rev> is required and accepts any git revision (tag, full
sha, branch). The resolved commit hash is what OtterSec pins the
verification record to; an unpinned record (the original default of
empty-string) would drift as the branch advances, so the flag refuses
to default. For the standard release path, pass the tag that
bin/program-deploy published: bin/program-verify mainnet --commit v0.3.0.
Three operator scenarios call for this script:
- Separate-mode follow-up. bin/program-deploy --verify-mode=separate intentionally leaves the OtterSec submission for a later step so the upgrade proposal stays under the 1232-byte transaction-size limit. Once that upgrade proposal executes, run bin/program-verify <cluster> to land the standalone verify-init proposal and submit the job.
- Retry after a failed OtterSec submit. If the HTTP call to verify.osec.io/verify-with-signer errored (network glitch, 5xx, rate limit) after the verify-init proposal already executed, re-run bin/program-verify <cluster>. The script detects the existing otter_verify PDA, skips proposal composition, and re-issues the HTTP submit.
- Manually re-queueing verification. If the verify-init proposal executed but the OtterSec worker never picked up the job (worker outage, queue backlog), re-run bin/program-verify <cluster> to re-issue the HTTP submit against the already-written PDA. No new proposal is composed.
The script does not touch program bytes or upgrade authority. It relies on these preconditions being already satisfied:
- The deployed program at programs/flex has been upgraded to the release the operator intends to verify (run bin/program-deploy first, separately).
- The on-chain sha256 of the deployed program matches the local build, exactly as bin/program-deploy would compute it. The OtterSec worker fetches the deployed bytes itself; any drift between deployed and source will surface as a worker mismatch rather than a script failure.
- The Squads multisig and vault PDA derived from scripts/src/squads.config.ts for <cluster> are the upgrade authority of the program and the configured uploader for the otter_verify PDA. The PDA seeds are ["otter_verify", vault-pda, program-id]; any other uploader produces a different PDA and the script's idempotency check will not detect it.
bin/program-verify reads the on-chain otter_verify PDA at
startup. If the PDA already exists for the vault uploader and
program id, the script:
- skips the duplicate-proposal guard,
- skips proposal composition entirely,
- jumps directly to the OtterSec HTTP submit.
If the PDA does not exist, the script first runs the duplicate-proposal guard (rejecting if any open vault proposal targets the program — that proposal must be resolved through the Squads UI before retry), then composes a single-instruction verify-init proposal, polls for execution, and finally submits the OtterSec job.
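The branching described above can be sketched as a small pure function. This is an illustration only: the step names and the planVerifyRun helper are hypothetical and are not taken from the script's source.

```typescript
// Hypothetical step names; the real script's internals are not shown here.
type VerifyStep =
  | "duplicate-proposal-guard"
  | "compose-verify-init-proposal"
  | "poll-for-execution"
  | "submit-ottersec-job";

// Returns the ordered steps for one bin/program-verify run, given whether
// the otter_verify PDA already exists for the vault uploader and program id.
function planVerifyRun(pdaExists: boolean): VerifyStep[] {
  if (pdaExists) {
    // Idempotent path: no guard, no proposal, straight to the HTTP submit.
    return ["submit-ottersec-job"];
  }
  return [
    "duplicate-proposal-guard",
    "compose-verify-init-proposal",
    "poll-for-execution",
    "submit-ottersec-job",
  ];
}
```

Both paths end with the OtterSec submit, which is why re-running the script after a failed HTTP call is safe.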
bin/program-close <cluster> --i-want-to-close-this-program composes
a Squads vault proposal that invokes
bpf_loader_upgradeable::Close against the Flex program's
program-data account. On execution the lamports backing program-data
are reclaimed to the multisig vault and the program becomes
permanently uninvokable — the program ID cannot be re-deployed
because the program-data PDA was consumed by the original deploy and
the loader refuses a fresh deploy under the same ID.
This is the terminal retirement primitive. The default release flow never calls it. Reach for it only when:
- A devnet rehearsal program ID is being torn down at the end of the rehearsal (returns the rent so the next rehearsal starts with full payer balance).
- An old program ID is being sunset after migrating users to a new ID, and the operator deliberately wants to prevent any future invocation under the old ID.
- A security response requires permanent shutdown of a specific program ID: for example, the deployed bytes are known compromised, a forward-fix is not viable, and the rollback path is unsafe because no prior known-good .so exists for that ID.
The flow mirrors bin/program-deploy's shape: hard preflight,
duplicate-proposal guard, proposal composition + submission,
operator confirmation that approve + execute has happened
(via the Squads UI or bin/program-squads-approve +
bin/program-squads-execute), then poll for the on-chain close to
land.
program-close reads ProgramData.upgrade_authority on-chain and
asserts it equals the configured Squads vault PDA before composing
the proposal. A program whose authority is anything else cannot be
closed via this flow — there is no point composing a proposal the
multisig has no standing to execute. If the assertion fails the
script names both the expected vault PDA and the actual on-chain
authority in the error.
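A minimal sketch of that preflight assertion, assuming both values are available as base58 strings. The function name and the error wording are hypothetical; only the behaviour (compare, then name both values on failure) comes from the description above.

```typescript
// Hypothetical helper: throws unless the on-chain upgrade authority is the
// configured Squads vault PDA. A null authority models an immutable program
// (ProgramData.upgrade_authority is optional on-chain).
function assertVaultIsUpgradeAuthority(
  expectedVaultPda: string,
  onChainAuthority: string | null,
): void {
  if (onChainAuthority !== expectedVaultPda) {
    const actual = onChainAuthority === null ? "none" : onChainAuthority;
    throw new Error(
      `upgrade authority mismatch: expected vault PDA ${expectedVaultPda}, ` +
        `on-chain authority is ${actual}`,
    );
  }
}
```

Failing before composition keeps a proposal that can never execute out of the multisig's queue.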
buildCloseProgramDataIx takes a recipient that receives the
program-data rent on close. The script always supplies the vault PDA
as both the authority (Squads-signed) and the recipient. Routing the
rent to the multisig vault rather than an individual operator keeps
the reclaimed lamports under the same collective control as the
program had — there is no operator who personally collects a payout
from a multisig-governed close.
Once executed, the program is gone. There is no "un-close" — the
program-data PDA can never be re-allocated under the same program
ID. The --i-want-to-close-this-program flag is intentionally
verbose to defeat muscle memory and abbreviated aliases. Operators
who reach for this script should pair it with an out-of-band
witness, the same way bin/program-initial-deploy Phase 4 does for
keypair destruction.
bin/program-vault-drain <cluster> [--recipient <pubkey>] composes
a Squads vault proposal containing a single
SystemProgram.transfer instruction moving the vault PDA's full
balance to the recipient. When --recipient is omitted, the
default is $OPERATOR_PAYER_KEYPAIR's pubkey.
The vault PDA is system-owned with zero data, so a full-balance
drain is safe — there is no rent-exempt reserve to preserve. The
proposal still has to be approved and executed by the multisig's
threshold of members via bin/program-squads-approve and
bin/program-squads-execute (or the Squads UI).
The two operational scenarios are:
- Post-close rent reclaim. bin/program-close routes the released ProgramData rent into the vault (see "Why the recipient is the vault, not an operator" above). bin/program-vault-drain is the matching primitive for pulling that rent back to a human-controlled wallet once the close has executed.
- Cleanup between operations. Buffer writes during a release cycle accumulate rent in the vault until the upgrade executes (at which point the buffer is consumed), but a release that ends in a vault-authority buffer needing bin/program-close-buffer recovery, or a devnet rehearsal that intentionally cycles through multiple buffer writes, leaves rent in the vault that the operator may want to recycle into the payer.
Unlike bin/program-deploy, bin/program-rollback,
bin/program-close, bin/program-verify, and
bin/program-initial-deploy — all of which run
guardNoOpenProposalsForProgram to refuse composing a new proposal
while an open proposal targets the program — bin/program-vault-drain
has no equivalent guard. The
existing guards filter listOpenProposals by program id, and a
vault-drain proposal targets the vault account rather than any
program; the Squads SDK does not expose a "list all open proposals
on this multisig" query that the guard could use without that
filter.
Practical consequence: re-running bin/program-vault-drain while a
previous drain proposal is still pending composes a second pending
drain for the same balance. Both proposals would attempt to transfer
the entire vault on execute; one would succeed and the second would
land an empty transfer (zero lamports left to move). Resolve any
stuck pending drain via bin/program-squads-cancel or the Squads UI
before composing a new one.
The recipient default is the operator payer because the most common
caller is an operator reclaiming their own deposited rent. Pass
--recipient <pubkey> when the lamports should land in a treasury
wallet, a different team member's keypair, or any address other
than the operator's. The recipient is validated as a base58 pubkey
before the proposal is composed; invalid input fails before any
on-chain transaction is sent.
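The default-and-validate behaviour can be sketched without any Solana dependency. The helper names below are hypothetical, and the check shown (base58 text that decodes to exactly 32 bytes) is an assumption about what "validated as a base58 pubkey" means; the real script may delegate this to a library.

```typescript
const BASE58_ALPHABET =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Decodes base58 text to bytes; returns null on any character outside the alphabet.
function base58Decode(text: string): number[] | null {
  const bytes: number[] = []; // little-endian accumulator
  for (let i = 0; i < text.length; i++) {
    let carry = BASE58_ALPHABET.indexOf(text[i]);
    if (carry < 0) return null;
    for (let j = 0; j < bytes.length; j++) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) {
      bytes.push(carry & 0xff);
      carry >>= 8;
    }
  }
  // Each leading '1' encodes one leading zero byte.
  for (let i = 0; i < text.length && text[i] === "1"; i++) bytes.push(0);
  return bytes.reverse();
}

// A Solana pubkey is 32 bytes once decoded.
function isValidPubkey(text: string): boolean {
  const decoded = base58Decode(text);
  return decoded !== null && decoded.length === 32;
}

// Hypothetical resolver: --recipient wins, otherwise the operator payer's pubkey.
function resolveRecipient(
  recipientFlag: string | undefined,
  operatorPayerPubkey: string,
): string {
  const recipient = recipientFlag === undefined ? operatorPayerPubkey : recipientFlag;
  if (!isValidPubkey(recipient)) {
    throw new Error(`invalid recipient pubkey: ${recipient}`);
  }
  return recipient;
}
```

Because the check is purely local, a typo in --recipient is rejected before any RPC call or proposal composition.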
OPERATOR_PAYER_KEYPAIR accepts either a filesystem path (back-compat
with the file-keypair flow) or a Solana signer URL. The supported URL
forms are:
- usb://ledger?key=N uses derivation path 44'/501'/N' (the Solana CLI default for the same URL).
- usb://ledger?key=N&change=M appends a fourth /M' level for operators who keep multiple deploy keys on the same device.
file:// URLs are intentionally not accepted. The solana CLI does
not recognise them at --keypair, so accepting them in the TypeScript
layer would create a quiet capability hole at the first solana program write-buffer shellout. Use a plain path instead.
The same value passes verbatim through to the solana CLI shellouts
(solana program write-buffer, set-buffer-authority,
set-upgrade-authority, and program deploy), each of which natively
understands usb://ledger?key=N.
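As a rough sketch of the accepted forms, the function below classifies an OPERATOR_PAYER_KEYPAIR value as a plain path or a Ledger URL and computes the derivation path described above. The parseSignerSpec name, the SignerSpec type, and the error messages are hypothetical; the tooling's actual parser is not shown in this document.

```typescript
type SignerSpec =
  | { kind: "file"; path: string }
  | { kind: "ledger"; derivationPath: string };

function parseSignerSpec(value: string): SignerSpec {
  if (value.indexOf("file://") === 0) {
    throw new Error("file:// URLs are not accepted; use a plain path instead");
  }
  if (value.indexOf("usb://") !== 0) {
    // Anything that is not a signer URL is treated as a filesystem path.
    return { kind: "file", path: value };
  }
  const prefix = "usb://ledger?";
  if (value.indexOf(prefix) !== 0) {
    throw new Error(`unsupported signer URL: ${value}`);
  }
  // Minimal query-string parse: key=N, optionally followed by &change=M.
  const params: Record<string, string> = {};
  for (const pair of value.slice(prefix.length).split("&")) {
    const parts = pair.split("=");
    params[parts[0]] = parts.length > 1 ? parts[1] : "";
  }
  const key = params["key"];
  const change = params["change"];
  if (key === undefined || !/^\d+$/.test(key)) {
    throw new Error(`missing or non-numeric key in: ${value}`);
  }
  if (change !== undefined && !/^\d+$/.test(change)) {
    throw new Error(`non-numeric change in: ${value}`);
  }
  const base = `44'/501'/${key}'`;
  return {
    kind: "ledger",
    derivationPath: change === undefined ? base : `${base}/${change}'`,
  };
}
```

For example, parseSignerSpec("usb://ledger?key=2&change=1") yields the derivation path 44'/501'/2'/1'.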
- Install the Solana app on the Ledger device (Ledger Live → "Manager" → install "Solana").
- Open the Solana app on the device. The transport opens cleanly only while the app is running.
- Enable blind signing in the Solana app's on-device settings: Settings → Allow blind signing → Enabled. The Ledger Solana app does not natively decode Squads instructions, so every release operation requires this setting. The first signing call without it fails with the human-readable message "Missing a parameter. Try enabling blind signature in the app" surfaced by @ledgerhq/hw-app-solana.
When LedgerSigner.open runs it reads the on-device blind-signing
flag via getAppConfiguration and prints a stderr warning at
transport-open time if the flag is off, so the operator catches the
misconfiguration before the first signing call rather than at the
device-confirm prompt.
Before any LedgerSigner signing call, the release tooling prints a verification block to stderr that mirrors what the Ledger Solana app displays in blind-sign mode. The block looks like:
================ Ledger blind-sign verification ================
About to sign: vaultTransactionCreate + proposalCreate (initial-deploy)
Targeting Squads program: SQDS4ep65T869zMMBKyuUq6aD6EgTu8psMjkvj52pCf
Instructions (in order):
1. SQDS4ep65T869zMMBKyuUq6aD6EgTu8psMjkvj52pCf :: vaultTransactionCreate
2. SQDS4ep65T869zMMBKyuUq6aD6EgTu8psMjkvj52pCf :: proposalCreate
Multisig: <multisig-pda>
Vault PDA: <vault-pda>
Transaction PDA: <tx-pda>
Proposal PDA: <proposal-pda>
Transaction index: 7
Compiled message sha256: f6d9...c4e2
Confirm this hash matches the one shown on your Ledger device.
================================================================
The Compiled message sha256 is sha256(message.serialize()) — the
same hash the Ledger Solana app computes for the bytes it is about to
sign. Visually comparing this hash against the device screen before
approving is the operator's defence against an attacker substituting
bytes between proposal composition and the device-confirm prompt.
This is the same posture Squads web-UI Ledger users have today, with
the addition that the CLI also names the multisig, vault PDA,
transaction index, and decoded instruction discriminator so the
operator can confirm "I am voting on tx #N for the right multisig"
rather than just "the hash is what I expect".
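For operators who want to reproduce the printed hash independently, a minimal sketch using Node's built-in crypto module follows. It assumes the serialized message bytes (the output of message.serialize()) are already in hand; the compiledMessageSha256 name is illustrative rather than the tooling's own.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded sha256 over the exact bytes that will be sent to the device.
function compiledMessageSha256(serializedMessage: Uint8Array): string {
  return createHash("sha256").update(serializedMessage).digest("hex");
}

// Example with placeholder bytes standing in for a serialized message.
const placeholderBytes = Uint8Array.from([0x61, 0x62, 0x63]);
console.error(`Compiled message sha256: ${compiledMessageSha256(placeholderBytes)}`);
```

Any change to the message bytes between composition and signing changes this value, which is what makes the visual comparison meaningful.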
bin/program-deploy and bin/program-initial-deploy submit 250+
write-buffer transactions during a release. Each transaction is a
separate solana program write-buffer chunk, and each one prompts
the Ledger device for individual approval when the keypair is a
Ledger URL. Expect to confirm every chunk by hand; if this is the
wrong UX for a particular operator, supply a plain file-keypair path
(file:// URLs are not accepted) for buffer writes and reserve the
Ledger for the proposal-create signing.
Subsequent steps — set-buffer-authority, the proposal-create
transaction, and (eventually) set-upgrade-authority during the
initial deploy — are single-transaction signing events.
sendWeb3Tx fetches a recent blockhash, builds the transaction,
prints the blind-sign verification block, and then awaits the
signer's confirmation. When the signer is a Ledger device, that
await blocks on the operator visually comparing the printed sha256
hash against the device screen and then approving on-device. A
recent blockhash is valid for ~150 slots, which is roughly 60 seconds
on mainnet.
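The 60-second figure is simple arithmetic, sketched below. The 400 ms slot time is an assumption (Solana's nominal target); real slot times vary, so treat the result as an estimate. The function names are illustrative.

```typescript
const BLOCKHASH_VALID_SLOTS = 150; // from the paragraph above
const ASSUMED_SLOT_MS = 400; // assumption: nominal target slot time

// Total signing budget for one recent blockhash, in milliseconds.
function blockhashBudgetMs(
  slots: number = BLOCKHASH_VALID_SLOTS,
  slotMs: number = ASSUMED_SLOT_MS,
): number {
  return slots * slotMs;
}

// Time left to approve on-device, given when the blockhash was fetched.
function remainingBudgetMs(fetchedAtMs: number, nowMs: number): number {
  return Math.max(0, blockhashBudgetMs() - (nowMs - fetchedAtMs));
}
```

With these assumptions, an operator who spends 45 seconds comparing the hash has about 15 seconds left to approve on the device.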
A careful comparison plus on-device approval can plausibly exceed
that budget — particularly on first-time release rehearsals or over
a slow VNC/RDP link. The visible failure when the budget elapses is
Blockhash not found from the cluster RPC, after which the proposal
must be re-composed from scratch (the SDK has already burned the
transaction index against the proposal pubkey it printed).
Operationally: do the hash comparison before the prompt appears (the
printed sha256 lands on stderr before the device confirmation
prompt; the operator can pre-verify and then click), and budget
re-runs into the rehearsal schedule. The downstream
bin/program-deploy recovery procedure for partial-execution states
covers this case explicitly under DEV.md > Partial-execution recovery.
The release tooling pins exact (no caret) versions of three official
@ledgerhq/* packages:
- @ledgerhq/hw-app-solana@7.10.2
- @ledgerhq/hw-transport-node-hid@6.33.2
- @ledgerhq/hw-transport@6.35.2
All three originate from the LedgerHQ/ledger-live monorepo, are
Apache-2.0 licensed, and are not deprecated. The hw-transport-node-hid
pin transitively pins node-hid@2.1.2, which fetches a prebuilt
native binary at install time. Prebuilts exist for darwin-arm64,
darwin-x64, and linux-x64 (glibc, N-API v3); other targets fall
back to a node-gyp rebuild that requires Python and a C++ toolchain.
Bun (the runtime this repository uses) implements N-API and loads
node-hid's native binding correctly in the current Bun version
recorded in bun.lock. If a future Bun upgrade regresses
native-binding resolution for node-hid, the documented fallback is to run
the affected CLI under Node directly:
cd scripts
npm install # forces prebuild-install under Node
node --experimental-vm-modules <script>
The @ledgerhq/* JavaScript is plain Node-compatible code; the only
Bun-specific risk is in node-hid's .node binary resolution. Do not
substitute community node-hid forks (node-hid-ng, etc.) — they are
not Ledger-supported and the pin is intentional.
Linux requires a udev rule granting the operator user read/write
access to the Ledger HID interface. Drop this file at
/etc/udev/rules.d/20-ledger.rules (the exact rules ship with Ledger
Live; this is one example):
SUBSYSTEMS=="usb", ATTRS{idVendor}=="2c97", MODE="0660", \
TAG+="uaccess", TAG+="udev-acl"
Then reload udev (sudo udevadm control --reload-rules && sudo udevadm trigger) and replug the device. macOS and Windows have no
equivalent step.