feat(sv2): Stratum V2 pool support#81

Draft
mcharles-square wants to merge 32 commits into main from feat/sv2

@mcharles-square mcharles-square commented Apr 24, 2026

Summary

Implements Stratum V2 support across proto-fleet per docs/stratum-v2-plan.md. Operators can create SV2 pools and assign them to a mixed fleet: native-SV2 miners connect directly, while SV1 miners reach SV2 pools through a bundled SRI translator proxy, with URL rewriting handled server-side at command build time. Pool Job Declaration stays reserved for v2.

What's in this PR

  1. Proto + SDK + DB foundations: PoolProtocol enum, proto3 presence on UpdatePoolRequest, typed SlotWarning / DeviceWarning enums, UpdateMiningPoolsMismatch detail, StratumV2SupportStatus on DeviceMetrics, migration 000037_add_protocol_to_pool, regenerated TS/Go/Python output.
  2. SV2 support package: server/internal/domain/sv2/ with typed Config, TCP-dial probe, full Noise NX handshake probe (github.com/flynn/noise), tproxy TCP health monitor.
  3. URL rewriter, preflight, per-device queue payloads: pure PoolURLsForDevice, shared preflight package consumed by the commit path, new EnqueuePerDevice queue variant so per-device URLs are baked into each queue row at commit time. Dispatch never re-evaluates them.
  4. StratumV2 config + deployment: Kong config block (under stratum-v2- prefix), main wiring, profile-gated sv2-tproxy Compose service with a digest-pinned SRI translator image, installer prompt, deployment-files/sv2/tproxy.toml template, live Docker-stack E2E test suite.
  5. Plugin rollout + UI + design doc: antminer / asicrs / proto / virtual plugin updates, pool form and settings page changes, PoolSelectionPage renders the FAILED_PRECONDITION typed mismatch from the commit response, and the full design plan.

Key design points worth reviewer attention

  • URL scheme is the single source of truth. An earlier iteration had protocol as a separate client-writable field on PoolConfig / UpdatePoolRequest / RawPoolInfo. Older clients sent mismatched (url, protocol) pairs that CEL rejected as "url scheme must match protocol". Removed the client-writable protocol field entirely — rewriter.ProtocolFromURL is the canonical derivation, and the DB column is written from the URL. The read-side Pool message keeps the protocol field so the UI can render a chip without re-parsing.
  • Canonical scheme is stratum2+tcp://, matching Braiins Pool's operator-facing docs (stratum2+tcp://HOST:PORT/AUTHORITY_PUBKEY). The CEL regex accepts the pubkey path suffix and rejects bare URLs on saved pools (CRUD); ValidatePool accepts bare URLs for the connectivity-only check path.
  • No preview RPC — commit-time preflight is the only gate. An earlier iteration on this branch added PreviewMiningPoolAssignment so the UI could disable Save before click. We dropped it: the same shared preflight runs server-side inside UpdateMiningPools and any mismatch surfaces synchronously as FAILED_PRECONDITION carrying the typed UpdateMiningPoolsMismatch detail. Removing the preview eliminates an unbounded RPC surface and removes preview/commit drift by construction. SV1 pool assignment never had a preview either, so the new SV2 path matches existing UX.
  • Preflight runs once per request against a consistent capability snapshot. The queue gained EnqueuePerDevice(map[int64][]byte) so commit writes per-device resolved URLs into each queue row; dispatch unmarshals and pushes straight to the plugin.
  • Noise handshake probe uses Noise_NX_25519_ChaChaPoly_BLAKE2s matching SRI 1.x. The initiator doesn't pre-load the pool key (NX delivers it over the wire); operator-supplied pubkey is compared against hs.PeerStatic() after the server presents its static key. Mismatch is a classic pinning failure with a specific error.
  • protoOS is SV1-only today. The current miner-firmware (crates/mcdd) uses the sv1_api Rust crate and has no SV2 wire code — PoolProtocol::StratumV2 exists in its RPC enum but only for reporting the URL scheme back. In practice every protoOS miner takes the tProxy path for SV2 pools in v1; the Proto plugin's capability probe is forward-looking and flips to Supported when firmware support lands.
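
The first design point (URL scheme as the single source of truth) can be illustrated with a small sketch. This is not the PR's rewriter.ProtocolFromURL; the Protocol type, its constants, and the exact set of accepted schemes below are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
)

// Protocol is a hypothetical stand-in for the PR's PoolProtocol enum.
type Protocol string

const (
	ProtocolUnspecified Protocol = "unspecified"
	ProtocolSV1         Protocol = "sv1"
	ProtocolSV2         Protocol = "sv2"
)

// ProtocolFromURL derives the protocol from the URL scheme alone, so there is
// no second client-writable field that could disagree with the URL.
// The accepted scheme list is an assumption for this sketch.
func ProtocolFromURL(raw string) (Protocol, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return ProtocolUnspecified, fmt.Errorf("parse pool url: %w", err)
	}
	switch u.Scheme {
	case "stratum+tcp", "stratum+ssl":
		return ProtocolSV1, nil
	case "stratum2+tcp", "stratum2+ssl":
		return ProtocolSV2, nil
	default:
		return ProtocolUnspecified, fmt.Errorf("unsupported scheme %q", u.Scheme)
	}
}

func main() {
	p, err := ProtocolFromURL("stratum2+tcp://v2.stratum.braiins.com:3336")
	fmt.Println(p, err)
}
```

Because the derivation is a pure function of the URL, the DB column and the read-side protocol field can both be computed from the same call and cannot drift apart.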

Testing

  • just _lint-protos — clean (pre-existing collection/device_set warnings unrelated).
  • Server: go vet ./... and go build ./... — clean. Unit tests: go test ./internal/domain/pools/... ./internal/domain/sv2/ ./internal/domain/plugins/ ./sdk/v1/ — all pass.
  • Rust: cargo check in plugin/asicrs — clean.
  • Client: npx tsc --noEmit -p tsconfig.json and npm run lint — clean.
  • Live: just test-e2e-sv2 stands up the stack with the sv2 profile, probes sv2-tproxy:34255, and runs the Connect-RPC suite against fleet-api. SV2-on-SV1-only-device commit returns FAILED_PRECONDITION with the typed mismatch detail.
  • Manual: added a stratum2+tcp://v2.stratum.braiins.com:3336 pool through the UI, hit Test Connection, and got a reachable result against real Braiins.

Scoped out / follow-ups

  • Pubkey-path parsing: the server accepts stratum2+tcp://HOST:PORT/<pubkey> in the CEL regex but doesn't extract the pubkey from the path to feed the handshake probe automatically. Follow-up can collapse the separate noise_public_key field into URL parsing.
  • server/e2e/plugin_integration_test.go references generated-proto symbols renamed upstream (pre-existing, unrelated). SV2 E2E lives in its own server/e2e/sv2/ subpackage so it compiles regardless.
  • Multi-proxy topology, bundled Bitcoin Core, Fleet-native SV2 in protoOS — all explicitly v3+ in the plan.
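
The pubkey-path follow-up listed above could start from something like the sketch below. SplitAuthorityKey is a hypothetical helper, not code from this PR, and it deliberately leaves the key in its encoded form because the encoding is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// SplitAuthorityKey is a hypothetical helper. It splits
// stratum2+tcp://HOST:PORT/AUTHORITY_PUBKEY into the dial address and the
// still-encoded key. Decoding the key is out of scope for this sketch.
func SplitAuthorityKey(raw string) (addr, encodedKey string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", fmt.Errorf("parse pool url: %w", err)
	}
	if u.Scheme != "stratum2+tcp" {
		return "", "", fmt.Errorf("not an SV2 url: scheme %q", u.Scheme)
	}
	if u.Port() == "" {
		return "", "", errors.New("pool url is missing a port")
	}
	key := strings.Trim(u.Path, "/")
	if strings.Contains(key, "/") {
		return "", "", errors.New("unexpected extra path segments after the key")
	}
	// An empty key means a bare URL (the connectivity-only check path).
	return u.Host, key, nil
}

func main() {
	addr, key, err := SplitAuthorityKey("stratum2+tcp://pool.example:3336/EXAMPLEKEY")
	fmt.Println(addr, key, err)
}
```

The returned key would still need decoding and a length check before it could replace the separate noise_public_key field.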

Test plan

  • CI green (protos, lint, Go tests).
  • Fresh install: just rebuild-all + run just test-e2e-sv2 with COMPOSE_PROFILES=sv2.
  • Upgrade path: deploy against an existing fleet, verify existing SV1 pool rows read back with protocol=sv1 (migration 000037 default) and still dispatch correctly.
  • Smoke-test real Braiins: add stratum2+tcp://v2.stratum.braiins.com:3336 pool through UI, test-connection succeeds.
  • Manual UX: create SV1 pool, create SV2 pool, attempt SV2 assignment against a mixed fleet, verify the FAILED_PRECONDITION mismatch surfaces typed warnings on PoolSelectionPage.

Copilot AI review requested due to automatic review settings April 24, 2026 18:48
@mcharles-square mcharles-square requested a review from a team as a code owner April 24, 2026 18:48
@github-actions github-actions Bot added the documentation, dependencies, automation, javascript, client, server, and shared labels Apr 24, 2026
Copilot AI left a comment

Pull request overview

Adds end-to-end Stratum V2 (SV2) pool support across the fleet stack (proto + server + plugins + UI), using URL scheme as the source of truth and enforcing commit/preview parity via a shared preflight+rewriter.

Changes:

  • Introduces pool protocol derivation + persistence (protocol column) and threads protocol through SDK/protos/plugins.
  • Adds shared pool URL rewriter + preflight, plus a new preview RPC and UI wiring to block doomed commits.
  • Adds deployment/runtime wiring for the bundled SV2 translator proxy (compose service, installer prompt, TCP health monitor) and SV2 validation probes (TCP dial / Noise NX handshake).

Reviewed changes

Copilot reviewed 69 out of 143 changed files in this pull request and generated 5 comments.

File Description
server/sqlc/queries/pool.sql Adds protocol to pool create/update queries.
server/sqlc/queries/device.sql Adds GetDeviceIdentifiersByIDs query for SV2 preflight/queue keying.
server/sdk/v1/python/proto_fleet_sdk/generated/pb/driver_pb2.pyi Generated python typings updated with protocol/SV2 support enums and fields.
server/sdk/v1/plugin.go Threads pool protocol and StratumV2Support through SDK<->protobuf conversions.
server/sdk/v1/pb/driver.proto Adds protocol + StratumV2Support enums/fields to the plugin-facing driver proto.
server/sdk/v1/interface.go Adds SDK enums and fields for pool protocol + per-scrape SV2 support reporting.
server/migrations/000037_add_protocol_to_pool.up.sql Adds pool.protocol column with default + CHECK constraint.
server/migrations/000037_add_protocol_to_pool.down.sql Drops pool.protocol on rollback.
server/internal/infrastructure/queue/service.go Adds EnqueuePerDevice to persist per-device resolved payloads at commit time.
server/internal/infrastructure/queue/mocks/mock_message_queue.go Updates mock to include EnqueuePerDevice.
server/internal/infrastructure/queue/interface.go Extends queue interface with EnqueuePerDevice contract.
server/internal/handlers/pools/handler.go Updates ValidatePool to return typed reachability/mode/credentials info.
server/internal/handlers/command/handler.go Adds PreviewMiningPoolAssignment RPC handler.
server/internal/domain/telemetry/models/v2/device_metrics.go Adds server-side StratumV2SupportStatus in telemetry model.
server/internal/domain/sv2/probe.go Implements SV2/SV1 stratum URL TCP dial probe + URL parsing helpers.
server/internal/domain/sv2/probe_test.go Tests TCP dial probe behavior and scheme acceptance.
server/internal/domain/sv2/health_monitor.go Adds background TCP health monitor for bundled translator proxy.
server/internal/domain/sv2/health_monitor_test.go Tests health monitor state transitions/cancellation behavior.
server/internal/domain/sv2/handshake.go Implements Noise NX handshake probe with key pinning check.
server/internal/domain/sv2/handshake_test.go Tests handshake probe success/failure conditions.
server/internal/domain/sv2/config.go Adds Kong-parsed SV2 proxy config and validation + rewriter projection.
server/internal/domain/sv2/config_test.go Tests SV2 config validation and projection.
server/internal/domain/stores/sqlstores/pool.go Derives/stores protocol from URL; adopts proto3 explicit presence on pool updates.
server/internal/domain/pools/service.go Adds typed ValidationResult and protocol-driven SV1 auth vs SV2 probe selection.
server/internal/domain/pools/service_test.go Updates tests for proto3 explicit presence fields on UpdatePoolRequest.
server/internal/domain/pools/rewriter/rewriter.go New pure URL rewriter + capability merge helper and invariant enforcement.
server/internal/domain/pools/rewriter/protocol.go Derives protocol from URL scheme (canonical source of truth).
server/internal/domain/pools/rewriter/protocol_test.go Tests URL->protocol derivation behavior.
server/internal/domain/pools/preflight/preflight_test.go Tests preflight output, warnings, mismatches, and input validation.
server/internal/domain/pools/preflight/scenarios_test.go Scenario tests matching SV2 plan “step 16” cases to ensure parity.
server/internal/domain/plugins/plugin_miner.go Maps pool protocol between internal pools proto and plugin SDK types.
server/internal/domain/plugins/mappers/sdk_mapper.go Maps SDK StratumV2SupportStatus into server telemetry model safely.
server/internal/domain/miner/interfaces/miner.go Adds protocol field to MinerConfiguredPool interface model.
server/internal/domain/miner/dto/command_dto.go Adds protocol to queued mining pool payload DTO.
server/internal/domain/command/reaper_integration_test.go Updates noop queue stub to satisfy new interface.
server/internal/domain/command/execution_service.go Preserves configured-pool protocol when building queued payloads.
server/go.mod Adds github.com/flynn/noise dependency for Noise handshake probing.
server/go.sum Updates module sums (includes noise).
server/generated/sqlc/pool.sql.go Generated sqlc code updated for pool protocol column.
server/generated/sqlc/models.go Generated sqlc Pool model includes Protocol.
server/generated/sqlc/device.sql.go Generated sqlc code for GetDeviceIdentifiersByIDs.
server/generated/sqlc/db.go Generated sqlc prepared statement wiring for new query.
server/generated/grpc/telemetry/v1/telemetryv1connect/telemetry.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/telemetry/v1/telemetry.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/schedule/v1/schedulev1connect/schedule.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/schedule/v1/schedule.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/pools/v1/poolsv1connect/pools.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/ping/v1/pingv1connect/ping.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/ping/v1/ping.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/pairing/v1/pairingv1connect/pairing.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/pairing/v1/pairing.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/onboarding/v1/onboardingv1connect/onboarding.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/onboarding/v1/onboarding.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/networkinfo/v1/networkinfov1connect/networkinfo.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/networkinfo/v1/networkinfo.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/minercommand/v1/minercommandv1connect/command.connect.go Generated connect stubs updated for new PreviewMiningPoolAssignment RPC.
server/generated/grpc/foremanimport/v1/foremanimportv1connect/foremanimport.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/foremanimport/v1/foremanimport.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/fleetperformance/v1/fleetperformancev1connect/fleetperformance.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/fleetperformance/v1/fleetperformance.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/fleetmanagement/v1/fleetmanagementv1connect/fleetmanagement.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/fleetmanagement/v1/fleetmanagement.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/errors/v1/errorsv1connect/errors.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/errors/v1/errors.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/device_set/v1/device_setv1connect/device_set.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/device_set/v1/device_set.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/common/v1/sort.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/common/v1/measurement.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/common/v1/device_selector.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/common/v1/cooling.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/common/v1/common.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/collection/v1/collectionv1connect/collection.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/collection/v1/collection.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/capabilities/v1/capabilities.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/buf/validate/validate.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/auth/v1/authv1connect/auth.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/auth/v1/auth.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/apikey/v1/apikeyv1connect/apikey.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/apikey/v1/apikey.pb.go Generated protobuf updated (import ordering).
server/generated/grpc/activity/v1/activityv1connect/activity.connect.go Generated connect stubs updated (import ordering).
server/generated/grpc/activity/v1/activity.pb.go Generated protobuf updated (import ordering).
server/docker-compose.yaml Adds SV2 proxy env wiring for dev compose stack.
server/docker-compose.base.yaml Adds profile-gated sv2-tproxy service definition and healthcheck.
server/cmd/fleetd/main.go Wires SV2 config validation, preflight resolvers, and health monitor startup.
server/cmd/fleetd/config.go Adds StratumV2 config block to fleetd config.
proto/pools/v1/pools.proto Adds PoolProtocol enum/field, removes client-writable protocol fields, adds validation + ValidatePool response fields.
proto/minercommand/v1/command.proto Adds preview/preflight proto vocabulary + PreviewMiningPoolAssignment RPC.
plugin/virtual/pkg/virtual/simulator.go Virtual plugin reports SV2 support in metrics based on config toggle.
plugin/virtual/pkg/virtual/simulator_test.go Tests virtual miner SV2 support reporting.
plugin/virtual/internal/config/config.go Adds config toggle stratum_v2_supported.
plugin/virtual/go.mod Adds test deps for new virtual plugin tests.
plugin/virtual/go.sum Updates sums for added deps.
plugin/proto/pkg/proto/client.go Adds system snapshot fetch and SV2 support inference from ProtoOS response.
plugin/proto/internal/device/device.go Uses snapshot to populate firmware version + SV2 support with throttling/caching.
plugin/asicrs/src/device.rs Reports StratumV2SupportStatus based on firmware variant; preserves configured-pool protocol field.
plugin/antminer/internal/device/device.go Sets deterministic SV2 unsupported status for stock Bitmain firmware.
justfile Adds test-e2e-sv2 helper to stand up sv2 compose profile and run e2e tests.
deployment-files/sv2/tproxy.toml Adds installer-rendered tProxy TOML template.
deployment-files/sv2/README.md Documents bundled tProxy service and operational notes.
deployment-files/install.sh Adds optional SV2 proxy installer prompt and tProxy TOML rendering.
deployment-files/docker-compose.yaml Adds sv2 proxy env wiring and profile-gated service for installs.
client/src/shared/components/MiningPools/PoolModal.tsx Adds client-side URL scheme validation and URL tooltip guidance.
client/src/shared/components/MiningPools/PoolForm/constants.ts Adds URL scheme validator matching server-accepted prefixes.
client/src/shared/components/MiningPools/PoolForm/PoolForm.tsx Integrates URL scheme validation + updated tooltip text.
client/src/protoFleet/features/settings/components/MiningPools.tsx Notes/aligns with patch-shaped UpdatePoolRequest semantics in UI code.
client/src/protoFleet/features/onboarding/components/CompleteSetup/CompleteSetup.test.tsx Updates mock surface to include previewMiningPoolAssignment.
client/src/protoFleet/features/fleetManagement/components/ActionBar/SettingsWidget/PoolSelectionPage/usePoolAssignmentPreview.ts New hook calling PreviewMiningPoolAssignment and computing mismatch state.
client/src/protoFleet/features/fleetManagement/components/ActionBar/SettingsWidget/PoolSelectionPage/PoolSelectionPage.tsx Calls preview hook, disables Save, shows mismatch callout, shares slot mapping.
client/src/protoFleet/api/usePools.ts Adds optional Noise pubkey to ValidatePool request surface.
client/src/protoFleet/api/useMinerCommand.ts Adds PreviewMiningPoolAssignment client method.
client/src/protoFleet/api/generated/ping/v1/ping_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/networkinfo/v1/networkinfo_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/foremanimport/v1/foremanimport_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/common/v1/sort_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/common/v1/measurement_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/common/v1/device_selector_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/common/v1/cooling_pb.ts Generated TS updated (formatting/layout).
client/src/protoFleet/api/generated/common/v1/common_pb.ts Generated TS updated (formatting/layout).

Comment thread server/internal/handlers/pools/handler.go
Comment thread deployment-files/install.sh Outdated
Comment thread server/internal/domain/sv2/config.go
Comment thread deployment-files/sv2/README.md Outdated
Comment thread client/src/shared/components/MiningPools/PoolModal.tsx Outdated
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 585b02cba0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread server/cmd/fleetd/main.go Outdated
Comment thread server/internal/domain/pools/service.go
Comment thread server/internal/domain/pools/rewriter/rewriter.go
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4351178d6

Comment thread server/internal/domain/command/service.go
Comment thread deployment-files/install.sh Outdated
github-actions Bot commented Apr 24, 2026

🔐 Codex Security Review

Note: This is an automated security-focused code review generated by Codex.
It should be used as a supplementary check alongside human review.
False positives are possible - use your judgment.

Scope summary

  • Reviewed pull request diff only (8b5dc3e74f5411140a8a68799d6269f9e56bfb0c...4ac5d1fbe46e00220a8e9ca22602d1c08c1d9ace, exact PR three-dot diff)
  • Model: gpt-5.4


Review Summary

Overall Risk: HIGH

Findings

[HIGH] Startup race lets queued pool updates bypass the new SV2 proxy health gate

  • Category: Reliability
  • Location: server/cmd/fleetd/main.go:330
  • Description: executionService.Start() is invoked before the new Stratum V2 config is validated, before the health monitor is started, and before executionService.SetStratumV2Resolvers() wires in the proxy URL and health checker. Because the queue is persisted, pending UpdateMiningPools rows can be dequeued during startup while sv2Health is still nil, and proxyUnreachableForPayload() then returns false instead of failing closed.
  • Impact: A restart with queued proxied pool updates can still push miners at the translator URL even when the proxy is down, misconfigured, or intentionally disabled, taking those miners off-pool.
  • Recommendation: Validate Stratum V2 config and wire the health gate into executionService before starting the queue processor, or block dequeueing until that wiring is complete.
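
The fail-closed behaviour this finding asks for can be sketched as follows. The HealthChecker interface, the gate type, and the method shape are hypothetical; only the idea (an unwired health checker must count as "unreachable" for proxied payloads) comes from the finding.

```go
package main

import "fmt"

// HealthChecker is a hypothetical view of the tproxy health monitor.
type HealthChecker interface {
	Healthy() bool
}

// gate holds the health checker, which is nil until startup wiring completes.
type gate struct {
	health HealthChecker
}

// proxyUnreachable fails closed: a proxied payload is treated as undeliverable
// when the checker has not been wired yet, instead of being let through.
func (g *gate) proxyUnreachable(usesProxy bool) bool {
	if !usesProxy {
		return false // direct payloads never depend on the proxy
	}
	if g.health == nil {
		return true // not wired yet, so refuse rather than guess
	}
	return !g.health.Healthy()
}

// staticHealth is a fixed-answer checker used only for the demo.
type staticHealth bool

func (s staticHealth) Healthy() bool { return bool(s) }

func main() {
	g := &gate{}
	fmt.Println("before wiring:", g.proxyUnreachable(true))
	g.health = staticHealth(true)
	fmt.Println("after wiring:", g.proxyUnreachable(true))
}
```

Failing closed covers the window between queue start and resolver wiring, but reordering startup as recommended removes the window altogether.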

[MEDIUM] Proxy-down dispatch is marked as a permanent failure instead of a retriable queue error

  • Category: Reliability
  • Location: server/internal/domain/command/execution_service.go:403
  • Description: The new proxy-health branch returns fleeterror.NewFailedPreconditionErrorf(...) when the bundled translator is down (executeCommandOnDevice, lines 462-474), but markQueueMessageStatus() classifies every FailedPrecondition as permanently failed. That contradicts the comment immediately above the new branch, which says this should be treated as transient.
  • Impact: A short translator outage causes only the proxied devices in a batch to fail once and never retry, leaving the fleet partially updated until an operator manually resubmits the pool change.
  • Recommendation: Return a retriable error class for proxy-unreachable conditions, or special-case this path in markQueueMessageStatus() so it uses the normal retry flow.
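
One way to read this recommendation is to give the proxy-down case its own error identity and classify on that. The sketch below uses plain sentinel errors; the names and the Status values are hypothetical and do not mirror the fleeterror package or markQueueMessageStatus().

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinels standing in for the server's error classes.
var (
	ErrProxyUnreachable   = errors.New("sv2 proxy unreachable")
	ErrFailedPrecondition = errors.New("failed precondition")
)

// Status is a hypothetical queue-row outcome.
type Status string

const (
	StatusDone   Status = "done"
	StatusRetry  Status = "retry"
	StatusFailed Status = "failed"
)

// classify keeps proxy outages on the retry path while other precondition
// failures stay permanent.
func classify(err error) Status {
	switch {
	case err == nil:
		return StatusDone
	case errors.Is(err, ErrProxyUnreachable):
		return StatusRetry
	default:
		return StatusFailed
	}
}

func main() {
	wrapped := fmt.Errorf("dispatch to device: %w", ErrProxyUnreachable)
	fmt.Println(classify(wrapped), classify(ErrFailedPrecondition), classify(nil))
}
```

Using errors.Is means the classification survives wrapping, so the dispatch path can add context without changing retry behaviour.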

[MEDIUM] Unknown preflight errors are swallowed and can turn into broken per-device payloads

  • Category: Reliability
  • Location: server/internal/domain/pools/preflight/preflight.go:230
  • Description: perSlotResolution() maps any unexpected PoolURLsForDevice() error to SLOT_WARNING_UNSPECIFIED. Run() only sets HasMismatch for non-UNSPECIFIED warnings, so the batch proceeds as if preflight succeeded. Downstream, the per-device marshal path overwrites slot URLs from EffectiveURL without guarding against it being empty.
  • Impact: A corrupt protocol value, bad proxy URL, or any other unexpected rewriter failure can become a successful RPC that enqueues malformed pool payloads instead of a synchronous rejection, potentially blanking pool URLs on target miners.
  • Recommendation: Treat unknown rewriter errors as hard failures, or at minimum convert them into a concrete mismatch that forces HasMismatch = true; also reject empty EffectiveURL during per-device marshal.
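
A minimal sketch of the recommended handling, assuming hypothetical names throughout (resolveSlot, Resolution, and the error values are not the PR's preflight types): known rewriter errors become typed warnings, everything else is returned as a hard error, and an empty effective URL is rejected.

```go
package main

import (
	"errors"
	"fmt"
)

// SlotWarning is a hypothetical mirror of the typed enum.
type SlotWarning int

const (
	SlotWarningUnspecified SlotWarning = iota
	SlotWarningSV2NotSupported
)

var (
	ErrSV2NotSupported = errors.New("sv2 not supported by device")
	errCorrupt         = errors.New("corrupt protocol value") // demo-only unexpected error
)

// Resolution is the per-slot result of preflight in this sketch.
type Resolution struct {
	EffectiveURL string
	Warning      SlotWarning
}

// resolveSlot maps known errors to warnings and propagates unknown ones, so an
// unexpected failure can never look like a clean preflight.
func resolveSlot(effectiveURL string, rewriteErr error) (Resolution, error) {
	switch {
	case rewriteErr == nil:
		if effectiveURL == "" {
			return Resolution{}, errors.New("rewriter returned an empty effective URL")
		}
		return Resolution{EffectiveURL: effectiveURL}, nil
	case errors.Is(rewriteErr, ErrSV2NotSupported):
		return Resolution{Warning: SlotWarningSV2NotSupported}, nil
	default:
		return Resolution{}, fmt.Errorf("preflight: unexpected rewriter error: %w", rewriteErr)
	}
}

func main() {
	_, err := resolveSlot("", errCorrupt)
	fmt.Println(err)
}
```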

Notes

  • Review scope was limited to .git/codex-review.diff.
  • I did not find new authz or SQL-injection issues in the changed hunks; the meaningful risk is concentrated in the new SV2 proxy/pool-rewrite execution path.

Generated by Codex Security Review |
Triggered by: @mcharles-square |
Review workflow run

@mcharles-square mcharles-square marked this pull request as draft April 24, 2026 20:00
@mcharles-square mcharles-square changed the title from "feat(sv2): Stratum V2 pool support end-to-end" to "feat(sv2): Stratum V2 pool support" Apr 25, 2026
Introduce the type system for SV2:

- PoolProtocol enum (UNSPECIFIED/SV1/SV2) and CEL validation in
  pools.proto; protocol is not written by clients and is derived from
  the URL scheme server-side.
- UpdatePoolRequest migrated to proto3 explicit presence on all patch
  fields.
- PreviewMiningPoolAssignment RPC on MinerCommandService with typed
  enums for RewriteReason / SlotWarning / DeviceWarning; matching
  UpdateMiningPoolsMismatch detail on FAILED_PRECONDITION commits.
- ValidatePoolResponse gains reachable / credentials_verified / mode
  so the UI can distinguish "reachable but credentials unverified".
- CapabilityStratumV2Native constant and StratumV2SupportStatus on
  DeviceMetrics so plugins can report SV2 support per telemetry scrape.
- Migration 000037 adds pool.protocol column with sv1 default and a
  CHECK constraint for the two live values.

Generated code (Go, TS, Python) regenerated from the proto updates.
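
As an illustration of the explicit-presence patching mentioned above: Go code generated for proto3 optional scalar fields exposes them as pointers, which lets a handler tell "not sent" from "sent as empty". The struct and field names below are assumptions for the sketch, not the generated types.

```go
package main

import "fmt"

// Pool is a simplified stored row (hypothetical fields).
type Pool struct {
	Name     string
	URL      string
	Username string
}

// UpdatePoolRequest mimics proto3 optional fields: nil means "leave unchanged".
type UpdatePoolRequest struct {
	Name     *string
	URL      *string
	Username *string
}

// ApplyPatch copies only the fields the client actually set.
func ApplyPatch(p Pool, req UpdatePoolRequest) Pool {
	if req.Name != nil {
		p.Name = *req.Name
	}
	if req.URL != nil {
		p.URL = *req.URL
	}
	if req.Username != nil {
		p.Username = *req.Username
	}
	return p
}

func main() {
	empty := ""
	p := Pool{Name: "main", URL: "stratum+tcp://pool.example:3333", Username: "worker"}
	// Clearing the username is now distinguishable from omitting it.
	p = ApplyPatch(p, UpdatePoolRequest{Username: &empty})
	fmt.Printf("%+v\n", p)
}
```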
…nitor

New server/internal/domain/sv2/ package:

- config.go: typed config (ProxyEnabled, MinerURL, UpstreamURL,
  HealthCheckAddr, HealthInterval) + validation.
- probe.go: SV2 validation helper that does a TCP dial (v1 default) or
  a full Noise NX handshake probe (v1.5 fast-follow) and returns the
  three-field ValidatePoolResponse shape.
- handshake.go: minimal Noise_NX_25519_ChaChaPoly_BLAKE2s client that
  proves the upstream endpoint speaks SV2 without a full SetupConnection
  roundtrip; used by probe.go when the mode is SV2_HANDSHAKE.
- health_monitor.go: long-running TCP probe of the bundled tproxy with
  activity-log transitions on up/down flips.

Adds github.com/flynn/noise as a server module dependency.
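
For orientation, the TCP-dial half of the probe described in this commit amounts to roughly the following. This is a standalone sketch using only the standard library; DialProbe, probeTimeout, and listenLoopback are illustrative names, not the contents of probe.go.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// probeTimeout is an assumed default for the sketch.
const probeTimeout = 2 * time.Second

// DialProbe reports whether a TCP connection to addr can be opened within
// timeout. It proves reachability only, not that the peer speaks SV2.
func DialProbe(addr string, timeout time.Duration) error {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return fmt.Errorf("dial %s: %w", addr, err)
	}
	return conn.Close()
}

// listenLoopback opens a throwaway listener so the demo needs no real pool.
func listenLoopback() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	return ln.Addr().String(), func() { ln.Close() }, nil
}

func main() {
	addr, stop, err := listenLoopback()
	if err != nil {
		panic(err)
	}
	fmt.Println("listening:", DialProbe(addr, probeTimeout))
	stop()
	fmt.Println("closed:", DialProbe(addr, probeTimeout))
}
```

The handshake probe exists precisely because a successful dial like this cannot distinguish an SV2 endpoint from any other open port.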
Core server-side logic for SV2 pool assignment:

- pools/rewriter: pure PoolURLsForDevice resolving (pool, capability,
  proxy) → effective URL. Protocol is derived from URL scheme via
  rewriter.ProtocolFromURL; the rewriter subpackage owns scheme→
  protocol mapping so higher layers don't import pools.
- pools/preflight: typed input/output package called by both Preview
  and commit paths. Rejects multi-SV2-slot-per-device with
  DEVICE_WARNING_MULTIPLE_SV2_SLOTS_PROXIED and SV2-without-proxy with
  SLOT_WARNING_SV2_NOT_SUPPORTED; enqueues per-device payloads.
- queue.EnqueuePerDevice: new variant alongside Enqueue; writes
  distinct payload bytes per device row. queue_message.payload column
  unchanged — storage contract is identical.
- execution_service.go drops the dispatch-time rewriter entirely.
  Dispatch unmarshals the per-device payload and pushes to the plugin;
  preview/commit produce identical URLs by construction.
- DTOs and reapply path carry Protocol through MiningPool and
  MinerConfiguredPool so worker-name reapply does not drop SV2 intent.
- Pools service.go derives protocol from URL scheme on write; handler
  migrates to proto3 presence-based patching.
- sqlstores.pool.go derives the DB protocol column from the URL.
- Handlers expose PreviewMiningPoolAssignment and the updated
  patch semantics for UpdatePool.
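
The (pool, capability, proxy) → effective URL resolution described in this commit can be pictured with a small pure function. This is a simplified sketch under assumed names (EffectiveURL, Proxy, ErrSV2NotSupported) and an assumed proxy miner URL; the real PoolURLsForDevice handles multiple slots and typed warnings.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Proxy is a hypothetical projection of the tproxy config seen by the rewriter.
type Proxy struct {
	Enabled  bool
	MinerURL string // where SV1 miners connect to reach the translator
}

var ErrSV2NotSupported = errors.New("sv2 pool on sv1-only device without proxy")

// EffectiveURL is pure: the same inputs always give the same URL, which is why
// resolving once at commit time and storing the result is safe.
func EffectiveURL(poolURL string, nativeSV2 bool, proxy Proxy) (string, error) {
	isSV2 := strings.HasPrefix(poolURL, "stratum2+")
	switch {
	case !isSV2, nativeSV2:
		return poolURL, nil // SV1 pool, or a device that speaks SV2 natively
	case proxy.Enabled && proxy.MinerURL != "":
		return proxy.MinerURL, nil // SV1-only device is pointed at the translator
	default:
		return "", ErrSV2NotSupported
	}
}

func main() {
	proxy := Proxy{Enabled: true, MinerURL: "stratum+tcp://sv2-tproxy:34255"}
	u, err := EffectiveURL("stratum2+tcp://pool.example:3336/EXAMPLEKEY", false, proxy)
	fmt.Println(u, err)
}
```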
… E2E test

Deployment + ops plumbing for SV2:

- cmd/fleetd/config.go gains StratumV2Config with ProxyEnabled,
  ProxyMinerURL, ProxyUpstreamURL, ProxyHealthCheckAddr,
  ProxyHealthInterval under the stratum-v2- prefix. Startup validation
  rejects empty MinerURL / UpstreamURL when ProxyEnabled=true.
- main.go wires the health monitor and validation probe.
- docker-compose: new sv2-tproxy service under the "sv2" profile with
  pinned SRI translator image, health probe, and mounted TOML config.
  STRATUM_V2_* env vars are forwarded into fleet-api; dev default
  targets the sv2-tproxy bridge hostname, prod uses 127.0.0.1 (host
  networking).
- deployment-files/sv2/: default tproxy.toml + operator README.
- install.sh prompts for SV2 enablement, writes .env, and templates
  tproxy.toml.
- justfile target for spinning up the SV2 stack.
- server/e2e/sv2/sv2_test.go: live Docker-stack E2E exercising assign
  → dispatch → telemetry across the mixed-protocol paths and the
  capability-mismatch rejection.
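
The startup validation rule from the first bullet of this commit (reject empty MinerURL / UpstreamURL when ProxyEnabled=true) could look roughly like this. The field names follow the commit text; the Validate method and its error messages are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config uses the field names listed for sv2/config.go; the rest is a sketch.
type Config struct {
	ProxyEnabled    bool
	MinerURL        string
	UpstreamURL     string
	HealthCheckAddr string
	HealthInterval  time.Duration
}

// Validate only constrains the proxy fields when the proxy is enabled, so a
// default (disabled) config always passes.
func (c Config) Validate() error {
	if !c.ProxyEnabled {
		return nil
	}
	if c.MinerURL == "" {
		return errors.New("stratum-v2: miner URL is required when the proxy is enabled")
	}
	if c.UpstreamURL == "" {
		return errors.New("stratum-v2: upstream URL is required when the proxy is enabled")
	}
	return nil
}

func main() {
	fmt.Println(Config{}.Validate())
	fmt.Println(Config{ProxyEnabled: true}.Validate())
}
```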
Consumer-facing SV2 changes:

Plugins — each reports StratumV2SupportStatus on its telemetry
snapshot based on what it can probe from live firmware:
- antminer: firmware-identifier inspection (Braiins OS → Supported,
  stock → Unsupported, unknown firmware → Unknown).
- asicrs: mirrors what asic-rs reports for the connected device;
  defaults to Unknown.
- proto: probes the ProtoOS HTTP API each telemetry cycle. Today's
  protoOS returns Unsupported (firmware is SV1-only via sv1_api);
  wiring is in place for when native SV2 ships.
- virtual: toggled via the simulator config so integration tests can
  exercise both direct and proxied paths.

UI:
- PoolForm/PoolModal: URL scheme tooltip and single-field
  validateURLScheme mirroring the server CEL rule. Protocol selector
  removed — the scheme alone determines protocol.
- Settings MiningPools page: drops the client-side protocol
  assignment; all create/update flows send the URL only.
- PoolSelectionPage: calls PreviewMiningPoolAssignment and branches on
  typed RewriteReason / SlotWarning / DeviceWarning enum values to
  surface capability mismatches before Save.

Docs:
- docs/stratum-v2-plan.md: 16-step design plan with rationale,
  non-goals, package layout, roadmap, and known limitations. Notes
  explicitly that current protoOS firmware is SV1-only, so every
  protoOS miner takes the tProxy path in v1.
CI:
- gofmt the eight files that failed `golangci-lint fmt --diff` (struct
  field alignment in service.go, dto.go, preflight, rewriter, etc.).
- Regenerate proto outputs with the hermit-pinned protoc-gen-go v1.36.5
  so the generated-code-check no longer flags version-string drift.
- Fix client lint: alphabetize one import in useMinerCommand.ts, format
  three lines in usePools.ts and one in usePoolAssignmentPreview.ts,
  drop the early-return setState branch in usePoolAssignmentPreview's
  effect (the rule wants derived state, not in-effect resets).

P1 review fixes:
- handler.ValidatePool now forwards FleetError as-is (preserves bad-URL
  scheme as InvalidArgument, malformed Noise key as InvalidArgument)
  and uses Unavailable for probe failures and unreachable pools instead
  of lumping everything into PermissionDenied.
- pools.ValidateConnection rejects a non-empty Noise key whose length
  isn't 32 with InvalidArgument rather than silently downgrading to a
  TCP dial — the operator asked for handshake pinning, returning
  "connected" without it would be a security regression.
- main.go wires NewTelemetrySV2Resolver instead of nil so the SV2
  capability resolver actually pulls each device's StratumV2Support
  from the latest telemetry scrape. Without this the rewriter treats
  every miner as SV1-only and the native-SV2 path never fires.
- install.sh swaps the broken sed-based `(tcp\|ssl)` regex for a bash
  =~ regex over the upstream URL; a malformed URL now fails the match
  outright instead of letting the original string flow into the TOML.

Other review fixes:
- sv2/config.go help text for ProxyUpstreamURL says stratum2+(tcp|ssl)
  rather than the invented "sv2+*" scheme.
- deployment-files/sv2/README.md reflects the Noise NX handshake probe
  behavior (pinning when a 32-byte authority pubkey is supplied; TCP
  dial otherwise), removing the stale "v1.5 fast-follow" note.
- PoolModal.onSubmit calls validateURLScheme once and stores the
  result instead of running it twice.

P2 review fixes:
- rewriter.PoolURLsForDevice derives ResolvedSlot.Protocol from the
  rewritten URL (via ProtocolFromURL), so a slot rewritten to the SV1
  proxy URL no longer reports protocol=SV2 to downstream surfaces.
- mismatchesToFailedPrecondition attaches each preflight.Mismatch as
  an UpdateMiningPoolsMismatch proto detail on the FAILED_PRECONDITION
  ConnectError, so clients can branch per-device/per-slot without
  parsing the summary string.
- Renumber Stratum V2 migration from 000037 to 000038. Origin/main has
  taken 000037 (`add_device_snapshot_to_command_on_device_log`); the
  duplicate filename made fleet-api fail to start in E2E with
  "duplicate migration file: 000037_add_protocol_to_pool.down.sql".
- exhaustive lint: add missing Unspecified cases to switches in
  service.go (slotsMatchTemplate, marshalPerDevicePayload), preflight
  (protoSlot, protoRewriteReason), rewriter (resolveSingle, PoolSlot
  String, RewriteReason String), pools/service.go (ValidateConnection),
  plugins/plugin_miner.go (pool protocol both directions), and
  plugins/mappers/sdk_mapper.go (StratumV2Support map).
- gosec G115 / wrapcheck: probe.go wraps DialContext + Close errors;
  handshake.go's writeNoiseFrame wraps both Write calls and notes the
  uint16 conversion is bounded by the 0xFFFF guard above; readNoiseFrame
  wraps both ReadFull calls.
- mapsloop: rewriter.MergeCapabilities uses maps.Copy for the static
  and model overlays.
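
The length-prefixed framing described in the gosec/wrapcheck bullet can be sketched roughly as below. This is a minimal illustration, not the handshake.go code: the names writeFrame, readFrame, and roundTrip are hypothetical, and the little-endian byte order of the 2-byte length prefix is an assumption made for the sketch.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// writeFrame writes a 2-byte length prefix followed by the payload.
// The 0xFFFF guard is what makes the uint16 conversion below safe.
// Byte order is an assumption of this sketch.
func writeFrame(w io.Writer, payload []byte) error {
	if len(payload) > 0xFFFF {
		return fmt.Errorf("frame too large: %d bytes", len(payload))
	}
	var hdr [2]byte
	binary.LittleEndian.PutUint16(hdr[:], uint16(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return fmt.Errorf("write frame length: %w", err)
	}
	if _, err := w.Write(payload); err != nil {
		return fmt.Errorf("write frame payload: %w", err)
	}
	return nil
}

// readFrame reads one length-prefixed frame, wrapping both ReadFull errors.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [2]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, fmt.Errorf("read frame length: %w", err)
	}
	buf := make([]byte, binary.LittleEndian.Uint16(hdr[:]))
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, fmt.Errorf("read frame payload: %w", err)
	}
	return buf, nil
}

// roundTrip writes a frame into an in-memory buffer and reads it back.
func roundTrip(payload []byte) ([]byte, error) {
	var buf bytes.Buffer
	if err := writeFrame(&buf, payload); err != nil {
		return nil, err
	}
	return readFrame(&buf)
}

func main() {
	got, err := roundTrip([]byte("hello"))
	fmt.Println(string(got), err)

	_, err = roundTrip(make([]byte, 0x10000)) // one byte over the guard
	fmt.Println(err != nil)
}
```

Wrapping each I/O error separately keeps the failing step (length vs. payload) visible in logs, which is the point of the wrapcheck rule.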

Addresses Codex security review findings on top of the CI-green branch.

[HIGH] Reject SV2-pool-with-mismatched-proxy-upstream at preflight.
- rewriter.ProxyConfig gains UpstreamURL; sv2.Config.RewriterConfig
  forwards ProxyUpstreamURL into it.
- resolveSingle compares the slot's pool URL to the configured upstream
  via sameStratumURL (host:port equivalence after stripping the optional
  /AUTHORITY_PUBKEY suffix); on mismatch returns ErrProxyUpstreamMismatch.
- Without this, pointing the rewriter at a different SV2 pool than the
  bundled tProxy is configured for would silently route every miner's
  hashrate to whatever the proxy upstream actually is — a hashrate-
  redirection vector the v1 single-proxy topology must surface up front.
- New SLOT_WARNING_PROXY_UPSTREAM_MISMATCH proto enum value plus
  preflight mapping; tests updated to set UpstreamURL=poolURL on
  paths that should successfully proxy.

[MED] UI distinguishes "reachable but credentials unverified" from
"connection successful".
- usePools.validatePool plumbs the typed
  reachable / credentials_verified / mode response into a
  ValidatePoolOutcome and forwards it through onSuccess.
- PoolModal renders three callouts now: success (verified),
  warning (reachable but credentials unverified, with mode-specific
  text for SV2_TCP_DIAL vs SV2_HANDSHAKE), and danger (probe failed).
- Existing protoOS BackupPoolModalWrapper wires through useTestConnection,
  which synthesises a fully-verified outcome (the protoOS endpoint runs
  a real SV1 subscribe+authorize, so 200 means authenticated).

[MED] Restrict pool URL schemes to plain TCP only.
- CEL rules across PoolConfig / UpdatePoolRequest / ValidatePoolRequest
  now reject stratum+ssl, stratum+ws, and stratum2+ssl. Port is now
  mandatory in all schemes (the dispatch path requires host:port).
- rewriter.ProtocolFromURL and sqlstores.dbProtocolFromURL aligned with
  the new whitelist.
- Client validateURLScheme accepts only stratum+tcp:// and
  stratum2+tcp://; tooltips and error messages updated.
- Tests in protocol_test.go cover both directions of the change.

[MED] Installer renders downstream_port from STRATUM_V2_PROXY_MINER_URL.
- Previously the installer prompted for a custom miner-facing port but
  silently kept downstream_port=34255 in the TOML, making any non-default
  port a dead listener.
- The same bash regex used for the upstream URL parses the miner URL
  port and rewrites both downstream_port in tproxy.toml and
  STRATUM_V2_PROXY_HEALTH_ADDR in .env.

Plan doc: URL scheme section updated to v1-TCP-only with rationale and
the v1.5 follow-up note for TLS support.
…eme-strict config

Addresses second-round Codex security review.

[HIGH] Proxy upstream comparison must include the authority pubkey.
- Pre-fix sameStratumURL() truncated the URL path before comparing,
  treating stratum2+tcp://pool:34254/PUB_A and ...:34254/PUB_B as
  equivalent — exactly the case Codex flagged. Both pubkeys live behind
  the same TCP endpoint but identify different SV2 pools, so a proxy
  pinned to PUB_A would silently accept routing for PUB_B miners.
- canonicaliseStratumURL now lowercases scheme + host but preserves the
  case-significant /AUTHORITY_PUBKEY suffix verbatim. Asymmetric paths
  (one side has /KEY, the other doesn't) compare as different.
- New tests cover the matching-host:port + different-pubkey case, the
  asymmetric-pubkey case, and the host-case-insensitivity invariant.
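
A rough sketch of the comparison described above, assuming the URL shape scheme://host:port[/AUTHORITY_PUBKEY]. The function names canonicalise and sameUpstream are hypothetical stand-ins, not the rewriter's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalise lowercases the scheme and host:port of a stratum URL but
// keeps any /AUTHORITY_PUBKEY suffix byte-for-byte, because the key
// encoding is case-significant.
func canonicalise(raw string) string {
	scheme, rest, ok := strings.Cut(raw, "://")
	if !ok {
		return raw // not scheme://...; compare verbatim
	}
	hostPort, pubkey, hasKey := strings.Cut(rest, "/")
	out := strings.ToLower(scheme) + "://" + strings.ToLower(hostPort)
	if hasKey {
		out += "/" + pubkey
	}
	return out
}

// sameUpstream reports whether two URLs name the same pool identity:
// same endpoint AND same (or equally absent) authority pubkey.
func sameUpstream(a, b string) bool {
	return canonicalise(a) == canonicalise(b)
}

func main() {
	// Host and scheme case differ, pubkey identical.
	fmt.Println(sameUpstream("stratum2+tcp://Pool:34254/PUB_A", "STRATUM2+TCP://pool:34254/PUB_A"))
	// Same endpoint, different pubkey.
	fmt.Println(sameUpstream("stratum2+tcp://pool:34254/PUB_A", "stratum2+tcp://pool:34254/PUB_B"))
	// Asymmetric: one side has no pubkey.
	fmt.Println(sameUpstream("stratum2+tcp://pool:34254/PUB_A", "stratum2+tcp://pool:34254"))
}
```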

[MED] Migration backfills protocol from URL.
- 000038_add_protocol_to_pool.up.sql kept the DEFAULT 'sv1' for fresh
  inserts but pre-existing pool rows whose URL is stratum2+tcp://
  would have been silently coerced to SV1 after upgrade — making the
  command path skip SV2 preflight for those rows.
- Backfill UPDATE … WHERE LOWER(url) LIKE 'stratum2+tcp://%' so the
  derived state matches the URL the operator persisted.

[MED] Reject SSL/WS schemes in StratumV2 config + installer.
- sv2.Config.Validate() now requires ProxyMinerURL to start with
  stratum+tcp:// and ProxyUpstreamURL to start with stratum2+tcp://;
  SSL/WS variants get rejected at startup rather than being silently
  written to compose and failing once dispatch tries to use them.
- Installer regex for the miner URL tightened to stratum+tcp only;
  the warning message now calls out the v1 plain-TCP scope.
- Tests: TestConfig_ValidateRejectsUnsupportedSchemes covers each
  rejected scheme combination.

Cleanup: rewriter.canonicaliseStratumURL uses strings.Cut throughout
to satisfy the stringscut linter and shed an extra Index call.
…SV1 auth-fail from unreachable

Third-round Codex security review.

[HIGH] Custom downstream port now actually published by Docker.
- The previous fix wrote a custom downstream_port to tproxy.toml and to
  STRATUM_V2_PROXY_HEALTH_ADDR, but compose still hardcoded the host
  port mapping to 34255:34255. Miners hit a port the host wasn't
  forwarding and Fleet reported the proxy as down.
- sv2-tproxy ports + healthcheck now substitute
  ${STRATUM_V2_PROXY_DOWNSTREAM_PORT:-34255}; installer writes that env
  var alongside the existing tproxy.toml + health-addr edits so the
  three values stay aligned.

[MED] RawPoolInfo gets the same scheme whitelist as PoolConfig.
- RawPoolInfo had no CEL rule; combined with rewriter.MustProtocolFromURL
  (which silently coerces unknown schemes to SV1), raw pool URLs from
  unknown-pool miner state could carry a stratum2+ssl/ws/etc. URL and
  bypass the SV2 preflight entirely.
- Added the same buf.validate CEL expression PoolConfig uses; switched
  the createMiningPoolDTOFromSlotConfig raw-pool branch from
  MustProtocolFromURL to ProtocolFromURL so an unrecognised scheme
  surfaces as INVALID_ARGUMENT instead of being treated as SV1.

[MED] SV1 credential rejection no longer looks like a network outage.
- pools.ValidateConnection's SV1 branch was setting Reachable from the
  authorize result (Reachable=ok), so a pool that returned JSON-RPC
  false to mining.authorize (bad creds, no transport error) came back
  as Reachable=false, which the handler then mapped to
  CodeUnavailable("pool unreachable").
- The transport actually completed in that case, so the new mapping
  is Reachable=true, CredentialsVerified=ok. Operators distinguish
  "wrong username/password" from "DNS/firewall/port" again.
…preview request sequencing

Fourth-round Codex security review.

[HIGH] Proto plugin reads SV2 support on the first telemetry cycle.
- refreshSystemSnapshot was gated by the firmware refresh throttle, so
  freshly-paired Proto miners returned StratumV2Support=Unspecified for
  the entire ~5min window and preflight treated them as SV1-only —
  rejecting valid SV2 assignments or unnecessarily routing them
  through the proxy.
- Probe now fetches on every Status() call until lastSV2Support is
  populated once; only then does the throttle kick in. A transient
  fetch failure pre-initial-read is still retried on the next call;
  post-initial-read, transient failures coast on the cached value
  until the next interval tick.

[MED] Config and installer require explicit host:port and the same
scheme set.
- sv2.Config.Validate now matches ProxyMinerURL against
  ^stratum\+tcp://host:port$ and ProxyUpstreamURL against
  ^stratum2\+tcp://host:port[/PUB]$ via regexp; values like
  stratum+tcp://proxy or stratum2+tcp://pool/PUBKEY without a port get
  rejected at startup, which would otherwise let net.Dial fail at
  dispatch with a much less actionable error.
- Installer regex tightened from stratum2+(tcp|ssl) to stratum2+tcp
  only, so install-time validation matches what the server accepts at
  startup. Warning message updated.
- Tests cover the missing-port case for both fields.
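
For illustration, the startup check could look something like the sketch below. The host and pubkey character classes are assumptions chosen for the example, and validateProxyURLs is a hypothetical helper rather than the real sv2.Config.Validate.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns: host is letters, digits, dots or hyphens; port is
// 1-5 digits; the upstream may carry an optional /PUBKEY suffix.
var (
	minerURLRe    = regexp.MustCompile(`^stratum\+tcp://[A-Za-z0-9.-]+:[0-9]{1,5}$`)
	upstreamURLRe = regexp.MustCompile(`^stratum2\+tcp://[A-Za-z0-9.-]+:[0-9]{1,5}(/[A-Za-z0-9]+)?$`)
)

// validateProxyURLs rejects anything that is not plain-TCP with an
// explicit host:port, so the failure shows up at startup, not at dial time.
func validateProxyURLs(minerURL, upstreamURL string) error {
	if !minerURLRe.MatchString(minerURL) {
		return fmt.Errorf("proxy miner URL %q must be stratum+tcp://host:port", minerURL)
	}
	if !upstreamURLRe.MatchString(upstreamURL) {
		return fmt.Errorf("proxy upstream URL %q must be stratum2+tcp://host:port[/PUBKEY]", upstreamURL)
	}
	return nil
}

func main() {
	fmt.Println(validateProxyURLs("stratum+tcp://proxy:34255", "stratum2+tcp://pool:34254/PUBKEY"))
	fmt.Println(validateProxyURLs("stratum+tcp://proxy", "stratum2+tcp://pool:34254"))
	fmt.Println(validateProxyURLs("stratum+tcp://proxy:34255", "stratum2+tcp://pool/PUBKEY"))
}
```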

[LOW] Preview hook ignores stale async responses.
- usePoolAssignmentPreview tracks a monotonic latestRequestId; each
  scheduled RPC captures the latest value, and onSuccess/onError/finally
  callbacks bail when their captured ID is no longer current. Without
  this, a slow response from preview N could overwrite the state set
  by preview N+1, falsely re-enabling Save.
- hasMismatch now folds error and isLoading into the
  not-saveable-yet signal so a transient preview failure that left
  previews=[] doesn't flip Save back to enabled.
… E2E

- rewriter.go: extract "UNSPECIFIED" into labelUnspecified to satisfy
  the goconst lint that ran on the previous push (4 occurrences across
  PoolSlot.String and RewriteReason.String).
- Re-run prettier on the generated TS protos. The previous push only
  ran buf generate; CI's `npm run format:check` and the
  generated-code-check both expect the prettier-formatted output that
  `just gen` produces, so each round of regen needs the format pass.
- protoOS pools.spec.ts: update "save invalid pool URL" expectation
  from a server-side error toast to the inline URL validation message.
  Client-side validateURLScheme mirrors the server CEL rule, so an
  obviously-wrong URL fails fast at Save without sending the RPC —
  same "invalid pool was not saved" outcome the test ultimately
  asserts, just surfaced before the network round trip.
- server/generated: run hermit's goimports to reorder reflect/sync/unsafe
  alongside the other imports. CI's `just gen` pipeline does this in
  _format-server; my local regen was missing it.
- protoOS pools E2E: my client-side scheme validation prevents Save
  from reaching the server, leaving the modal open. The follow-up
  `navigateToHome()` step's click was being intercepted by the modal,
  not by the unsaved-changes flow it was originally testing.
- Add a closePoolModal helper (Escape) and call it after the inline
  validation assertion so the modal is gone before navigation.

(Python SDK staleness check failed on the previous run because the
hermit nfpm download hit a transient 502; not addressing here — it
should pass on this push's retry.)
- server/sdk/v1/pb/generated/driver.pb.go was carrying a stale
  `sv2+tcp://...` example comment from before the canonical-scheme
  rename; regenerated to match the proto source's `stratum2+tcp://`.
- protoOS pools page: closePoolModal uses toBeHidden() instead of
  not.toBeVisible() to satisfy the playwright/no-useless-not lint rule.
…y-username + preview redaction

Fifth-round Codex security review.

[HIGH] Upstream identity is now single-source-of-truth.
- Installer dropped the separate "Pool's Noise authority pubkey" prompt.
  The operator pastes the canonical Braiins-format URL once
  (stratum2+tcp://host:port/AUTHORITY_PUBKEY), and the installer parses
  the pubkey out for tproxy.toml. Two independent inputs created a
  class of bug where the URL Fleet's rewriter pinned for routing and
  the pubkey the proxy actually pinned could diverge silently — the
  hashrate-diversion case the rewriter's mismatch check is supposed to
  prevent.
- sv2.Config.Validate now requires the /AUTHORITY_PUBKEY suffix on
  STRATUM_V2_PROXY_UPSTREAM_URL when ProxyEnabled=true. A startup
  without the suffix fails fast with a typed error.
- Tests in config_test.go updated to use the suffixed form, plus a
  new "missing authority pubkey suffix" rejection case.

[HIGH] Proxied pool assignments now consult tProxy health.
- New ProxyHealthChecker interface on command.Service (Up + HasState,
  matching sv2.HealthMonitor's surface).
- effectiveProxyConfig() forces ProxyEnabled=false when the bundled
  translator is down or hasn't yet flipped to up. Preflight + commit
  paths route through this helper; the rewriter then rejects proxied
  routes with the existing SLOT_WARNING_SV2_NOT_SUPPORTED rather than
  pushing a dead miner-facing URL to every SV1-only miner.
- Health-unknown is treated as down (fail-closed) so the first
  preflight after startup doesn't approve routes for a proxy that
  hasn't yet been probed.
- main.go wires the existing HealthMonitor into the command service.
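
The fail-closed gate can be sketched as follows. The Up/HasState method names come from the note above, but their exact signatures, the ProxyConfig fields, and the nil-checker handling are assumptions of this example.

```go
package main

import "fmt"

// ProxyHealthChecker mirrors the Up + HasState surface named above;
// the bool-returning signatures are assumed.
type ProxyHealthChecker interface {
	Up() bool
	HasState() bool
}

// ProxyConfig is a trimmed, hypothetical stand-in for the rewriter config.
type ProxyConfig struct {
	ProxyEnabled bool
	MinerURL     string
}

// effectiveProxyConfig disables the proxy unless health is known AND up.
// "Unknown" is treated the same as "down" (fail-closed). Treating a nil
// checker as down is a choice made for this sketch.
func effectiveProxyConfig(cfg ProxyConfig, h ProxyHealthChecker) ProxyConfig {
	if !cfg.ProxyEnabled {
		return cfg
	}
	if h == nil || !h.HasState() || !h.Up() {
		cfg.ProxyEnabled = false
	}
	return cfg
}

// fakeHealth is a test double for the health monitor.
type fakeHealth struct{ hasState, up bool }

func (f fakeHealth) Up() bool       { return f.up }
func (f fakeHealth) HasState() bool { return f.hasState }

func main() {
	cfg := ProxyConfig{ProxyEnabled: true, MinerURL: "stratum+tcp://proxy:34255"}
	fmt.Println(effectiveProxyConfig(cfg, fakeHealth{hasState: false, up: false}).ProxyEnabled) // not probed yet
	fmt.Println(effectiveProxyConfig(cfg, fakeHealth{hasState: true, up: false}).ProxyEnabled)  // probed, down
	fmt.Println(effectiveProxyConfig(cfg, fakeHealth{hasState: true, up: true}).ProxyEnabled)   // probed, up
}
```

Because the config is passed by value, the caller's copy is never mutated; each preflight or commit call gets a fresh view of proxy health.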

[MED] UpdatePool rejects explicit empty username.
- pools.Service.UpdatePool: when r.Username != nil, an empty/whitespace
  value returns INVALID_ARGUMENT immediately. The separator rule (no
  '.') still runs only on actual change so legacy pools predating the
  restriction can still be edited.
- New invalidPoolUsernameEmptyMessage constant; the empty-string check
  is at service-call-site so it doesn't change the validatePoolUsername
  signature used elsewhere.

[MED] PreviewMiningPoolAssignment added to RedactedRequestProcedures.
- The new RPC carries the same RawPoolInfo as UpdateMiningPools, so
  preview bodies could leak pool credentials at debug log level.
…r-device payload

Sixth-round Codex security review.

[HIGH] Pool-assignment device resolution is org-scoped.
- Added GetDeviceIDsByDeviceIdentifiersForOrg and
  GetDeviceIdentifiersByIDsForOrg sqlc queries that filter by org_id.
- command.Service uses the ForOrg variants on every caller-controlled
  selector path (PreviewMiningPoolAssignment, UpdateMiningPools).
  Identifiers from a foreign org are dropped at the SQL layer; the
  call site additionally cross-checks the row count and rejects the
  request with INVALID_ARGUMENT when any identifier was unknown.
- Without these checks, an authenticated caller could use the new
  preview RPC as a low-friction oracle to enumerate foreign-tenant
  device identifiers and read their SV2 capability/proxy state.
- The original unscoped GetDeviceIDsWithIdentifiers / GetDeviceIDsByDeviceIdentifiers
  / GetDeviceIdentifiersByIDs queries are kept for the existing internal
  callers (telemetry status writers process IDs they generated themselves).

[MED] marshalPerDevicePayload rewrites Protocol alongside URL.
- Preview returned the rewriter's effective protocol (SV1 for proxied
  routes), but the queue payload kept the template's source protocol
  (SV2). Drivers that branch on the protocol field would see protocol=
  SV2 alongside a stratum+tcp:// URL — exactly the parity break the
  preflight is supposed to prevent.
- Per-device payload now copies SlotResult.Protocol into the slot
  alongside SlotResult.EffectiveURL.
…ests in fleet UI, fail-fast miner URL prompt

Seventh-round Codex security review.

[MED] SV2 capability resolver overlays static caps under telemetry.
- The previous resolver passed nil into MergeCapabilities for static
  AND model layers, so a telemetry batch error or an
  Unknown/Unspecified per-device telemetry value would silently demote
  every native-SV2 miner to SV1-only — including freshly-paired Proto
  miners during the window before their first scrape lands.
- New main.go staticSV2CapsProvider looks up each device's driver via
  the existing device store and asks plugins.Service for the driver's
  static capability map (sdk.Capabilities). Resolver merges static +
  telemetry; if telemetry says Supported/Unsupported it wins,
  otherwise the static signal carries.
- ResolveCapabilities now takes orgID so the device-store lookup is
  tenant-scoped; resolveSV2Capabilities pulls it from session.
- Telemetry batch errors no longer wipe the static layer.
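
The overlay rule (definitive telemetry wins, otherwise the static signal carries) might be sketched like this. The Support type and mergeSupport function are hypothetical simplifications; the real MergeCapabilities also has a model layer that is omitted here.

```go
package main

import "fmt"

// Support is a hypothetical stand-in for the StratumV2Support enum.
type Support int

const (
	Unspecified Support = iota
	Unknown
	Supported
	Unsupported
)

// mergeSupport starts from the static per-device map and lets telemetry
// override it only when telemetry is definitive (Supported/Unsupported).
// A nil telemetry map, e.g. after a batch error, leaves static untouched.
func mergeSupport(static, telemetry map[string]Support) map[string]Support {
	out := make(map[string]Support, len(static))
	for id, s := range static {
		out[id] = s
	}
	for id, t := range telemetry {
		if t == Supported || t == Unsupported {
			out[id] = t
		}
	}
	return out
}

func main() {
	static := map[string]Support{"m1": Supported, "m2": Supported, "m3": Unsupported}
	telemetry := map[string]Support{"m1": Unknown, "m2": Unsupported}

	merged := mergeSupport(static, telemetry)
	fmt.Println(merged["m1"] == Supported)   // Unknown telemetry does not demote
	fmt.Println(merged["m2"] == Unsupported) // definitive telemetry wins
	fmt.Println(merged["m3"] == Unsupported) // no telemetry: static carries

	fmt.Println(mergeSupport(static, nil)["m1"] == Supported) // batch error case
}
```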

[MED] PoolSelectionPage distinguishes verified from reachable-only.
- The fleet-side pool tester treated every onSuccess as full success
  even though usePools.validatePool now returns the typed
  ValidatePoolOutcome. Mirrored the shared PoolModal's three-callout
  layout: success (verified), warning (reachable but credentials
  unverified), danger (probe failed). The warning callout text is
  mode-specific (SV2_TCP_DIAL vs SV2_HANDSHAKE).

Cleanup (prior review note):
- Installer's miner-URL prompt now loops on a regex match instead of
  just non-empty, matching the server's startup validator. A typo
  would otherwise let install succeed but fleetd's startup fail
  immediately on a known-bad config.

The earlier change to surface reachable-but-unverified outcomes broke
the existing PoolSelectionPage test mock, which called onSuccess() with
no arguments. The "Pool connection successful" callout now requires
credentialsVerified=true, so without an outcome the success state never
renders and the three test cases that asserted on the success callout
failed.

Updated the mock to pass a fully-verified outcome (reachable=true,
credentialsVerified=true, mode=SV1_AUTHENTICATE), matching how
useTestConnection synthesises a successful authenticated probe.

(Generated Code Check, Python tarball, and Virtual Plugin failures on
the previous run were all hermit downloads hitting GitHub 502s — they
should clear on retry.)
…or, scopeable proxy bind

Eighth-round Codex security review.

[HIGH] Installer re-renders tproxy.toml on upgrade.
- configure_stratum_v2 used to early-return if .env already had
  STRATUM_V2_PROXY_ENABLED, which skipped the only code path that
  rendered the mounted tproxy.toml. The release tarball ships a
  placeholder TOML, so a tarball-based upgrade left the proxy config
  with REPLACE_WITH_POOL_HOST and the bundled translator came up
  pointed at nothing.
- New render_sv2_tproxy_toml helper renders both upstream and
  downstream values from validated URLs. The upgrade path reads the
  saved STRATUM_V2_PROXY_UPSTREAM_URL / MINER_URL from .env and calls
  the renderer; first-install path uses it after collecting answers.
- STRATUM_V2_PROXY_DOWNSTREAM_PORT is overwritten in-place if already
  present rather than appended, so reruns don't accumulate stale
  duplicate entries.

[MED] Bulk pool preview now uses the commit-path DeviceSelector.
- usePoolAssignmentPreview takes a DeviceSelector instead of a
  []string of identifiers, so allDevices selectors are previewed
  directly. Previously the wrapper passed selectedMiners-derived
  identifiers (often empty in "all" mode), which meant the preview
  evaluated 0 or only the visible miners while the commit hit the
  full server-resolved fleet.
- PoolSelectionPage forwards the same DeviceSelector its
  PoolSelectionPageWrapper builds for UpdateMiningPools.

[MED] Translator proxy listener is scopeable per interface.
- Compose port mapping is now
  ${STRATUM_V2_PROXY_DOWNSTREAM_HOST:-0.0.0.0}:${PORT}:${PORT}
  so operators on multi-homed / internet-facing hosts can constrain
  the unauthenticated stratum listener to a private LAN IP via .env.
  Default 0.0.0.0 keeps existing single-NIC deployments working.
- README.md and the install.sh prompt explain the binding behaviour
  and how to override it.
…abort stale previews

Ninth-round Codex security review.

[MED] CEL regex accepts single-label hosts.
- `[a-zA-Z0-9][a-zA-Z0-9.-]*[a-zA-Z0-9]\.[a-zA-Z]{2,}` required a dot
  and a TLD, so `stratum+tcp://ckpool:3333`, `stratum+tcp://localhost:3333`,
  and similar local-network setups failed validation even though the
  runtime parsing handles them fine.
- Relaxed to `[a-zA-Z0-9]([a-zA-Z0-9.-]*[a-zA-Z0-9])?` across PoolConfig,
  UpdatePoolRequest, ValidatePoolRequest CEL rules and RawPoolInfo.
  Same change in the Go regex backing sv2.Config.Validate.

[MED] Translator listener defaults to the miner-URL host when it's an IP.
- Compose port mapping was already
  ${STRATUM_V2_PROXY_DOWNSTREAM_HOST:-0.0.0.0}:port:port, but the
  installer never wrote DOWNSTREAM_HOST, so opting into SV2 always
  exposed the unauthenticated listener on every interface.
- Installer now parses the host portion of STRATUM_V2_PROXY_MINER_URL.
  If it's an IPv4 / bracketed IPv6 literal, that becomes the bind
  address; for hostnames or wildcards, fall back to 0.0.0.0 (compose
  can't bind a hostname). Operators who want to scope to a private
  NIC just put the IP literal in the miner URL — no separate prompt.

[MED] Pool-assignment preview supports request abort.
- useMinerCommand.previewMiningPoolAssignment accepts an AbortSignal
  and forwards it to the Connect-RPC call.
- usePoolAssignmentPreview tracks an in-flight AbortController and
  cancels the previous request before kicking off the next one (and
  on cleanup), so server-side preflight stops doing work for previews
  the operator has already moved past. The existing latestRequestId
  guard still protects against stale callbacks; abort closes the
  matching server-side leak.

Stale doc note from prior review: deployment-files/sv2/README.md said
pool assignment ignores proxy health, but commit `f1198c2` made
preflight fail closed when the bundled translator is unhealthy.
README updated to match the actual behavior.
…come on saved-pool tests, validate health settings

Tenth-round Codex security review.

[MED] Patch updates no longer erase saved passwords.
- UpdatePoolRequest's password wrapper now follows proto3 explicit
  presence: an absent field means "leave unchanged," an empty string
  means "erase." MiningPoolsForm's onboarding bulk-save and per-pool
  save both used to send `password: pool.password` unconditionally,
  which sent `""` for unmodified existing pools and silently wiped
  the stored encrypted password.
- Both call sites now spread `password` into the request only when
  the user actually typed something (`isPasswordSet === true` for
  per-pool save; non-empty `pool.password` for the bulk-save path,
  which has no per-pool isPasswordSet flag).
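
On the server side, explicit presence reduces to a nil check. The sketch below assumes the generated Go field is a *string, which is how proto3 optional scalar fields are normally generated; the struct and applyPassword helper are hypothetical.

```go
package main

import "fmt"

// UpdatePoolRequest is a trimmed, hypothetical stand-in for the generated
// message; a nil Password means the field was absent on the wire.
type UpdatePoolRequest struct {
	Password *string
}

// applyPassword returns the password to persist after a patch update:
// absent leaves the stored value alone, present (even "") replaces it.
func applyPassword(stored string, req UpdatePoolRequest) string {
	if req.Password == nil {
		return stored
	}
	return *req.Password
}

func main() {
	empty, fresh := "", "n3w"
	fmt.Printf("%q\n", applyPassword("s3cret", UpdatePoolRequest{}))                 // unchanged
	fmt.Printf("%q\n", applyPassword("s3cret", UpdatePoolRequest{Password: &empty})) // erased
	fmt.Printf("%q\n", applyPassword("s3cret", UpdatePoolRequest{Password: &fresh})) // replaced
}
```

This is why a client that always sends `password: ""` is destructive: the server cannot tell it apart from a deliberate erase.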

[MED] Saved-pool connection tests reflect verification status.
- The settings page (handleTestConnection in MiningPools.tsx) and the
  pool selection modal (handleTestSelectedConnection) both treated
  every onSuccess as a fully-verified pass. Saved-pool tests don't
  carry the encrypted password through the client, so SV1 pools come
  back reachable-but-unverified and SV2 pools come back as TCP-dial.
- Settings-page test toast now reflects the outcome: green success
  toast only when credentialsVerified=true, otherwise an error-styled
  toast with mode-specific text. The toaster ships only success/error
  styling, so unverified-but-reachable routes through error to keep
  visual feedback distinct.
- PoolSelectionModal mirrors PoolSelectionPage's three-callout layout
  (success, warning, danger) and surfaces ValidatePoolOutcome through
  lastTestOutcome state.

[LOW] Health-monitor settings rejected up front.
- Config.Validate now requires ProxyHealthInterval > 0 and a parseable
  ProxyHealthCheckAddr (host:port via net.SplitHostPort). Without
  this, a non-positive interval makes HealthMonitor.Start return
  immediately, HasState never flips, and effectiveProxyConfig keeps
  rejecting every proxied route — a deployment typo silently disables
  the feature.
- New tests cover non-positive interval (0 and negative), empty
  health addr, and malformed (non-host:port) addr.
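
A minimal sketch of the up-front check, assuming a trimmed config struct with just the two health fields. HealthConfig and its field names are hypothetical; net.SplitHostPort is the standard-library call mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"time"
)

// HealthConfig is a hypothetical subset of the SV2 config.
type HealthConfig struct {
	Interval  time.Duration
	CheckAddr string
}

// Validate rejects settings that would leave the health monitor idle and
// therefore keep every proxied route rejected.
func (c HealthConfig) Validate() error {
	if c.Interval <= 0 {
		return errors.New("proxy health interval must be positive")
	}
	if _, _, err := net.SplitHostPort(c.CheckAddr); err != nil {
		return fmt.Errorf("proxy health check addr %q: %w", c.CheckAddr, err)
	}
	return nil
}

func main() {
	fmt.Println(HealthConfig{Interval: 10 * time.Second, CheckAddr: "127.0.0.1:34255"}.Validate())
	fmt.Println(HealthConfig{Interval: 0, CheckAddr: "127.0.0.1:34255"}.Validate())
	fmt.Println(HealthConfig{Interval: 10 * time.Second, CheckAddr: "127.0.0.1"}.Validate())
}
```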

Same shape as the earlier PoolSelectionPage.test.tsx fix: the new
ValidatePoolOutcome contract requires `credentialsVerified=true` for
the success callout/toast to fire. The two tests broken by the latest
push (MiningPools settings + PoolSelectionModal) were calling
onSuccess() with no args, so the unverified-but-reachable branch
fired and the success assertion failed.

Updated both mocks to pass a fully-verified SV1_AUTHENTICATE outcome.
…count check on identifier resolve

Eleventh-round Codex security review.

[MED] PreviewMiningPoolAssignment caps device count.
- The preview RPC materializes one DevicePoolPreview per targeted
  miner in a single unary response and the UI auto-fires it on every
  pool/scope edit. With no server-side cap a large-fleet preview is a
  synchronous resource-exhaustion path: full ID resolution + capability
  lookup + preflight + response materialization in memory, repeatable
  by an authenticated caller.
- Added maxPreviewDevices=1000 with INVALID_ARGUMENT when exceeded.
  The error message points operators at narrowing the selector or
  using UpdateMiningPools (which evaluates the same preflight rules
  without per-device detail in the response).

[MED] Hostname miner URLs no longer silently widen to 0.0.0.0.
- The previous fix bound the listener to the miner URL's host IP
  when that host was an IP literal, but fell back to 0.0.0.0 for
  hostnames. On a multi-homed / internet-facing host that exposed
  the unauthenticated translator on every NIC.
- Installer now fails closed when the miner URL uses a hostname:
  prompts for an explicit IPv4/IPv6 bind address (or accepts 0.0.0.0
  only when the operator deliberately types it). Already-saved
  STRATUM_V2_PROXY_DOWNSTREAM_HOST in .env is honored on rerun.

[LOW] resolveDeviceIdentifiers fails closed on row-count mismatch.
- The second lookup (internal-IDs → identifiers) ran without a count
  check, so a device disappearing between getDeviceIDs and the
  identifier resolution would silently shrink the target set. Pool
  updates would then only repoint a subset while activity logging
  still recorded the original device count.
- Mirror the include_devices strictness: return FAILED_PRECONDITION
  when the SQL row count doesn't match the input length.
…tails, image-pin doc

Twelfth-round Codex security review.

[HIGH] Preview cap no longer locks operators out of pool assignment.
- The server's >1000-device preview rejection used INVALID_ARGUMENT,
  and the client folded any preview error into hasMismatch which
  disables Save. Operators with large fleets couldn't assign pools
  through the UI at all.
- New PreviewSkipReason proto enum: SIZE_EXCEEDED returns previews=[]
  with skipped_reason=SIZE_EXCEEDED instead of an error.
  usePoolAssignmentPreview surfaces a typed `previewSkipped` flag
  separate from hasMismatch, and PoolSelectionPage shows a warning
  callout explaining commit-time preflight will catch any real
  mismatch. Save stays enabled.

[MED] Commit-path FAILED_PRECONDITION details are bounded.
- mismatchesToFailedPrecondition now caps the
  UpdateMiningPoolsMismatch detail payload at 100 entries (the rest
  are counted in the summary message but not materialized). Without
  this, a bad pool assignment against a large fleet could allocate
  thousands of protobuf details into a single Connect-RPC response.
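
Bounding the detail payload while still reporting the full count could look roughly like this. The Mismatch struct, the boundedDetails name, and the summary wording are illustrative assumptions; only the cap of 100 comes from the note above.

```go
package main

import "fmt"

// maxMismatchDetails is the cap described above.
const maxMismatchDetails = 100

// Mismatch is a hypothetical stand-in for preflight.Mismatch.
type Mismatch struct {
	DeviceID string
	Slot     int
}

// boundedDetails returns at most maxMismatchDetails entries to attach as
// error details, plus a summary that still reports the full count.
func boundedDetails(all []Mismatch) ([]Mismatch, string) {
	details := all
	if len(details) > maxMismatchDetails {
		details = details[:maxMismatchDetails]
	}
	summary := fmt.Sprintf("%d pool/device mismatches (%d shown)", len(all), len(details))
	return details, summary
}

func main() {
	all := make([]Mismatch, 250)
	for i := range all {
		all[i] = Mismatch{DeviceID: fmt.Sprintf("dev-%d", i), Slot: 0}
	}
	details, summary := boundedDetails(all)
	fmt.Println(len(details), summary)
}
```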

[MED] Static SV2 capability lookup is batched.
- staticSV2CapsProvider replaced its per-device
  GetDeviceByDeviceIdentifier loop with one batched
  GetDriverNamesByDeviceIdentifiersForOrg query plus per-driver
  capability caching. A 1000-device preview now does 1 DB call
  instead of 1000+, and per-driver plugin lookups collapse to O(distinct
  drivers).

[MED] Image-pinning trade-off documented.
- ghcr.io/stratum-mining/translator stays tag-pinned for v1. A sha256
  digest can't be verified from this environment without a successful
  pull, and writing a fake digest would brick the deploy. Added an
  inline KNOWN LIMITATION comment on the compose service pointing at
  the v1 follow-up to digest-pin once the supply-chain verification
  flow is in place.
…r, dispatch-time health gate

Thirteenth-round Codex security review.

[HIGH] Translator image pinned by both tag and sha256 digest.
- The compose `image:` was `ghcr.io/stratum-mining/translator:1.5.1`,
  which doesn't actually exist (404 on GHCR). The real image lives at
  Docker Hub: `stratumv2/translator_sv2`, published by sv2-apps's
  release workflow. Switched to that path and pinned both the
  readable tag (v0.3.4) and the immutable sha256 digest. The digest
  is the trust anchor; the tag is informational. Bumping the image
  means deliberately reviewing both fields against
  https://hub.docker.com/v2/repositories/stratumv2/translator_sv2/tags.
- Plan doc + README updated with the correct image path and the
  digest-lookup procedure.

[MED] Health probe address tracks the configured bind.
- The installer's downstream-rendering path was rewriting only the
  port of STRATUM_V2_PROXY_HEALTH_ADDR while leaving the host at
  127.0.0.1. When the operator scoped the listener to a specific NIC
  (via STRATUM_V2_PROXY_DOWNSTREAM_HOST), Fleet's TCP probe targeted
  loopback while compose published the port on a different IP — the
  probe failed forever and effectiveProxyConfig rejected every
  proxied assignment.
- Renderer now writes ${downstream_bind}:${port} into HEALTH_ADDR,
  with 0.0.0.0 falling through to 127.0.0.1 (loopback always reaches
  a wildcard-bound listener). Specific binds get probed at the same
  IP miners reach.

[MED] Dispatch worker re-checks proxy health.
- ExecutionService now mirrors the proxy URL + ProxyHealthChecker
  from command.Service. UpdateMiningPools dispatch examines the
  per-device payload's slot URLs; if any slot equals the configured
  proxy MinerURL AND the health monitor reports down (or has no
  state), it returns FAILED_PRECONDITION instead of pushing the
  payload. The queue treats this as a transient per-device failure
  so a translator outage between commit and dispatch doesn't take
  the affected miners off-pool while the rest of the batch succeeds.
…-level CRUD

Fourteenth-round Codex security review.

[MED] UpdateMiningPools rejects requests over maxCommitDevices=5000.
- The commit path materializes per-device JSON payloads in memory,
  runs preflight across the full set, and writes per-device queue rows
  in one unary RPC. Without a cap, an authenticated caller could
  spike CPU/memory by triggering a fleet-wide pool change against a
  very large org. Cap is higher than the preview cap (which exists
  because the UI auto-fires it on every edit) so legitimate fleet
  rollouts still work; operators above the cap need to scope the
  selector and run multiple updates.
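The cap check itself is simple; a sketch, assuming it runs before any per-device payload is materialized. Only the constant name and value come from this message; the function name `checkCommitSize` and the error text are illustrative, and the status code the real handler returns is not stated here.

```go
package main

import "fmt"

// maxCommitDevices is the cap named in this commit message.
const maxCommitDevices = 5000

// checkCommitSize is a hypothetical guard: it rejects a commit whose
// resolved device set is over the cap, before payloads are built.
func checkCommitSize(deviceCount int) error {
	if deviceCount > maxCommitDevices {
		return fmt.Errorf(
			"update targets %d devices, above the cap of %d; narrow the selector and run multiple updates",
			deviceCount, maxCommitDevices)
	}
	return nil
}
```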

[LOW] CreatePool / UpdatePool validate URL scheme.
- pools.Service relied on CEL alone for URL-scheme validation, which
  any non-Connect caller can bypass. dbProtocolFromURL() then
  silently coerced unknown schemes to 'sv1', so an imported row with
  e.g. stratum2+ssl:// would persist as SV1 and skip the SV2
  preflight from then on.
- Both methods now run rewriter.ProtocolFromURL on the supplied URL
  (UpdatePool only when r.Url != nil, matching the patch contract)
  and return INVALID_ARGUMENT on unrecognised schemes.
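A sketch of the strict scheme check, in contrast to the old silent coercion to 'sv1'. The accepted scheme list is an assumption (only `stratum2+tcp` is confirmed as SV2 in this PR; `stratum+tcp` is assumed for SV1), and `protocolFromURL`, `validatePatchURL`, and `ErrUnrecognisedScheme` are stand-ins rather than the real rewriter API.

```go
package main

import (
	"errors"
	"net/url"
	"strings"
)

type Protocol string

const (
	ProtocolSV1 Protocol = "sv1"
	ProtocolSV2 Protocol = "sv2"
)

// ErrUnrecognisedScheme stands in for INVALID_ARGUMENT (hypothetical name).
var ErrUnrecognisedScheme = errors.New("invalid argument: unrecognised pool URL scheme")

// protocolFromURL maps a pool URL to a protocol and fails on anything it
// does not recognise instead of defaulting to SV1.
func protocolFromURL(raw string) (Protocol, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", ErrUnrecognisedScheme
	}
	switch strings.ToLower(u.Scheme) {
	case "stratum+tcp": // assumed SV1 scheme
		return ProtocolSV1, nil
	case "stratum2+tcp":
		return ProtocolSV2, nil
	default:
		return "", ErrUnrecognisedScheme
	}
}

// validatePatchURL follows the patch contract: a nil URL means the field
// was not supplied, so there is nothing to validate.
func validatePatchURL(u *string) error {
	if u == nil {
		return nil
	}
	_, err := protocolFromURL(*u)
	return err
}
```

With this shape, a URL such as `stratum2+ssl://...` fails at write time rather than persisting as SV1.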
…Save, document config-drift limitation

Fifteenth-round Codex security review.

[MED] SV1 credential failures restore non-OK gRPC status.
- The previous round changed the SV1 path so cred failures returned
  200 OK with credentials_verified=false. That broke the v0 contract
  for any client (cached browser bundle, third-party tooling) that
  treats a fulfilled call as "validation succeeded" — invalid creds
  could be silently accepted.
- Handler now returns CodePermissionDenied when
  Mode=SV1_AUTHENTICATE and CredentialsVerified=false. SV2 paths
  still use the typed-success body (no "credentials" to verify on
  TCP_DIAL; HANDSHAKE proves identity pinning, not auth) so the
  reachable-but-unverified UX from the earlier finding stays intact
  for SV2.
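The restored status mapping can be sketched as below. The types are simplified stand-ins for the generated proto messages, and `ErrPermissionDenied` substitutes for the CodePermissionDenied status so the example stays stdlib-only; none of these names are taken from the actual handler.

```go
package main

import "errors"

// ValidationMode is a simplified stand-in for the proto enum.
type ValidationMode int

const (
	ModeSV1Authenticate ValidationMode = iota
	ModeSV2TCPDial
	ModeSV2Handshake
)

// ValidationResult is a simplified stand-in for the service result.
type ValidationResult struct {
	Mode                ValidationMode
	CredentialsVerified bool
}

// ErrPermissionDenied stands in for CodePermissionDenied.
var ErrPermissionDenied = errors.New("permission denied: pool rejected the supplied credentials")

// statusForResult returns a non-OK status only for SV1 credential
// failures; SV2 results keep the typed-success body.
func statusForResult(r ValidationResult) error {
	if r.Mode == ModeSV1Authenticate && !r.CredentialsVerified {
		return ErrPermissionDenied
	}
	return nil
}
```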

[MED] Preview transport failures no longer block Save.
- usePoolAssignmentPreview's hasMismatch now keys solely off real
  per-device/per-slot warnings. A timeout / abort / 5xx during
  preview leaves error set (and the inline error UI still shows it)
  but doesn't disable Save. Commit-time preflight remains
  authoritative — a transient network blip can't lock operators out
  of urgent pool rotations now.

[HIGH/MED docs] Documented two known v1 limitations:
- Proxy upstream config has two views (`.env` for Fleet, tproxy.toml
  for the translator). install.sh derives both from a single URL;
  manual edits of one without the other are an operator footgun
  rather than a runtime bypass. Future work: have fleet-api read
  the mounted TOML at startup and refuse to start on disagreement.
- Migration 000038 backfills only stratum2+tcp:// URLs as SV2.
  Legacy rows with unsupported schemes (e.g. stratum+ssl://) stay
  at the sv1 default and are invisible to the SV2 preflight until
  an operator edits them. New writes are blocked by the
  service-level URL validation added this round.
…lback

Sixteenth-round Codex security review.

[MED] SV2 pool validation no longer downgrades to TCP reachability.
- The previous flow accepted a stratum2+tcp:// URL without a Noise
  key and fell through to sv2.TCPDial. That meant any authenticated
  caller could use ValidatePool as a generic host:port reachability
  scanner against any address reachable from the API server.
- pools.Service.ValidateConnection now rejects SV2 requests without
  a 32-byte noise_public_key (INVALID_ARGUMENT). The handshake probe
  pins identity by completing Noise NX with the supplied key; that's
  the only SV2 validation path in v1.
- Plan doc + sv2/README.md updated: Known Limitation §8 reflects the
  no-TCP-fallback contract; the README's operator-facing note now
  says the key is mandatory and explains why.
- The SV2_TCP_DIAL ValidationMode value stays in the proto for
  wire-compat, but the server no longer returns it. Stale client
  branches that switch on it are now harmless dead code.
… in saved SV2 URLs

Seventeenth-round Codex security review.

[HIGH] Restore the no-key TCP-dial fallback for SV2 ValidatePool.
- The previous round hard-rejected SV2 ValidatePool requests without
  a Noise key, which made the shipped client (which only collects
  URL/username/password and leaves noise_public_key optional) unable
  to test SV2 pools at all.
- ValidateConnection accepts both probes again: HANDSHAKE when a
  32-byte key is supplied, TCP_DIAL otherwise. The reachability-only
  fallback is bounded by the URL CEL (only stratum2+tcp:// schemes,
  explicit port required) but is documented as a v1 trade-off in
  Known Limitation §8 — the SSRF concern from round 16 is still
  there, just acknowledged as accepted-for-v1 risk.
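The probe selection restored in this round reduces to a length check on the supplied key. A sketch, with `selectProbe` and the `Probe` type as hypothetical names; it follows the "32-byte key or TCP_DIAL otherwise" wording literally, so a malformed non-empty key also falls back to TCP_DIAL here, which the real service may handle differently.

```go
package main

// noiseKeyLen is the key size named in this message.
const noiseKeyLen = 32

// Probe is an illustrative stand-in for the ValidationMode values.
type Probe string

const (
	ProbeHandshake Probe = "HANDSHAKE"
	ProbeTCPDial   Probe = "TCP_DIAL"
)

// selectProbe picks the Noise handshake probe when a 32-byte key is
// supplied and the reachability-only probe otherwise.
func selectProbe(noisePublicKey []byte) Probe {
	if len(noisePublicKey) == noiseKeyLen {
		return ProbeHandshake
	}
	return ProbeTCPDial
}
```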

[MED] Saved SV2 pool URLs must include the /AUTHORITY_PUBKEY suffix.
- PoolConfig / UpdatePoolRequest / RawPoolInfo CEL rules now require
  the pubkey suffix on stratum2+tcp:// URLs (was optional). Without
  this, an operator could save a bare URL that matches a configured
  proxy upstream on host:port but disagrees on the pinned identity,
  and rewriter.sameStratumURL would reject the route at commit time
  with PROXY_UPSTREAM_MISMATCH — turning an obvious save-time bug
  into a deferred runtime failure.
- ValidatePoolRequest stays optional on the suffix because that path
  pairs the URL with an explicit noise_public_key field for the
  handshake probe.
- New Known Limitation §9 documents the suffix requirement.
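The real rule is expressed in CEL on the proto messages; the Go function below only illustrates the shape of the check for a saved SV2 URL. `savedSV2URLHasPubkey` is a hypothetical name, and the sketch does not validate the key's encoding or length, only that a single non-empty path segment is present.

```go
package main

import (
	"net/url"
	"strings"
)

// savedSV2URLHasPubkey reports whether raw looks like
// stratum2+tcp://host:port/AUTHORITY_PUBKEY: SV2 scheme, explicit port,
// and exactly one non-empty path segment for the pinned key.
func savedSV2URLHasPubkey(raw string) bool {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme != "stratum2+tcp" {
		return false
	}
	if u.Port() == "" {
		return false // explicit port required
	}
	key := strings.TrimPrefix(u.Path, "/")
	return key != "" && !strings.Contains(key, "/")
}
```

A bare `stratum2+tcp://pool.example.com:3336` fails this check at save time, which is the point of the change: the mismatch surfaces before commit rather than as PROXY_UPSTREAM_MISMATCH later.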
…ime preflight

The read-only preview RPC paralleled the commit-time preflight to disable
Save before the user clicked. The same shared preflight already runs inside
UpdateMiningPools and rejects mismatches synchronously with FAILED_PRECONDITION
carrying the typed UpdateMiningPoolsMismatch detail, so the preview duplicated
behavior the commit path covers — and added an unbounded RPC surface that
was attracting recurring security-review findings (N+1 lookups, race
conditions, size-cap evasions). SV1 pool assignment never had a preview either.

Removes the RPC, message types, RewriteReason enum, client hook, UI callouts,
and the e2e preview subtests; keeps SlotWarning/DeviceWarning + the typed
mismatch detail (still consumed by the commit response). Updates the plan
doc to reflect the no-preview decision.
- proto/minercommand/v1/command.proto no longer references any pools.v1
  symbol after removing the SlotPreview message; buf lint flagged the
  import as unused.
- regenerated TS protobuf bindings hadn't been run through prettier; fix
  format-check.
CI's generated-code-check pulls stdlib imports above third-party imports
in the generated Go bindings; the previous regen ran goimports only on
the handful of files we directly touched. Run it across the full
generated tree so commit content matches CI output.

Labels

automation, client, dependencies, documentation, javascript, server, shared

2 participants