Release: develop → main #77

Merged: jeffersonBastos merged 246 commits into main from develop on Jun 17, 2026

Conversation

lgahdl (Contributor) commented May 29, 2026

Summary

Promotes the full develop state to main (~246 commits). This is a large release consolidating the grant-review milestone work, performance hardening, and public-handover preparation. Changes are grouped by theme below; per-finding detail lives in grant-review-fix-tracking.md.

What's included

Grant review fixes (F1–F21)

All findings from the external grant review are addressed (consolidated via #102). Highlights:

  • Correctness: per-block cap + batch upsert in OrderStatusTracker; TWAP parts aged out of /by_uids recovered via /account/{owner}/orders fallback; cascade-cancelled preflight check; settlement RPC calls wrapped in withTimeout; Gnosis flash-loan tracking wired (was mainnet-only).
  • Schema/perf: composite indexes for the poll/status/stale-sweep query patterns.
  • API: on-chain hash exposed in GET /api/orders/by-owner; GET /api/sync-progress endpoint.
  • Observability: structured JSON logging; K8s liveness/readiness probe docs.

Performance improvements (COW-1017 umbrella)

Block-handler interval staggering per chain, bulk multi-row inserts, bootstrap retry/no-op handling, and chunked inserts — verified lag-free at chain tip in production.

New order types (COW-1006)

Curve/Balancer fee burners and CoW AMM constant-product pools classified as named types instead of Unknown.

Multi-chain config + engine

Per-chain config under src/chains/ derived from the CoW SDK chain catalog; Ponder upgraded to 0.16.6 with a flush/savepoint patch for multichain stability.

Production hardening (COW-1013)

Orderbook 429/5xx retry with bounded backoff (OrderbookUnavailableError, observable via ob:unavailable); container runs as non-root.

Public-handover cleanup (COW-1012)

Scrubbed internal references, removed generic AI scaffolding and dead code, retired the internal C1–C5 handler naming in favor of semantic names, and fixed doc drift.

CI / deploy

Adds a Docker image build workflow. Production deploy is now manual-only (workflow_dispatch), so merging to main no longer triggers a destructive redeploy.

Validation

Validated against the live production instance (fully synced on both chains, /ready=200, zero decode errors across ~14.8k generators, flash-loan mappings live on Gnosis, 0 error log lines). The grant-review findings reproduce against production — see grant-review-fix-tracking.md for the per-finding "how tested" detail.

lgahdl and others added 30 commits May 28, 2026 10:20
The by_uids endpoint returns [{order: {...}}] but the code was treating
it as a flat OrderbookOrder[]. This caused order.uid to be undefined,
so fetchOrderStatusByUids returned an empty map for all candidates,
preventing C2 from ever promoting candidateDiscreteOrders to discreteOrders.
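
To make the shape mismatch concrete, here is a minimal sketch of the unwrap. The two-field `OrderbookOrder` and the `toStatusMap` helper are hypothetical simplifications for illustration, not the project's actual types or its `fetchOrderStatusByUids` code.

```typescript
// Hypothetical, simplified order shape; the real type carries more fields.
interface OrderbookOrder {
  uid: string;
  status: string;
}

// Each element of the by_uids response wraps the order under an `order` key.
interface ByUidsEntry {
  order: OrderbookOrder;
}

// Build a uid -> status map, unwrapping each entry before reading its fields.
function toStatusMap(entries: ByUidsEntry[]): Map<string, string> {
  const statuses = new Map<string, string>();
  for (const { order } of entries) {
    statuses.set(order.uid, order.status);
  }
  return statuses;
}

const sampleResponse: ByUidsEntry[] = [
  { order: { uid: "0xabc", status: "fulfilled" } },
  { order: { uid: "0xdef", status: "open" } },
];
const sampleStatuses = toStatusMap(sampleResponse);
```

Reading `entry.uid` directly on the wrapped element yields `undefined`, which is how the map ended up empty before the fix.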
…ves (COW-977)

Replace manage.sh, deploy-remotely.sh, and static/start-db.sh with tsx scripts
and inline compose config. Adds deploy:up/down/remote pnpm scripts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… (COW-978)

Extract per-chain addresses into src/chains/{mainnet,gnosis,arbitrum}.ts with
a central ACTIVE_CHAINS index. ponder.config.ts now derives all config with no
hardcoded addresses. Toggling a chain requires one line in src/chains/index.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verify the { order } unwrap fix with a real HTTP server: 6 tests covering
correct uid→status mapping, executed amounts, multiple orders, HTTP errors,
and empty responses. Adds ponder/ponder:schema vitest stubs to resolve
virtual module imports without a running Ponder process.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Use SupportedChainId from @cowprotocol/cow-sdk for chainId typing in ChainConfig
- Derive contractPollerInterval from blockTime via pollerInterval() helper (~20s target)
- Add arbitrum, base, and sepolia chain config files (cowShedFactory/gpv2Settlement
  marked null until addresses are confirmed)
- Export ALL_DEFINED_CHAINS for ORDERBOOK_API_URLS; ACTIVE_CHAINS stays mainnet+gnosis
- Make cowShedFactory nullable in ChainConfig; filter in ponder.config.ts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
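
One way the block-time derivation could look, as a sketch only: the commit names a `pollerInterval()` helper and a ~20s target, but the body below (interval expressed in blocks, rounded, minimum of one) is an assumption rather than the project's code.

```typescript
// Target wall-clock time between polls, taken from the "~20s target" above.
const TARGET_POLL_SECONDS = 20;

// Assumed behaviour: return how many blocks to wait between polls so that
// the wall-clock gap lands near the target, never less than one block.
function pollerInterval(blockTimeSeconds: number): number {
  return Math.max(1, Math.round(TARGET_POLL_SECONDS / blockTimeSeconds));
}

// Approximate block times in seconds; illustrative values only.
const exampleIntervals = {
  mainnet: pollerInterval(12),
  gnosis: pollerInterval(5),
  arbitrum: pollerInterval(0.25),
};
```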
…eteorder-fulfilled-not-promoted-to

fix: unwrap order wrapper from POST /orders/by_uids response (COW-979)
Clarify that manage.ts and deploy-remotely.ts are tailored to Bleu's
internal deployment workflow, as requested in PR review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…pts-in-deployment-with-typescriptnode-or

feat: replace deployment shell scripts with TS/Node alternatives (COW-977)
Returns per-chain historical sync progress as clean JSON:
totalBlocks, processedBlocks, progressPct (0-100), isRealtime,
isComplete. Reads from Ponder's Prometheus /metrics endpoint using
the request origin so it works on any port. Registered in OpenAPI/Swagger.

6 integration tests covering: status code, chain entries, processedBlocks
calculation, progressPct rounding, realtime/complete flags, and graceful
degradation when /metrics is unreachable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
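
The per-chain entry described above can be sketched as follows. The field names come from the commit message; the `toSyncProgress` helper, the two-decimal rounding, and the zero-total handling are assumptions made for illustration.

```typescript
// Per-chain entry, using the field names listed in the commit message.
interface ChainSyncProgress {
  totalBlocks: number;
  processedBlocks: number;
  progressPct: number; // 0-100
  isRealtime: boolean;
  isComplete: boolean;
}

// Hypothetical helper: derive progressPct from the two block counters.
// Rounding to two decimals and clamping to [0, 100] are assumed choices.
function toSyncProgress(
  totalBlocks: number,
  processedBlocks: number,
  isRealtime: boolean,
  isComplete: boolean,
): ChainSyncProgress {
  const raw = totalBlocks > 0 ? (processedBlocks / totalBlocks) * 100 : 0;
  const progressPct = Math.min(100, Math.max(0, Math.round(raw * 100) / 100));
  return { totalBlocks, processedBlocks, progressPct, isRealtime, isComplete };
}
```

Clamping keeps the value inside the documented 0-100 range even if the underlying counters briefly disagree.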
…W-985)

Add non-null assertions for Record<string, T> index access under noUncheckedIndexedAccess.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
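
For readers unfamiliar with the flag, a small sketch of what `noUncheckedIndexedAccess` changes for `Record<string, T>` lookups. The map and helper names are hypothetical; the `"mainnet"` and `"xdai"` path values appear elsewhere in this PR.

```typescript
// Path suffixes keyed by chain id (1 = Ethereum mainnet, 100 = Gnosis).
const EXAMPLE_ORDERBOOK_PATHS: Record<string, string> = {
  "1": "mainnet",
  "100": "xdai",
};

// Under noUncheckedIndexedAccess the lookup has type `string | undefined`.
// A non-null assertion is acceptable when the key is known to exist:
const mainnetPath: string = EXAMPLE_ORDERBOOK_PATHS["1"]!;

// For keys that come from outside, an explicit check fails loudly instead.
function orderbookPathFor(chainId: number): string {
  const path = EXAMPLE_ORDERBOOK_PATHS[String(chainId)];
  if (path === undefined) {
    throw new Error(`no orderbook path configured for chain ${chainId}`);
  }
  return path;
}
```

The assertion silences the compiler without adding a runtime check, so the explicit branch is the safer choice wherever the key is not a literal.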
… (COW-976) (#68)

Moves deployment/docker-compose.yml to the root docker-compose.yml,
unifying dev and prod postgres into a single service. Production containers
run under the "deploy" profile. Updates manage.ts and deployment docs
accordingly. Default postgres credentials are env-var-substituted dev
fallbacks, not real secrets.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
deploy-remotely.sh was deleted by COW-977. Add Node/pnpm setup steps
and switch the run command to npx tsx deployment/deploy-remotely.ts.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Builds and pushes the image to ghcr.io on every push to main.
Tags: full git SHA (for Kubernetes to pin), branch name, and latest.
Uses GitHub Actions layer cache to speed up repeat builds.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…84) (#73)

Add "Is it working?" section explaining what to check during and after
backfill, how to distinguish stuck from slow, and what each health
endpoint signals. Addresses client feedback that the README should be
fool-proof for operators with no Ponder background.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: derive active chain descriptions from CHAIN_NAMES in data.ts (COW-982)

Add CHAIN_NAMES map to src/data.ts as single source of truth for chain
labels. ChainIdQuery description now derives from it dynamically. Update
docs/api-reference.md and docs/architecture.md to remove hardcoded
chain references and point to src/data.ts instead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: verify CHAIN_NAMES drives ChainIdQuery description (COW-982)

4 tests: CHAIN_NAMES has expected entries, all names are non-empty,
ChainIdQuery.description contains every id+name pair from CHAIN_NAMES,
and the description is not the old hardcoded string.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: derive CHAIN_NAMES from cow-sdk, drop chain-names test (COW-982)

Addresses reviewer feedback: import chain labels from getChainInfo() instead
of hardcoding, and remove the low-value chain-names unit test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
#72)

* docs: document /ready /status /metrics endpoints and k8s probes (COW-983)

Add /ready, /status, and /metrics to the endpoints table with full
descriptions. Document Kubernetes liveness/readiness probe config.
Explain /status response shape and what "stuck" looks like vs normal
backfill. Addresses client feedback about underdocumented health endpoints.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add unit tests for GET /healthz endpoint (COW-983)

3 tests: verifies 200 status, {"status":"ok"} body, and JSON content-type.
Matches the behaviour documented in docs/api-reference.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* refactor: remove redundant healthz test and probe doc sections (COW-983)

Addresses reviewer feedback: endpoint descriptions in the table are
sufficient; the dedicated sub-sections and the healthz unit test are overhead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…W-978)

Every chain in cow-sdk's ALL_SUPPORTED_CHAIN_IDS now has a config file in
src/chains/. New stubs (BNB, Polygon, Lens, Plasma, Avalanche, Ink, Linea)
follow the same pattern as existing stubs — composableCow address is the
known CREATE2 deployment; all other addresses are null with TODOs pointing
to the relevant block explorer. None are active until addresses are confirmed
(tracked in COW-986).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add entry in REST endpoints section and a dedicated subsection with
example response and field descriptions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…der-config-per-chain-files-under-srcchains' into luizhatem/cow-986-support-all-cow-protocol-chain-ids-bnb-ink-linea-etc
Fix deploy-remotely.sh/.ts, manage.sh/.ts, remove deleted static/start-db.sh,
remove duplicate API Endpoints section, update chain config references from
src/data.ts to src/chains/index.ts, and update Adding a New Chain steps.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…(COW-986)

Add verified ComposableCoW deployment blocks where confirmed on-chain via
cowprotocol/composable-cow networks.json cross-referenced with block explorers
and public RPC nodes. Also verify orderbook API URLs against cow-sdk
ORDER_BOOK_PROD_CONFIG source.

Verified startBlocks:
- Arbitrum (42161):  204751436 (networks.json + arbiscan)
- Base (8453):        21794150 (basescan, 2024-10-31)
- Sepolia (11155111):  5072748 (networks.json + sepolia.etherscan.io, 2024-01-12)
- BNB (56):           48433175 (networks.json + bscscan.com, 2025-04-17)
- Polygon (137):      70406888 (polygonscan.com, 2025-04-17)
- Lens (232):          3516559 (networks.json + rpc.lens.xyz, 2025-09)
- Plasma (9745):       4810535 (networks.json + rpc.plasma.to)
- Avalanche (43114):  60434336 (snowscan.xyz, 2025-04-17)
- Ink (57073):        34878187 (Blockscout API + rpc-gel.inkonchain.com)
- Linea (59144):      25028474 (networks.json + lineascan.build)

Verified orderbook API URLs (all return HTTP 200 from api.cow.fi):
- bnb, polygon, plasma, avalanche, ink, linea — confirmed active
- lens — NOT yet in cow-sdk ORDER_BOOK_PROD_CONFIG; api.cow.fi/lens returns 404

Also restore CHAIN_NAMES export in src/data.ts (removed during merge conflict
resolution) — derived from ACTIVE_CHAINS.name instead of the old getChainInfo().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rbookApiUrl to orderbookApiPath (COW-978)

- Remove contractPollerInterval field from ChainConfig; compute inline in ponder.config.ts via pollerInterval()
- Rename orderbookApiUrl -> orderbookApiPath storing only the path suffix (e.g. "mainnet", "xdai")
- Update src/data.ts to construct full URL from path: https://api.cow.fi/${c.orderbookApiPath}
- Update all 12 chain files accordingly

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…978)

Replace placeholder startBlocks (0 / rough estimates) with values verified
from cowprotocol/composable-cow networks.json and block explorers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…b-ink-linea-etc' into luizhatem/cow-978-modularize-ponder-config-per-chain-files-under-srcchains

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…2Settlement for 5 chains (COW-978)

All addresses confirmed on-chain via ROUTER() call. Deployment blocks verified
by binary search on each chain's RPC.

Chains: arbitrum, base, avalanche, linea, polygon

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
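
The binary search mentioned above can be sketched like this. The commit does not show the script that was used, so `findDeploymentBlock` and its predicate are hypothetical; in practice the predicate would be an asynchronous RPC check (for example, whether `eth_getCode` returns non-empty bytecode at that block), kept synchronous here so the search logic is easy to follow.

```typescript
// Find the first block at which `isDeployedAt` becomes true, assuming it is
// false for every block before deployment and true from deployment onward.
// Returns null when the contract is not deployed at `latestBlock` either.
function findDeploymentBlock(
  latestBlock: number,
  isDeployedAt: (block: number) => boolean,
): number | null {
  if (!isDeployedAt(latestBlock)) return null;
  let lo = 0;
  let hi = latestBlock;
  while (lo < hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (isDeployedAt(mid)) {
      hi = mid; // deployed at mid, so the first such block is mid or earlier
    } else {
      lo = mid + 1; // not deployed yet, so the first such block is after mid
    }
  }
  return lo;
}

// Simulated chain: a made-up contract that first exists at block 123456.
const simulatedDeploymentBlock = findDeploymentBlock(1_000_000, (b) => b >= 123_456);
```

Each probe halves the range, so a chain with hundreds of millions of blocks needs only around 30 lookups.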
…2Settlement for BNB and Plasma (COW-978)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
feat: modularize ponder config with per-chain src/chains/ (COW-978)
feat: add GET /api/sync-progress endpoint (COW-985)
docs: fix stale references and remove duplicated content (COW-987)
…d on RPC provisioning

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lgahdl and others added 21 commits June 16, 2026 00:43
Sequential batches of 50 UIDs meant 10 HTTP calls × ~1s = ~10s per
precomputed generator. For 5 ConditionalOrderCreated events in one block,
this stacked to 50s of HTTP wait time.

Fire all chunks with Promise.all so N chunks take the time of the
slowest one (~1s) instead of N × one (~10s for 500 UIDs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
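
A minimal sketch of the sequential-to-concurrent change, assuming a hypothetical `fetchChunk` callback that stands in for one POST to the by_uids endpoint. The helper names are illustrative and not taken from the codebase.

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Start every chunk request at once and wait for all of them, so total
// latency is roughly that of the slowest chunk rather than the sum.
async function fetchAllChunks<R>(
  uids: string[],
  chunkSize: number,
  fetchChunk: (uids: string[]) => Promise<R[]>,
): Promise<R[]> {
  const perChunk = await Promise.all(chunk(uids, chunkSize).map((c) => fetchChunk(c)));
  // Promise.all preserves input order, so results line up with the uids.
  return ([] as R[]).concat(...perChunk);
}
```

With 500 UIDs and a chunk size of 50 this issues 10 requests in parallel. The trade-off is a burst of simultaneous calls against the API, which makes rate-limit handling more relevant.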
…th 0 orders

After fetching composable orders for an owner and finding none, mark all
generators for that owner with lastPollResult='bootstrap:noop'. The
OwnerBackfill query now excludes these generators, preventing 30-40s
sequential HTTP fetches from repeating on every indexer restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds OwnerBackfill noop optimization: skip owners already bootstrapped
with 0 orders to eliminate 30-40s block stalls on indexer restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The deploy script receives ../.env as the env file path, expecting the
file one level above the repo root. The workflow was writing it to the
repo root (.env) instead, causing scp to fail with 'no such file'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Non-interactive SSH sessions don't source the user's profile, so npx
is not found when Node.js is installed via NVM. bash -lc sources login
profile and makes the node/npx binaries available.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
manage.ts requires Node.js on the remote host which is not installed.
Rewrite the deploy workflow to run docker compose commands directly via
SSH, loading env from the .env file already scp'd to the server.
deploy-remotely.ts kept for local deploys; only the CI workflow changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iner

'stop' leaves the container in stopped state; the subsequent 'up' then
fails with a name conflict. 'rm -sf' stops and removes it so 'up' can
create a fresh container with the new image.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old container may have been started with docker run rather than
docker compose, making 'docker compose rm' a no-op. Grep running and
stopped containers by name prefix and force-remove them before rebuild.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…verflow

Large TWAPs (e.g. 3277 parts × 9 cols = 29,493 params) cause
node-postgres to silently drop bind parameters, generating:
  "bind message has 29493 parameter formats but 0 parameters"
This stalls the indexer indefinitely on the affected block.

Fix: chunk candidateRows and discreteRows at 500 rows per INSERT so
each query stays well under the practical ~29k parameter limit.

Also:
- Add PONDER_EXPERIMENTAL_DB=platform to docker-compose.yml so future
  deploys can resume from existing checkpoints without a schema drop
- Add skip_schema_drop input to deploy workflow for hot-fix deploys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
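
The arithmetic behind the 500-row chunking can be checked with a short sketch. `chunkRows` and `insertInChunks` are hypothetical names, and `insertMany` stands in for whatever multi-row INSERT the indexer actually issues.

```typescript
const ROWS_PER_INSERT = 500; // chunk size from the fix
const COLUMNS_PER_ROW = 9; // column count from the example above

// Split rows into batches of at most `size` rows.
function chunkRows<T>(rows: T[], size: number = ROWS_PER_INSERT): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    batches.push(rows.slice(i, i + size));
  }
  return batches;
}

// Issue one INSERT per batch, in order; returns the number of statements.
async function insertInChunks<T>(
  rows: T[],
  insertMany: (batch: T[]) => Promise<void>,
): Promise<number> {
  const batches = chunkRows(rows);
  for (const batch of batches) {
    await insertMany(batch);
  }
  return batches.length;
}

// The 3277-part TWAP from the commit message, as placeholder rows.
const twapParts = Array.from({ length: 3277 }, (_, i) => ({ partIndex: i }));
const twapBatches = chunkRows(twapParts);
const maxParamsPerInsert = Math.max(...twapBatches.map((b) => b.length * COLUMNS_PER_ROW));
```

For this example the single 29,493-parameter statement becomes 7 statements, the largest binding 4,500 parameters.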
QA review (COW-1012): P0 internal-leak cleanup before public handover
final-qa → develop: grant review fixes, performance improvements, and infrastructure
socket-security (Bot) commented Jun 17, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Updated | ponder@0.16.3 ⏵ 0.16.6 | 91 (+3) | 100 | 100 (+1) | 88 (-7) | 100 |

jeffersonBastos and others added 6 commits June 17, 2026 10:04
Orderbook API resilience (429/5xx):
- Add fetchOrderbook(): bounded retry/backoff around fetchWithTimeout that
  honors Retry-After (capped) on 429 and exponential backoff on 5xx, failing
  fast within a wall-clock budget so it never holds a block-handler TX open.
- Add OrderbookUnavailableError + ob:unavailable / ob:retry log codes so a
  rate-limited / down API is distinguishable from "order not on API yet"
  (previously both surfaced as a silent missing UID).
- Wire into fetchAccountOrders and fetchOrdersByUids; caller control flow and
  return shapes are unchanged (promotion still safely defers, now observable).

Container hardening:
- Run as the non-root `node` user (uid 1000) with chown of the workdir and
  /pnpm store. Verified: image builds, runs as uid 1000, workdir writable.

Tests: 429-then-success, persistent 429 → ob:unavailable (bounded retries),
5xx retry, empty-200 stays "absent", HTTP-date Retry-After parsing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
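
The retry policy described above can be sketched roughly as follows. This is not the project's `fetchOrderbook`: `retryOrderbookCall`, `nextDelayMs`, the `RetryPolicy` fields, and the injected `attempt`/`sleep` callbacks are all hypothetical, chosen so the logic runs without a network. Only the `OrderbookUnavailableError` name is taken from the commit.

```typescript
class OrderbookUnavailableError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "OrderbookUnavailableError";
  }
}

interface RetryPolicy {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // first 5xx backoff step
  maxDelayMs: number; // cap for any single wait, including Retry-After
  budgetMs: number; // wall-clock budget across all attempts
}

interface AttemptResult<T> {
  status: number;
  retryAfterSeconds?: number; // parsed from the Retry-After header, if present
  body?: T;
}

// Wait before the next attempt, or null when the status is not retryable.
function nextDelayMs(
  status: number,
  attemptIndex: number,
  retryAfterSeconds: number | undefined,
  policy: RetryPolicy,
): number | null {
  if (status === 429) {
    const requested =
      retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : policy.baseDelayMs;
    return Math.min(requested, policy.maxDelayMs);
  }
  if (status >= 500 && status <= 599) {
    return Math.min(policy.baseDelayMs * 2 ** attemptIndex, policy.maxDelayMs);
  }
  return null;
}

async function retryOrderbookCall<T>(
  attempt: () => Promise<AttemptResult<T>>,
  policy: RetryPolicy,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<AttemptResult<T>> {
  const startedAt = Date.now();
  for (let i = 0; i < policy.maxAttempts; i++) {
    const result = await attempt();
    const delay = nextDelayMs(result.status, i, result.retryAfterSeconds, policy);
    if (delay === null) return result; // success or a non-retryable status
    const isLastAttempt = i === policy.maxAttempts - 1;
    const overBudget = Date.now() - startedAt + delay > policy.budgetMs;
    if (isLastAttempt || overBudget) break; // fail fast rather than keep waiting
    await sleep(delay);
  }
  throw new OrderbookUnavailableError("orderbook API rate-limited or down");
}
```

Throwing a dedicated error type is what lets a caller tell "API unavailable" apart from an empty 200 response, which still returns normally.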
Keep retry-then-succeed (Retry-After honored) and persistent-429 →
ob:unavailable (bounded). Drop the 5xx, empty-200, and HTTP-date cases:
5xx already flows through the pre-existing 500 tests, and empty-200 is
covered by the existing empty-array test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…silience-429backoff-run-container-as-non

COW-1013: orderbook 429/backoff resilience + non-root container
…nal naming (COW-1012 follow-up)

Follow-up to the COW-1012 public-handover QA pass. Removes generic AI agent
scaffolding that was shipping in the public tree, scrubs residual C1–C5
mechanism naming, and clears dead code + doc drift.

Removed (generic, not project-specific):
- .claude/commands/* — 12 generic commands (kept project-specific debug-ponder.md)
- .claude/agents/* — 6 generic codebase/thoughts/web-search agents
- hack/* — tracked boilerplate scripts
- opencode.json + .opencode symlink — OpenCode tool config

C1–C5 naming residue -> semantic handler names:
- blockHandler.ts: 6 withTimeout/log labels (c1/c2/c4/c5 -> OrderDiscoveryPoller/CandidateConfirmer/OwnerBackfill/CancellationWatcher)
- constants.ts: comment
- schema/tables.ts: index identifiers (generator_c1c5_poll_idx -> generator_poll_idx, discrete_order_c3_status_idx -> discrete_order_status_idx). Schema is recreated per deploy (DATABASE_SCHEMA per git-sha), so no migration is required.
- debug-ponder.md: removed stale kill-switch section referencing already-removed env flags; fixed label example

Dead code:
- removed unused pollerInterval() + its import in ponder.config.ts
- removed unused ChainConfig/SupportedChainId barrel re-exports in src/chains

Doc drift (AGENTS.md): "all supported order types" instead of stale list of 5; src/chains/ (not src/data.ts) as chain-config source; .env.example (not .env.local.example).

Verified: pnpm codegen, typecheck, lint, test (106 passing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lowup

chore(handover): remove generic AI scaffolding + scrub residual internal naming (COW-1012 follow-up)
Merging to main previously auto-triggered deploy.yml, which (on the push
path, with no skip_schema_drop input) drops the `programmatic_orders`
schema and re-indexes from scratch — taking the synced production
instance offline for hours.

Remove the `push: main` trigger so deploys are explicit-only via
workflow_dispatch. This lets us merge develop -> main for code review
without disturbing the live, fully-synced instance; the final version is
then deployed manually (with skip_schema_drop=true to preserve progress).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@jeffersonBastos jeffersonBastos merged commit 0057cf7 into main Jun 17, 2026
4 checks passed