final-qa → develop: grant review fixes, performance improvements, and infrastructure #102

Merged
jeffersonBastos merged 116 commits into develop from luizhatem/final-qa on Jun 17, 2026
@lgahdl lgahdl commented Jun 16, 2026

Summary

93 commits ahead of develop, covering all grant review findings (F1–F21), five performance improvements, three new order types, two data model fixes, a Ponder engine patch, and a full CI/deploy pipeline rewrite.


Grant Review Fixes

Correctness

  • F1 (COW-988): OrderStatusTracker now batch-upserts open discrete orders instead of N sequential await update() calls; added MAX_DISCRETE_ORDERS_PER_BLOCK per-block cap (default 200)
  • F11 (COW-989): stale TWAP candidates whose orderbook records aged out of /by_uids now fall back to /account/{owner}/orders before deletion — recovers fulfilled parts that would otherwise be permanently lost
  • F16 (COW-990): parent-cancelled cascade now runs a pre-flight /by_uids check before writing status=cancelled, catching the ~0.17% case where a solver filled the order between generator creation and cancellation
  • F8 (COW-991): wrapped all four RPC calls in settlement.ts (getTransactionReceipt, getCode, call, readContract) with withTimeout; moved settlement work inline into the event handler (eliminates block-handler queue)
  • F9 (COW-991): wired GPv2Settlement registration for gnosis — was mainnet-only despite gnosis having gpv2Settlement + flashLoanRouter addresses in src/data.ts
  • F12/F13 (COW-992): excluded stale (past-validTo) candidates from the /by_uids confirmation batch to reduce per-block API load; documented dedicated RPC as a hard production requirement
  • F15 (COW-993): added hash field to the generator object in GET /api/orders/by-owner REST response
  • F4 (COW-997): removed dead orderbook_cache table from setup.ts (live cache is order_uid_cache in cow_cache schema)
  • F6 (COW-998): replaced raw SQL string interpolation for UID cache reads with parameterized Drizzle eq() queries
  • F2 (COW-999): synced DETERMINISTIC_ORDER_TYPES to include CirclesBackingOrder, matching all types handled by uidPrecompute.ts
  • F7 (COW-1000): renamed all five block handlers to semantic names (OrderDiscoveryPoller, CandidateConfirmer, OrderStatusTracker, OwnerBackfill, CancellationWatcher)
  • F5 (COW-886): added three composite indexes on conditional_order_generator (C1/C5 poll pattern, C3 open-order pattern) and candidate_discrete_order (C2 stale-sweep pattern)
  • F19 (COW-994): structured JSON logging via cowLog helper; documented /healthz for liveness vs /ready for readiness with K8s manifest snippet
  • F21 (COW-1001/1002): removed hardcoded counts from docs/architecture.md; fixed stale "Five tables" → dynamic description (schema now has six tables)
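
The F8 item names a withTimeout wrapper but does not show it. A minimal sketch of such a wrapper follows, assuming a Promise.race-based design; the signature, the TimeoutError class, and the label parameter are illustrative assumptions, not the code in settlement.ts.

```typescript
// Hypothetical error type so callers can tell a timeout from an RPC failure.
class TimeoutError extends Error {
  constructor(label: string, ms: number) {
    super(`${label} timed out after ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Assumed signature: the PR only names withTimeout, it does not show it.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(label, ms)), ms);
  });
  // Whichever promise settles first wins. The timer is always cleared so a
  // finished call never leaves a pending timeout behind.
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Note that a race like this only stops waiting; it does not cancel the underlying RPC request, so a wrapper of this shape bounds handler latency rather than network work.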

Data Model

  • COW-1007: typed ComposableOrder.creationDate as bigint to match on-chain timestamp precision
  • COW-1008: derived isComplete from isRealtime && historicalBlocksFetchedPct === 100 instead of the unreliable ponder_sync_is_complete field
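
The COW-1008 rule can be written as a small pure function. This is a sketch only: the field names come from the description above, while the SyncStatus shape and the function name are assumptions.

```typescript
// Assumed shape; only the two field names appear in the PR description.
interface SyncStatus {
  isRealtime: boolean;
  historicalBlocksFetchedPct: number;
}

// Complete only when the indexer is in realtime mode AND every historical
// block has been fetched. Either condition alone is not enough.
function deriveIsComplete(status: SyncStatus): boolean {
  return status.isRealtime && status.historicalBlocksFetchedPct === 100;
}
```

Under this rule a chain can report 100% of historical blocks fetched and still be incomplete until it switches to realtime.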

New Order Types (COW-1006)

Classified Curve/Balancer fee burners and CoW AMM constant-product pools as named order types (CurveCowSwapBurner, BalancerCowSwapFeeBurner, CowAmmConstantProduct) instead of Unknown. These are non-deterministic single-shot types handled by C1.


Performance Improvements (COW-1017 umbrella)

  • COW-1014: uniform interval: 10 on gnosis (50 s window) and interval: 4 on mainnet (48 s window) — eliminates handler-stacking lag from all handlers firing on the same block every 5 s
  • COW-1015: capped fetchComposableOrders at 4 pages × 25 orders with signingScheme=eip1271 filter; added BOOTSTRAP_MAX_RETRY_COUNT=5 to abandon permanently timing-out owners
  • COW-1016: replaced N sequential DB inserts with single multi-row upserts using excluded.*; parallelized /by_uids HTTP chunks with Promise.all
  • COW-1018: OwnerBackfill now marks owners with 0 composable orders as lastPollResult='bootstrap:noop' and skips them on future restarts — eliminates 30–40 s block stalls
  • COW-1019: chunked candidateRows/discreteRows inserts at 500 rows per INSERT — fixes node-postgres silent bind-parameter drop above ~29k params (stalled re-index for >1 h at 29.3%)
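
The COW-1019 chunking can be illustrated with a generic helper. The sketch below assumes nothing about the real uidPrecompute.ts code beyond the 500-row chunk size stated above; chunk and bindParamCount are hypothetical names.

```typescript
// From the PR description: 500 rows per INSERT.
const INSERT_CHUNK_SIZE = 500;

// Split rows into consecutive slices of at most `size` elements.
function chunk<T>(rows: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}

// A multi-row INSERT binds one parameter per column per row.
function bindParamCount(rowCount: number, columnCount: number): number {
  return rowCount * columnCount;
}

// Worked example with the numbers from this PR: a 3277-part TWAP, 9 columns.
const parts = Array.from({ length: 3277 }, (_, i) => i);
const batches = chunk(parts, INSERT_CHUNK_SIZE);
console.log(bindParamCount(parts.length, 9)); // unchunked: 29493 parameters
console.log(batches.length); // 7 INSERT statements
console.log(bindParamCount(INSERT_CHUNK_SIZE, 9)); // at most 4500 per statement
```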

Verified in production (2026-06-16): gnosis lag ≤ 4 blocks, heavy event blocks 1.3–2.3 s.


Ponder Engine Patch (COW-1009)

Upgraded Ponder from 0.16.3 → 0.16.6 and applied two patches:

  • Unique savepoint names in flush(): prevents ROLLBACK TO SAVEPOINT crashes when multiple handlers flush concurrently in multichain mode
  • qb snapshot at flush start: prevents mid-flush external mutation of the query builder in multichain handlers

Also refactored settlement adapter discovery: inlined into the GPv2Settlement:Settlement event handler and removed the SettlementResolver block handler (eliminated a source of context.db.sql calls that triggered unwanted flush() invocations).


CI/Deploy Pipeline

  • Rewrote deploy.yml to run docker commands directly via SSH — server has no Node.js so manage.ts could not run remotely
  • Added monitor_only input: check server status without deploying (gh workflow run deploy.yml --ref <branch> -f monitor_only=true)
  • Added skip_schema_drop input: resume from existing checkpoint without re-indexing (PONDER_EXPERIMENTAL_DB=platform)
  • Added PONDER_EXPERIMENTAL_DB: platform permanently to docker-compose.yml
  • Increased ponder container nofile ulimit to 65,536

Test plan

  • pnpm typecheck passes
  • pnpm lint passes
  • All grant review findings validated in production — see thoughts/grant-review-fix-tracking.md
  • Deployed to production on 2026-06-16; indexer live at chain tip on both chains
  • Gnosis lag ≤ 4 blocks sustained over hours of live monitoring
  • OwnerBackfill noop: on the first restart after a fresh COW-1018 deploy, verify that OwnerBackfill:owner_noop appears; on the subsequent restart, verify that OwnerBackfill:no_bootstrap_needed appears instead

🤖 Generated with Claude Code

lgahdl and others added 30 commits June 1, 2026 15:47
… dedup (COW-1004)

- Lift magic literal 100000 to MAX_TWAP_PRECOMPUTE_PARTS in constants.ts; import
  it in uidPrecompute.ts so the threshold is documented and grep-able
- Export *_ABI constants from all decoder source files; update tests/decoders/decoders.test.ts
  to import them instead of duplicating inline ABI fragments
- Fix "mainnet and gnosis" hardcode in docs/architecture.md to say "all active chains"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace N sequential per-order update() calls with a single multi-row
upsert, matching the C2 pattern. Add MAX_DISCRETE_ORDERS_PER_BLOCK cap
(default 200, env-var override per chainId) to bound /by_uids batch size
and keep block handler transactions short.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…AT on-chain caveat)

- Replace "mainnet and Gnosis Chain" with "all active chains" in architecture.md overview
  and supported-order-types.md header (consistent with line 229 already fixed)
- Add TWAP edge-case note: n > MAX_TWAP_PRECOMPUTE_PARTS skips precompute → C1 fallback
- Add GoodAfterTime caveat: decoder is unit-tested but no real on-chain order observed yet

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds tests for DEFAULT_MAX_DISCRETE_ORDERS_PER_BLOCK (200) and all other
constants in src/constants.ts; also adds a pure-logic test for the
VALID_DISCRETE_STATUSES filter used by the C3 StatusUpdater batch upsert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… endpoint tests

Merged GeneratorSummary schema tests (COW-993) with ordersByOwnerHandler
endpoint integration tests (COW-995). Activated the COW-993 hash regression
test that was marked .todo in cow-995 (fix is now present in final-qa).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rding

COW-1004 already updated this line to "all active chains"; kept that over
the older "mainnet and gnosis" phrasing from COW-1001/1002.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…into final-qa

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…to final-qa

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ot-namespace conflict)

Ponder 0.16.x treats dots in block interval names as namespace separators,
causing ponder.on('composableCow.OrderDiscoveryPoller:block') to fail validation
because it tries to resolve 'composableCow' as a contract name.

Renamed all five intervals to their bare semantic names (no prefix):
  OrderDiscoveryPoller, CandidateConfirmer, OrderStatusTracker,
  OwnerBackfill, CancellationWatcher.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… 985, 978, 987)

Brings in chain modularization (src/chains/), multi-chain support stubs,
sync-progress endpoint, Docker publish workflow, K8s probe docs, and
operational README improvements.

Conflict resolutions:
- ponder.config.ts: use develop's dynamic ACTIVE_CHAINS with COW-1000 semantic names
- docs/architecture.md: keep semantic handler names, adopt new chain-adding guide
- docs/deployment.md: keep K8s probes + structured logging sections
- vitest.config.ts: keep ponder:api mock alias for endpoint tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Vite's file watcher exhausts the default Docker ulimit of 1024 open
files when the project includes many chain files and node_modules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… API

Makes the field name self-documenting — it clearly tracks historical
block-fetch progress (not handler completion), consistent with the
distinction between progressPct=100 and isComplete=false.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…decode removal

COW-991 removed the decodeAbiParameters block (orderUid, sellToken, buyToken,
sellAmount, buyAmount were decode-only-for-logging). COW-994's cowLog migration
kept those variables in the log fields — crashing the indexer at runtime.
Drop the undefined fields from the log; adapter + eoa are sufficient.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…cument per-block cap (COW-988)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…row coverage (COW-995)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… precompute (COW-996/999/1003)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…COW-993)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-1000)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndler (COW-991)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e DiscreteStatus type (COW-990)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… K8s probes (COW-994)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…celled-cascade-preflight-check' into luizhatem/final-qa

# Conflicts:
#	src/application/handlers/blockHandler.ts
…er tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…crash on malformed entries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lgahdl and others added 20 commits June 16, 2026 00:37
Three functions were doing N sequential await insert() calls — one DB
roundtrip per row. For a TWAP with 500 parts this meant 500 × ~28ms =
~14s per event handler, the root cause of 30-55s event blocks growing
the gnosis lag.

- uidPrecompute.ts: collect discreteOrder and candidateDiscreteOrder rows
  into two arrays, then one bulk insert per table with excluded.* in set
- orderbookClient.ts upsertDiscreteOrders: one multi-row upsert with
  excluded.* instead of N individual context.db.insert() calls
- orderbookClient.ts cacheUidStatuses: one multi-row upsert instead of
  N individual inserts with per-row try/catch

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
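
To make the statement shape concrete, here is a sketch that builds one multi-row upsert with excluded.* as plain parameterized SQL. It does not use the Ponder or Drizzle APIs the commit touches; the builder function, the table name, and the column names are all hypothetical, and identifiers are assumed to be trusted constants rather than user input.

```typescript
interface UpsertSpec {
  table: string;
  columns: string[]; // insert columns, in row order
  conflictColumns: string[]; // unique key that triggers the update
}

// Builds a single INSERT ... ON CONFLICT statement for all rows, so N rows
// cost one round trip instead of N.
function buildMultiRowUpsert(
  spec: UpsertSpec,
  rows: unknown[][],
): { text: string; values: unknown[] } {
  if (rows.length === 0) throw new Error("at least one row is required");
  const width = spec.columns.length;
  const values: unknown[] = [];
  const tuples = rows.map((row, r) => {
    if (row.length !== width) {
      throw new Error(`row ${r} has ${row.length} values, expected ${width}`);
    }
    values.push(...row);
    const placeholders = row.map((_, c) => `$${r * width + c + 1}`);
    return `(${placeholders.join(", ")})`;
  });
  // excluded.<col> refers to the value the conflicting row tried to insert.
  const updates = spec.columns
    .filter((c) => !spec.conflictColumns.includes(c))
    .map((c) => `${c} = excluded.${c}`);
  const action =
    updates.length > 0 ? `DO UPDATE SET ${updates.join(", ")}` : "DO NOTHING";
  const text =
    `INSERT INTO ${spec.table} (${spec.columns.join(", ")}) ` +
    `VALUES ${tuples.join(", ")} ` +
    `ON CONFLICT (${spec.conflictColumns.join(", ")}) ${action}`;
  return { text, values };
}

const example = buildMultiRowUpsert(
  { table: "discrete_order", columns: ["uid", "status"], conflictColumns: ["uid"] },
  [
    ["0xaa", "open"],
    ["0xbb", "fulfilled"],
  ],
);
console.log(example.text);
```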
Sequential batches of 50 UIDs meant 10 HTTP calls × ~1s = ~10s per
precomputed generator. For 5 ConditionalOrderCreated events in one block,
this stacked to 50s of HTTP wait time.

Fire all chunks with Promise.all so N chunks take the time of the
slowest one (~1s) instead of N × one (~10s for 500 UIDs).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
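
The pattern in the commit above can be sketched as follows. The fetcher is injected so the example runs without network access; the StatusFetcher type and the function name are assumptions, and only the chunk size of 50 comes from the commit message.

```typescript
// Hypothetical stand-in for one /by_uids request: takes a chunk of UIDs and
// resolves to a uid -> status map.
type StatusFetcher = (uids: string[]) => Promise<Map<string, string>>;

async function fetchStatusesInParallel(
  uids: string[],
  fetchChunk: StatusFetcher,
  chunkSize = 50,
): Promise<Map<string, string>> {
  const chunks: string[][] = [];
  for (let i = 0; i < uids.length; i += chunkSize) {
    chunks.push(uids.slice(i, i + chunkSize));
  }
  // All requests start immediately; the total wait is roughly that of the
  // slowest chunk rather than the sum of all chunks.
  const results = await Promise.all(chunks.map((c) => fetchChunk(c)));
  const merged = new Map<string, string>();
  for (const result of results) {
    result.forEach((status, uid) => merged.set(uid, status));
  }
  return merged;
}
```

One design consequence worth noting: Promise.all rejects as soon as any chunk rejects, so a single failed request fails the whole batch. Promise.allSettled is the usual alternative when partial results are acceptable.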
…th 0 orders

After fetching composable orders for an owner and finding none, mark all
generators for that owner with lastPollResult='bootstrap:noop'. The
OwnerBackfill query now excludes these generators, preventing 30-40s
sequential HTTP fetches from repeating on every indexer restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
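
The noop marking can be modelled in memory as below. This is a sketch under assumptions: the real code runs as database queries, the GeneratorRow shape is hypothetical, and any other conditions in the actual OwnerBackfill query are not shown here. Only the 'bootstrap:noop' value and the lastPollResult field name come from the commit message.

```typescript
// Assumed row shape; the real generator table has more columns.
interface GeneratorRow {
  owner: string;
  lastPollResult: string | null;
}

const BOOTSTRAP_NOOP = "bootstrap:noop";

// After a fetch finds no composable orders, mark every generator of that owner.
function markOwnerNoop(generators: GeneratorRow[], owner: string): GeneratorRow[] {
  return generators.map((g) =>
    g.owner === owner ? { ...g, lastPollResult: BOOTSTRAP_NOOP } : g,
  );
}

// Owners that still have at least one generator not marked as noop.
function ownersNeedingBackfill(generators: GeneratorRow[]): string[] {
  const owners = new Set<string>();
  for (const g of generators) {
    if (g.lastPollResult !== BOOTSTRAP_NOOP) owners.add(g.owner);
  }
  return Array.from(owners);
}
```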
Adds OwnerBackfill noop optimization: skip owners already bootstrapped
with 0 orders to eliminate 30-40s block stalls on indexer restart.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The deploy script receives ../.env as the env file path, expecting the
file one level above the repo root. The workflow was writing it to the
repo root (.env) instead, causing scp to fail with 'no such file'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Non-interactive SSH sessions don't source the user's profile, so npx
is not found when Node.js is installed via NVM. Running the command through
bash -lc sources the login profile and makes the node/npx binaries available.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
manage.ts requires Node.js on the remote host which is not installed.
Rewrite the deploy workflow to run docker compose commands directly via
SSH, loading env from the .env file already scp'd to the server.
deploy-remotely.ts kept for local deploys; only the CI workflow changed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…iner

'stop' leaves the container in stopped state; the subsequent 'up' then
fails with a name conflict. 'rm -sf' stops and removes it so 'up' can
create a fresh container with the new image.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old container may have been started with docker run rather than
docker compose, making 'docker compose rm' a no-op. Grep running and
stopped containers by name prefix and force-remove them before rebuild.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…verflow

Large TWAPs (e.g. 3277 parts × 9 cols = 29,493 params) cause
node-postgres to silently drop bind parameters, generating:
  "bind message has 29493 parameter formats but 0 parameters"
This stalls the indexer indefinitely on the affected block.

Fix: chunk candidateRows and discreteRows at 500 rows per INSERT so
each query stays well under the practical ~29k parameter limit.

Also:
- Add PONDER_EXPERIMENTAL_DB=platform to docker-compose.yml so future
  deploys can resume from existing checkpoints without a schema drop
- Add skip_schema_drop input to deploy workflow for hot-fix deploys

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
socket-security Bot commented Jun 16, 2026


Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Updated | ponder@0.16.3 ⏵ 0.16.6 | 91 +3 | 100 | 100 +1 | 88 -7 | 100 |

View full report

QA review (COW-1012): P0 internal-leak cleanup before public handover
@jeffersonBastos jeffersonBastos merged commit c6e5de5 into develop Jun 17, 2026
4 checks passed