| 1 | +--- |
| 2 | +name: errors-api-e2e |
| 3 | +description: End-to-end smoke test for the public Errors HTTP API (error groups). Seeds failed runs into ClickHouse so the error materialized views populate, then drives the real endpoints against the running webapp — list (with filters + pagination), retrieve, resolve/ignore/unresolve, the `filter[error]` runs filter, user attribution via the `trigger.dev mint-token` -> JWT exchange, and the 401/403/404 negatives. Use for "smoke test the errors API", "test the errors API e2e", "prove the errors endpoints work", or to re-verify after changes. |
| 4 | +allowed-tools: Read, Bash |
| 5 | +--- |
| 6 | + |
| 7 | +# Errors API — end-to-end smoke test |
| 8 | + |
| 9 | +Proves the public Errors API against the **running** webapp with real HTTP. No |
| 10 | +mocks. The error data plane is ClickHouse (`errors_v1` + `error_occurrences_v1`, |
| 11 | +both materialized-view-fed from `task_runs_v2`) plus Postgres `ErrorGroupState` |
| 12 | +for lifecycle status; this skill seeds straight into `task_runs_v2` and lets the |
| 13 | +MVs do the rest. |
| 14 | + |
| 15 | +Code under test: |
| 16 | +- `apps/webapp/app/routes/api.v1.errors.ts` — `GET /api/v1/errors` (list). |
| 17 | +- `apps/webapp/app/routes/api.v1.errors.$errorId.ts` — `GET /api/v1/errors/:errorId` (detail). |
| 18 | +- `apps/webapp/app/routes/api.v1.errors.$errorId.{resolve,ignore,unresolve}.ts` — state actions. |
| 19 | +- `apps/webapp/app/presenters/v3/ApiErrorListPresenter.server.ts` / `ApiErrorGroupPresenter.server.ts`. |
| 20 | +- `apps/webapp/app/presenters/v3/ApiRunListPresenter.server.ts` — the `filter[error]` addition on `GET /api/v1/runs`. |
| 21 | +- `apps/webapp/app/v3/services/errorGroupActions.server.ts` — resolve/ignore/unresolve (nullable `userId`). |
| 22 | +- Attribution: `api.v1.projects.$projectRef.$env.jwt.ts` stamps `act:{sub}` for PAT **and** UAT exchanges; `@trigger.dev/rbac` surfaces `act.sub` through bearer auth; the action handlers read `authentication.actor?.sub`. |
| 23 | + |
| 24 | +`errorId` is `error_<fingerprint>` (round-trips via `ErrorId` in `@trigger.dev/core/v3/isomorphic`). |
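
The fingerprint-to-`errorId` mapping can be sketched in shell for ad-hoc use in the steps below. The helper names `to_error_id` and `from_error_id` are hypothetical, and this is not the real `ErrorId` implementation from `@trigger.dev/core/v3/isomorphic`; it only assumes the `error_` prefix described above.

```bash
# Hypothetical helpers; assumes errorId is exactly "error_" + fingerprint.
to_error_id() { printf 'error_%s\n' "$1"; }

from_error_id() {
  case "$1" in
    error_*) printf '%s\n' "${1#error_}" ;;          # strip the prefix
    *) echo "not an errorId: $1" >&2; return 1 ;;    # reject anything else
  esac
}
```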
| 25 | + |
| 26 | +## Prerequisites |
| 27 | + |
| 28 | +- Webapp running on http://localhost:3030 (`pnpm run dev --filter webapp`). Confirm `curl -s http://localhost:3030/healthcheck`. |
| 29 | +- DB seeded (`pnpm run db:seed`), and a local ClickHouse reachable at `CLICKHOUSE_URL` (the `pnpm run docker` stack). |
| 30 | +- The CLI built + logged in to localhost:3030 (`pnpm run build --filter trigger.dev`; profile `default` points at localhost:3030). Needed only for the attribution leg. |
| 31 | + |
| 32 | +> Important wiring facts the seed relies on (verified): |
| 33 | +> - The MVs read the error type/message from `error.data.*`, so the seeded |
| 34 | +> `error` JSON column **must** be wrapped: `{"data": {"type": ..., "message": ..., "stack": ...}}`. |
| 35 | +> - The MVs only fire for failed statuses: `SYSTEM_FAILURE | CRASHED | INTERRUPTED | COMPLETED_WITH_ERRORS | TIMED_OUT`, and require a non-empty `error_fingerprint`. |
| 36 | +> - `GET /api/v1/runs` lists run **ids** from ClickHouse but **hydrates from Postgres** `TaskRun`. So the error-list/detail/action legs work from a ClickHouse-only seed, but the `filter[error]` leg needs a **paired** Postgres `TaskRun` row whose `id` equals the ClickHouse `run_id`. |
| 37 | +
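The status and fingerprint conditions above can be turned into a pre-flight check on a seed row before inserting it. This is only a sketch: `mv_would_fire` is a hypothetical helper name, and the status list is copied from the note above rather than read from the MV definitions.

```bash
# Returns 0 when a row with this status and fingerprint should reach the
# error MVs, per the note above (assumption: that list is complete).
mv_would_fire() { # status fingerprint
  [ -n "$2" ] || return 1   # an empty error_fingerprint never populates the MVs
  case "$1" in
    SYSTEM_FAILURE|CRASHED|INTERRUPTED|COMPLETED_WITH_ERRORS|TIMED_OUT) return 0 ;;
    *) return 1 ;;
  esac
}
```
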
| 38 | +Run everything from the repo root in one shell. Invoke the built CLI via a |
| 39 | +function (a `CLI="node …"` variable won't word-split under zsh): |
| 40 | +```bash |
| 41 | +cli() { node packages/cli-v3/dist/esm/index.js "$@"; } |
| 42 | +PROFILE=default |
| 43 | +``` |
| 44 | + |
| 45 | +## Setup — resolve a dev environment + connection strings |
| 46 | + |
| 47 | +```bash |
| 48 | +cd apps/webapp |
| 49 | +CHURL=$(grep -E "^CLICKHOUSE_URL=" .env | head -1 | cut -d= -f2- | tr -d '"') |
| 50 | +DBURL=$(grep -E "^DATABASE_URL=" .env | head -1 | cut -d= -f2- | tr -d '"' | tr -d "'" | sed 's/?.*//') |
| 51 | + |
| 52 | +# Pick the seeded hello-world dev env (proj_rrkpdguyagvsoktglnod). Adjust the |
| 53 | +# WHERE if you want a different project. |
| 54 | +read ENV ORG PROJ REF < <(psql "$DBURL" -t -A -F' ' -c " |
| 55 | + SELECT re.id, re.\"organizationId\", re.\"projectId\", p.\"externalRef\" |
| 56 | + FROM \"RuntimeEnvironment\" re |
| 57 | + JOIN \"Project\" p ON p.id = re.\"projectId\" |
| 58 | + WHERE re.slug='dev' AND p.\"externalRef\"='proj_rrkpdguyagvsoktglnod' LIMIT 1;") |
| 59 | +APIKEY=$(psql "$DBURL" -t -A -c "SELECT \"apiKey\" FROM \"RuntimeEnvironment\" WHERE id='$ENV';") |
| 60 | +cd ../..  # back to the repo root (apps/webapp is two levels deep)
| 61 | +H="Authorization: Bearer $APIKEY" |
| 62 | +B="http://localhost:3030" |
| 63 | +``` |
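
If the `psql` lookup matches no row, `$ENV` and `$APIKEY` end up empty and the later legs fail in confusing ways (for example, a 401 on every call). A minimal guard, assuming the variable names from the setup block, could look like this; `require_vars` is a hypothetical helper, written with `eval` so it behaves the same in bash and zsh.

```bash
# Hypothetical guard: fail fast when any named variable is unset or empty.
require_vars() {
  for _name in "$@"; do
    eval "_val=\${$_name-}"   # indirect lookup that works in both bash and zsh
    if [ -z "$_val" ]; then
      echo "setup incomplete: \$$_name is empty" >&2
      return 1
    fi
  done
}

# Usage after the setup block:
# require_vars CHURL DBURL ENV ORG PROJ REF APIKEY || echo "fix setup before continuing"
```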
| 64 | + |
| 65 | +## Steps |
| 66 | + |
| 67 | +### 1. Seed two error groups (ClickHouse, MV-fed) |
| 68 | + |
| 69 | +```bash |
| 70 | +RUN=$(node -e 'console.log(Date.now().toString(36))') |
| 71 | +TASK="errors-api-e2e-$RUN"; FP_A="fpA${RUN}"; FP_B="fpB${RUN}" |
| 72 | +ERRID_A="error_$FP_A"; ERRID_B="error_$FP_B" |
| 73 | +NOW_CH=$(node -e 'console.log(new Date().toISOString().replace("T"," ").replace("Z","").slice(0,23))') |
| 74 | +NOW_MS=$(node -e 'console.log(Date.now())') |
| 75 | +Q=$(python3 -c "import urllib.parse;print(urllib.parse.quote('INSERT INTO trigger_dev.task_runs_v2 FORMAT JSONEachRow'))") |
| 76 | + |
| 77 | +mkrow() { # status fingerprint errorType message runId |
| 78 | + echo "{\"environment_id\":\"$ENV\",\"organization_id\":\"$ORG\",\"project_id\":\"$PROJ\",\"run_id\":\"$5\",\"friendly_id\":\"run_$5\",\"status\":\"$1\",\"environment_type\":\"DEVELOPMENT\",\"engine\":\"V2\",\"task_identifier\":\"$TASK\",\"created_at\":\"$NOW_CH\",\"updated_at\":\"$NOW_CH\",\"error\":{\"data\":{\"type\":\"$3\",\"message\":\"$4\",\"stack\":\"at x (a.ts:1:1)\"}},\"error_fingerprint\":\"$2\",\"task_version\":\"20240101.1\",\"_version\":\"$NOW_MS\",\"_is_deleted\":0}" |
| 79 | +} |
| 80 | +ROWS="$(mkrow COMPLETED_WITH_ERRORS $FP_A AlphaBoom 'alpha boom happened' r_a1_$RUN) |
| 81 | +$(mkrow COMPLETED_WITH_ERRORS $FP_A AlphaBoom 'alpha boom happened' r_a2_$RUN) |
| 82 | +$(mkrow CRASHED $FP_B BetaCrash 'beta crash happened' r_b1_$RUN)" |
| 83 | +printf '%s' "$ROWS" | curl -s "$CHURL/?query=$Q" --data-binary @- |
| 84 | + |
| 85 | +# Poll until both fingerprints appear in errors_v1 (the MV is near-instant locally). |
| 86 | +for i in $(seq 1 10); do |
| 87 | + N=$(curl -s "$CHURL" --data-binary "SELECT count() FROM (SELECT 1 FROM trigger_dev.errors_v1 WHERE environment_id='$ENV' AND error_fingerprint IN ('$FP_A','$FP_B') GROUP BY error_fingerprint)") |
| 88 | + [ "$N" = "2" ] && break; sleep 1 |
| 89 | +done |
| 90 | +echo "seeded fingerprints in errors_v1: $N (want 2)" |
| 91 | +``` |
| 92 | +PASS: `N = 2`. Alpha has 2 occurrences, beta 1. |
| 93 | + |
| 94 | +### 2. List + filters + pagination |
| 95 | + |
| 96 | +```bash |
| 97 | +curl -s "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK&filter%5Bperiod%5D=1d" -H "$H" \ |
| 98 | + | python3 -c "import sys,json;d=json.load(sys.stdin);print('count',len(d['data']),[(e['id'],e['status'],e['count']) for e in d['data']])" |
| 99 | +``` |
| 100 | +PASS: 2 groups, both `status=unresolved`, alpha `count=2`, beta `count=1`, ids `error_<fp>`. |
| 101 | + |
| 102 | +Assert each filter narrows correctly (each should return the noted shape): |
| 103 | +```bash |
| 104 | +curl -s "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK&filter%5Bstatus%5D=unresolved&filter%5Bperiod%5D=1d" -H "$H" | python3 -c "import sys,json;print('unresolved:',len(json.load(sys.stdin)['data']))" # 2 |
| 105 | +curl -s "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK&filter%5Bsearch%5D=AlphaBoom&filter%5Bperiod%5D=1d" -H "$H" | python3 -c "import sys,json;print('search:',[e['errorType'] for e in json.load(sys.stdin)['data']])" # ['AlphaBoom'] |
| 106 | +curl -s "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK&filter%5Bperiod%5D=1d&page%5Bsize%5D=1" -H "$H" | python3 -c "import sys,json;d=json.load(sys.stdin);print('page size 1:',len(d['data']),'next?',bool(d['pagination'].get('next')))" # 1 / True |
| 107 | +``` |
| 108 | +PASS: `unresolved: 2`, `search: ['AlphaBoom']`, `page size 1: 1 / next? True`. |
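
The `%5B` and `%5D` sequences in these URLs are the percent-encoded `[` and `]` of `filter[...]` and `page[...]`. For ad-hoc queries, a small helper can do that encoding. `enc_brackets` is a hypothetical name, and it encodes brackets only, so values containing other reserved characters still need proper URL encoding.

```bash
# Hypothetical helper: percent-encode only the square brackets in a query string.
enc_brackets() { printf '%s\n' "$1" | sed -e 's/\[/%5B/g' -e 's/\]/%5D/g'; }

# Usage:
# curl -s "$B/api/v1/errors?$(enc_brackets "filter[taskIdentifier]=$TASK&filter[period]=1d")" -H "$H"
```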
| 109 | + |
| 110 | +### 3. Retrieve detail |
| 111 | + |
| 112 | +```bash |
| 113 | +curl -s "$B/api/v1/errors/$ERRID_A" -H "$H" \ |
| 114 | + | python3 -c "import sys,json;d=json.load(sys.stdin);print(d['id'],d['errorType'],d['status'],d['count'],d['affectedVersions'],d['resolvedBy'])" |
| 115 | +``` |
| 116 | +PASS: `error_<fpA> AlphaBoom unresolved 2 ['20240101.1'] None`. |
| 117 | + |
| 118 | +### 4. Resolve / ignore / unresolve (env API key — `resolvedBy` null) |
| 119 | + |
| 120 | +```bash |
| 121 | +st(){ python3 -c "import sys,json;d=json.load(sys.stdin);print('status',d['status'],'| resolvedInVersion',d['resolvedInVersion'],'| resolvedBy',d['resolvedBy'],'| ignoredUntil',bool(d['ignoredUntil']),'| reason',d['ignoredReason'])"; } |
| 122 | + |
| 123 | +curl -s -X POST "$B/api/v1/errors/$ERRID_A/resolve" -H "$H" -H 'Content-Type: application/json' -d '{"resolvedInVersion":"20240101.1"}' >/dev/null |
| 124 | +curl -s "$B/api/v1/errors/$ERRID_A" -H "$H" | st # status resolved | resolvedInVersion 20240101.1 | resolvedBy None |
| 125 | + |
| 126 | +curl -s -X POST "$B/api/v1/errors/$ERRID_B/ignore" -H "$H" -H 'Content-Type: application/json' -d '{"duration":3600000,"reason":"known flake"}' >/dev/null |
| 127 | +curl -s "$B/api/v1/errors/$ERRID_B" -H "$H" | st # status ignored | ignoredUntil True | reason known flake |
| 128 | + |
| 129 | +curl -s -X POST "$B/api/v1/errors/$ERRID_A/unresolve" -H "$H" >/dev/null |
| 130 | +curl -s "$B/api/v1/errors/$ERRID_A" -H "$H" | st # status unresolved |
| 131 | +``` |
| 132 | +PASS: each transition reflected; `filter[status]=ignored` returns only beta: |
| 133 | +```bash |
| 134 | +curl -s "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK&filter%5Bstatus%5D=ignored&filter%5Bperiod%5D=1d" -H "$H" | python3 -c "import sys,json;print([e['id'] for e in json.load(sys.stdin)['data']])" # [error_<fpB>] |
| 135 | +``` |
| 136 | + |
| 137 | +### 5. `filter[error]` on the runs list (paired PG + CH seed) |
| 138 | + |
| 139 | +The runs list hydrates from Postgres, so seed a matching `TaskRun` row + a CH row |
| 140 | +that share `run_id`/`id` and carry a fingerprint: |
| 141 | +```bash |
| 142 | +RID="re2e${RUN}"; FRID="run_${RID}"; FP_R="fpR${RUN}" |
| 143 | +psql "$DBURL" -v ON_ERROR_STOP=1 -c " |
| 144 | + INSERT INTO \"TaskRun\" (id, \"friendlyId\", \"taskIdentifier\", payload, \"traceId\", \"spanId\", \"runtimeEnvironmentId\", \"projectId\", queue, status, \"createdAt\", \"updatedAt\") |
| 145 | + VALUES ('$RID','$FRID','$TASK','{}','trace_$RID','span_$RID','$ENV','$PROJ','task/$TASK','COMPLETED_WITH_ERRORS', now(), now()) |
| 146 | + ON CONFLICT (id) DO NOTHING;" >/dev/null |
| 147 | +ROW="{\"environment_id\":\"$ENV\",\"organization_id\":\"$ORG\",\"project_id\":\"$PROJ\",\"run_id\":\"$RID\",\"friendly_id\":\"$FRID\",\"status\":\"COMPLETED_WITH_ERRORS\",\"environment_type\":\"DEVELOPMENT\",\"engine\":\"V2\",\"task_identifier\":\"$TASK\",\"created_at\":\"$NOW_CH\",\"updated_at\":\"$NOW_CH\",\"error\":{\"data\":{\"type\":\"RunsFilterErr\",\"message\":\"for runs filter\",\"stack\":\"at x\"}},\"error_fingerprint\":\"$FP_R\",\"task_version\":\"20240101.1\",\"_version\":\"$NOW_MS\",\"_is_deleted\":0}" |
| 148 | +printf '%s' "$ROW" | curl -s "$CHURL/?query=$Q" --data-binary @- |
| 149 | +sleep 1 |
| 150 | +curl -s "$B/api/v1/runs?filter%5Berror%5D=error_$FP_R" -H "$H" | python3 -c "import sys,json;d=json.load(sys.stdin);print('runs:',[r['id'] for r in d['data']])" |
| 151 | +``` |
| 152 | +PASS: one run, `run_<RID>` (status maps to `FAILED`). Proves `filter[error]` -> fingerprint -> CH -> PG hydration. |
| 153 | + |
| 154 | +### 6. Attribution — `mint-token` -> JWT exchange records the acting user |
| 155 | + |
| 156 | +```bash |
| 157 | +TOKEN=$(cli mint-token --profile $PROFILE --client errors-api-e2e 2>/dev/null) # UAT |
| 158 | +ENVJWT=$(curl -sS -X POST "$B/api/v1/projects/$REF/dev/jwt" -H "Authorization: Bearer $TOKEN" \ |
| 159 | + -H 'Content-Type: application/json' -d '{"claims":{"scopes":["read:errors","write:errors"]}}' \ |
| 160 | + | python3 -c "import sys,json;print(json.load(sys.stdin)['token'])") |
| 161 | +# Decoded env JWT carries act.sub = the user id. |
| 162 | +node -e 'const p=JSON.parse(Buffer.from(process.argv[1].split(".")[1],"base64url").toString());console.log("act:",JSON.stringify(p.act))' "$ENVJWT" |
| 163 | + |
| 164 | +curl -s -X POST "$B/api/v1/errors/$ERRID_A/resolve" -H "Authorization: Bearer $ENVJWT" \ |
| 165 | + -H 'Content-Type: application/json' -d '{"resolvedInVersion":"20240101.2"}' >/dev/null |
| 166 | +curl -s "$B/api/v1/errors/$ERRID_A" -H "$H" | python3 -c "import sys,json;d=json.load(sys.stdin);print('resolvedBy:',d['resolvedBy'])" |
| 167 | +``` |
| 168 | +PASS: `act.sub` is the user id (matches `cli whoami`), and `detail.resolvedBy` equals that user id (not null). A plain env key leaves it null (step 4). A **PAT** exchanged the same way also stamps `act` — repeat with the stored PAT to confirm `ignoredByUserId` attribution. |
| 169 | + |
| 170 | +### 7. Negatives |
| 171 | + |
| 172 | +```bash |
| 173 | +curl -s -o /dev/null -w 'unknown id: %{http_code} (404)\n' "$B/api/v1/errors/error_doesnotexist0000" -H "$H" |
| 174 | +curl -s -o /dev/null -w 'no auth list: %{http_code} (401)\n' "$B/api/v1/errors" |
| 175 | +curl -s -o /dev/null -w 'no auth resolve: %{http_code} (401)\n' -X POST "$B/api/v1/errors/$ERRID_B/resolve" -H 'Content-Type: application/json' -d '{}' |
| 176 | + |
| 177 | +# read-only JWT must be denied on write, allowed on read |
| 178 | +READJWT=$(curl -sS -X POST "$B/api/v1/projects/$REF/dev/jwt" -H "Authorization: Bearer $TOKEN" \ |
| 179 | + -H 'Content-Type: application/json' -d '{"claims":{"scopes":["read:errors"]}}' | python3 -c "import sys,json;print(json.load(sys.stdin)['token'])") |
| 180 | +curl -s -o /dev/null -w 'read JWT write: %{http_code} (403)\n' -X POST "$B/api/v1/errors/$ERRID_B/resolve" -H "Authorization: Bearer $READJWT" -H 'Content-Type: application/json' -d '{}' |
| 181 | +curl -s -o /dev/null -w 'read JWT read: %{http_code} (200)\n' "$B/api/v1/errors?filter%5BtaskIdentifier%5D=$TASK" -H "Authorization: Bearer $READJWT" |
| 182 | +``` |
| 183 | +PASS: `404`, `401`, `401`, `403`, `200` respectively. |
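
The commands above print the observed code next to the expected one and leave the comparison to the reader. To make a mismatch fail loudly, the comparison could be wrapped as sketched here; `check_code` is a hypothetical helper, not part of the skill.

```bash
# Hypothetical helper: compare an observed HTTP status against the expected one.
check_code() { # label got want
  if [ "$2" = "$3" ]; then
    echo "PASS $1: $2"
  else
    echo "FAIL $1: got $2, want $3" >&2
    return 1
  fi
}

# Usage:
# code=$(curl -s -o /dev/null -w '%{http_code}' "$B/api/v1/errors")
# check_code 'no auth list' "$code" 401
```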
| 184 | + |
| 185 | +## Result |
| 186 | + |
| 187 | +Report PASS only if: step 1 lands 2 groups in `errors_v1`; step 2's filters and |
| 188 | +pagination narrow correctly; step 3 returns the detail; step 4's resolve/ignore/ |
| 189 | +unresolve flip status (and `filter[status]` follows); step 5's `filter[error]` |
| 190 | +returns the paired run; step 6 records `resolvedBy` = the acting user via the |
| 191 | +JWT exchange (null with a plain env key); and step 7 returns 404/401/401/403/200. |
| 192 | +A red leg is a bug or a missing prereq: report the exact status + body and file
| 193 | +a Linear issue; don't tune around it.
| 194 | + |
| 195 | +## Notes / gotchas |
| 196 | + |
| 197 | +- Every seeded run id, fingerprint, and task identifier carries a unique `$RUN` suffix per invocation, so reruns don't collide and each invocation's rows stay isolated under their own task identifier. They are local-dev test rows (90-day ClickHouse TTL); no cleanup required.
| 198 | +- After **adding** the route files, the classic Remix dev compiler may not register them until a dev-server restart (a stale manifest returns Remix's HTML 404 on the new paths). If `POST …/resolve` returns a 404 HTML page rather than 401/200, restart `pnpm run dev --filter webapp`. |
| 199 | +- The rbac `act` extraction lives in `@trigger.dev/rbac` (a built dep). After editing it, `pnpm run build --filter @trigger.dev/rbac` and restart the webapp so the attribution leg (step 6) reflects the change. |