diff --git a/.claude/skills/triage/SKILL.md b/.claude/skills/triage/SKILL.md new file mode 100644 index 000000000..567bb9841 --- /dev/null +++ b/.claude/skills/triage/SKILL.md @@ -0,0 +1,154 @@ +--- +name: triage +description: > + Fetch the top open Bugsnag production issue, investigate with evidence from + stack traces / logs / breadcrumbs, propose a fix direction, route through + domain experts, and write a lean review brief. +allowed-tools: + - Bash + - Read + - Glob + - Grep + - Write + - Edit + - Agent + - WebFetch + - Skill +--- + +# Triage: daily Bugsnag issue investigation + +You are a senior Android engineer performing daily Bugsnag triage for the +Flipcash Android app. Follow the steps below exactly. + +## Step 1 — Fetch the top open issue + +Run the helper script: + +```bash +bash .claude/skills/triage/scripts/bugsnag-top.sh +``` + +The script emits a single JSON object with: + +| Field | Description | +|-------|-------------| +| `error_id` | Bugsnag error group ID | +| `error_url` | Deep link into the Bugsnag dashboard | +| `event_url` | REST API URL for the latest event | +| `title` | Error class + message | +| `severity` | `error` / `warning` / `info` | +| `events` | Total occurrences | +| `users` | Unique affected users | +| `first_seen` | ISO-8601 timestamp | +| `release` | `app_version` from the latest event | + +If the script exits non-zero, stop and report the error to the user. + +## Step 2 — Pull event detail + +Fetch `event_url` (include header `Authorization: token $BUGSNAG_TOKEN`). +Source `.env` from the repo root if the variable is not already set. + +Parse the response using the event shape documented in +`.claude/skills/triage/references/event-shape.md`. + +Extract four evidence sources: + +1. **Stack trace** — `exceptions[0].stacktrace` (frames with `file`, `method`, + `lineNumber`, `inProject`) +2. **Exception info** — `exceptions[0].errorClass` + `exceptions[0].message` +3. 
**App logs** — `metaData["App Logs"]["app_log"]` (single string, last ~64 KB + of log output) +4. **Breadcrumbs** — `breadcrumbs[]` (timestamped UI / state / network events) + +## Step 3 — Map stack frames to source + +Android stack frames use Java/Kotlin package-qualified class names (e.g. +`com.flipcash.features.cash.CashViewModel`). To locate the source file: + +1. Convert the class name to a path fragment: replace `.` with `/` and append + `.kt` (try `.java` as a fallback). +2. Use `Glob` to find the file in the repo (e.g. `**/CashViewModel.kt`). +3. Read the relevant lines (`lineNumber` from the frame +/- 30 lines of + context). + +Only map frames where `inProject` is `true`. + +## Step 4 — Build the evidence timeline + +### 4a. Version check + +Read `.well-known/release-manifest.json` to get the current production and +internal release versions: + +```json +{ + "tracks": { + "production": { "versionCode": 3508, "versionName": "2026.5.2" }, + "internal": { "versionCode": 3508, "versionName": "2026.5.2" } + } +} +``` + +Compare the event's `app.version` against `tracks.production.versionName` to +determine if the crash still affects the latest release. + +### 4b. Assemble evidence + +Collect: + +- The in-project stack frames mapped to source (file:line + code snippet) +- The exception class and message +- Relevant log lines (grep the `app_log` string for keywords from the exception) +- The last 10-20 breadcrumbs before the crash +- `app.version`, `device.manufacturer`, `device.model`, `device.osVersion` + +## Step 5 — Investigate root cause + +Using the evidence from Step 4: + +1. Read the source files identified in the stack trace. +2. Follow the call chain — read callers and callees within 2 hops. +3. Check for known patterns: null-safety violations, lifecycle issues, + threading bugs, uncaught coroutine exceptions, missing error handling. +4. Form a hypothesis and verify it against the logs and breadcrumbs. 
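The class-to-path conversion from Step 3 and the version comparison from Step 4a can be sketched as two small bash helpers. This is an illustration only: `class_to_path` and `version_status` are hypothetical names that do not exist in the skill, the inner-class handling is an assumption about how Kotlin frames look, and the comparison assumes version names are dot-separated integers such as `2026.5.2`.

```shell
#!/usr/bin/env bash
# Sketch only: both helpers are hypothetical and not part of the skill.

# Step 3: convert a package-qualified class name into a source path fragment.
class_to_path() {
  local class_name="$1" ext="${2:-kt}"
  # Assumption: drop inner-class or lambda suffixes such as Outer$Inner.
  class_name="${class_name%%\$*}"
  # Replace every '.' with '/' and append the extension.
  printf '%s.%s\n' "${class_name//.//}" "$ext"
}

# Step 4a: compare the event's app.version with the production versionName.
# Assumption: versions are dot-separated integers such as 2026.5.2.
# Prints "older", "current", or "newer" relative to production.
version_status() {
  local -a ev pv
  IFS=. read -r -a ev <<< "$1"
  IFS=. read -r -a pv <<< "$2"
  local i a b n=$(( ${#ev[@]} > ${#pv[@]} ? ${#ev[@]} : ${#pv[@]} ))
  for (( i = 0; i < n; i++ )); do
    # Missing components count as 0; 10# avoids octal parsing of "08".
    a="${ev[i]:-0}"; b="${pv[i]:-0}"
    if (( 10#$a < 10#$b )); then echo older; return; fi
    if (( 10#$a > 10#$b )); then echo newer; return; fi
  done
  echo current
}

class_to_path "com.flipcash.features.cash.CashViewModel"
version_status "2026.5.1" "2026.5.2"
```

The helper expects a class name, so a frame's `method` value would first need its trailing method segment removed. Passing `java` as the second argument covers the `.java` fallback from Step 3.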
+ +## Step 6 — Propose a fix direction + +Write a concrete fix direction (NOT a full implementation): + +- Which file(s) to change and roughly where (`.kt:NN`) +- What the fix involves (e.g. "add null check before accessing X", + "move coroutine launch to lifecycleScope", "catch Y in Z") +- Why this addresses the root cause +- Any risks or side effects + +## Step 7 — Route through domain experts + +Based on the evidence, tag relevant experts by adding their labels to the brief. +An issue may match multiple experts. + +| Expert | Trigger | +|--------|---------| +| `compose` | Touches files with `@Composable`, `Modifier`, `remember`, `LaunchedEffect`, or under `ui/`, `features/*/ui/` | +| `kotlin-coroutines` | Stack contains `CoroutineScope`, `Dispatchers`, `suspend`, `launch`, `async`, `withContext`, `Job`, `SupervisorJob` | +| `android-tdd` | Proposed fix direction adds or modifies test files | +| `kotlin-flows` | Stack or fix involves `Flow`, `StateFlow`, `SharedFlow`, `collect`, `stateIn` | + +## Step 8 — Write the review brief + +Use the template in `.claude/skills/triage/references/brief-template.md`. + +Save the brief to `.claude/plans/triage-.md`. + +Keep the brief under 300 words (excluding code snippets and the evidence +appendix). + +## Step 9 — Next steps + +If the fix direction is clear and well-scoped, offer to draft an implementation +plan using `superpowers:writing-plans`. + +If the root cause is ambiguous, suggest specific debugging steps (add logging, +reproduce locally, check related Bugsnag issues). 
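Step 8 caps the brief at 300 words, excluding code snippets and the evidence appendix. One way to check that limit is sketched below. `brief_word_count` is a hypothetical helper, not part of the skill, and the count is approximate because markdown markers and table pipes are counted as words.

```shell
#!/usr/bin/env bash
# Sketch only: brief_word_count is a hypothetical helper, not part of the skill.
# Reads a brief on stdin and prints the number of whitespace-separated tokens,
# skipping fenced code blocks and everything from "## Appendix" onward.
brief_word_count() {
  # Build the three-backtick fence marker from its octal code (140).
  local fence
  fence="$(printf '\140\140\140')"
  awk -v fence="$fence" '
    /^## Appendix/          { exit }                       # stop at the appendix
    index($0, fence) == 1   { in_fence = !in_fence; next } # toggle on fence lines
    !in_fence               { words += NF }                # NF = tokens on the line
    END                     { print words + 0 }
  '
}
```

A brief saved as, say, `brief.md` (a placeholder file name) could then be checked with `brief_word_count < brief.md` and the result compared against 300.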
diff --git a/.claude/skills/triage/references/brief-template.md b/.claude/skills/triage/references/brief-template.md new file mode 100644 index 000000000..4ea6caaa3 --- /dev/null +++ b/.claude/skills/triage/references/brief-template.md @@ -0,0 +1,34 @@ +# Triage Brief: {{title}} + +| Field | Value | +|-------|-------| +| **Bugsnag** | [{{error_id}}]({{error_url}}) | +| **Severity** | {{severity}} | +| **Events / Users** | {{events}} / {{users}} | +| **First seen** | {{first_seen}} | +| **Release** | {{release}} | +| **Production versionName** | {{production_version}} | +| **Experts** | {{experts}} | + +## Root Cause + +{{1-3 sentences explaining the root cause, referencing specific `.kt:NN` locations}} + +## Evidence + +- **Exception**: `{{errorClass}}`: {{message}} +- **Key stack frame**: `{{file.kt:NN}}` — `{{method}}` +- **Log excerpt**: `{{relevant log line(s)}}` +- **Breadcrumb trail**: {{last N breadcrumbs summarized}} +- **Device**: {{manufacturer}} {{model}}, Android {{os_version}} + +## Fix Direction + +{{Concrete description of what to change and where (`.kt:NN`), why it fixes the +issue, and any risks. NOT a full implementation — just the direction.}} + +## Appendix: Stack Trace (in-project frames) + +``` +{{mapped in-project frames with file:line and method}} +``` diff --git a/.claude/skills/triage/references/event-shape.md b/.claude/skills/triage/references/event-shape.md new file mode 100644 index 000000000..47696fb53 --- /dev/null +++ b/.claude/skills/triage/references/event-shape.md @@ -0,0 +1,100 @@ +# Bugsnag Event Shape — Android + +A single event fetched from `GET /events/{event_id}` (with project auth). + +## Four Evidence Sources + +### 1. App Logs — `metaData["App Logs"]["app_log"]` + +A single string containing the last ~64 KB of log output captured at crash time. +Attached by `FlipcashBugsnagErrorCallback` in +`apps/flipcash/app/src/main/kotlin/com/flipcash/app/internal/debug/FlipcashBugsnagErrorCallback.kt`. 
+ +Unlike iOS's structured `metaData.app_logs.recent_logs` array, this is +unstructured text. Grep it for keywords from the exception to find relevant +context. + +### 2. Stack Trace — `exceptions[0].stacktrace` + +Array of frame objects: + +```json +{ + "file": "com/flipcash/features/cash/CashViewModel.kt", + "method": "com.flipcash.features.cash.CashViewModel.loadBalance", + "lineNumber": 42, + "inProject": true, + "columnNumber": null +} +``` + +- `file` — path-like representation of the Kotlin/Java source file using + package-qualified slashes (e.g. `com/flipcash/features/cash/CashViewModel.kt`) +- `method` — fully qualified method name with package +- `lineNumber` — source line (may be approximate after R8/ProGuard) +- `inProject` — `true` for app code, `false` for framework / library code + +**Path mapping**: Android frames use Java package paths. To find the source +file, either: +- Convert to a glob: `**/CashViewModel.kt` and search the repo +- Or convert dots to slashes and search: `com/flipcash/features/cash/CashViewModel.kt` + +### 3. Breadcrumbs — `breadcrumbs[]` + +Array of timestamped events (same structure as iOS): + +```json +{ + "timestamp": "2026-05-10T14:23:01.000Z", + "name": "Navigate to CashScreen", + "type": "navigation", + "metaData": { "route": "/cash" } +} +``` + +Types correspond to `BreadcrumbType` values: `ERROR`, `LOG`, `NAVIGATION`, +`REQUEST`, `PROCESS`, `STATE`, `USER` (see `BugsnagBreadcrumbSink`). + +### 4. Exception Info — `exceptions[0]` + +```json +{ + "errorClass": "java.lang.NullPointerException", + "message": "Attempt to invoke virtual method 'void ...' on a null object reference", + "type": "android" +} +``` + +Replaces iOS's `nserror` concept. The `errorClass` is the Java/Kotlin exception +class name; `message` is the detail string. 
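Assuming an event has been saved to a local JSON file and `jq` is installed, pulling the exception info and the matching app-log lines out of the shape documented here might look like the sketch below. `exception_summary` and `log_lines_matching` are hypothetical helpers for illustration, not part of the skill.

```shell
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical. They assume jq is installed and
# that the event JSON follows the shape documented in this file.

# Print "errorClass: message" for the first exception in the event.
exception_summary() {
  jq -r '.exceptions[0] | "\(.errorClass): \(.message)"' "$1"
}

# Print app-log lines that contain the keyword (fixed string, case-insensitive).
log_lines_matching() {
  local event_file="$1" keyword="$2"
  # A missing app_log yields an empty string; "|| true" keeps a no-match
  # result from being treated as a failure.
  jq -r '.metaData["App Logs"]["app_log"] // ""' "$event_file" \
    | grep -i -F -- "$keyword" || true
}
```

A keyword taken from the exception, such as the simple class name in the top in-project frame, is usually a reasonable first search term.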
+ +For Kotlin-specific exceptions, look for: +- `kotlin.KotlinNullPointerException` +- `kotlinx.coroutines.JobCancellationException` +- `java.util.concurrent.CancellationException` +- `IllegalStateException` (often lifecycle-related) + +## Secondary Context + +| Path | Notes | +|------|-------| +| `app.version` | versionName (e.g. `2026.5.3`) | +| `app.versionCode` | Integer version code | +| `app.releaseStage` | `production` / `development` | +| `device.manufacturer` | e.g. `Samsung`, `Google` | +| `device.model` | e.g. `Pixel 8`, `SM-S918B` | +| `device.osVersion` | Android version string (e.g. `14`) | +| `device.totalMemory` | Total RAM in bytes | +| `device.freeMemory` | Free RAM at crash time | +| `user.id` | Anonymized user identifier | +| `session` | Session start, events handled/unhandled | +| `featureFlags[]` | Active feature flags at crash time | + +## Filtering Noise + +The app's error callback (`FlipcashBugsnagErrorCallback`) already filters: +- gRPC status codes in `ErrorUtils.ignoredGrpcStatusCodes` (transport, validation) +- Handled gRPC `INTERNAL` errors + +So events that reach Bugsnag are either unhandled crashes or explicitly notified +errors that passed the filter. diff --git a/.claude/skills/triage/scripts/bugsnag-top.sh b/.claude/skills/triage/scripts/bugsnag-top.sh new file mode 100755 index 000000000..3e26278ef --- /dev/null +++ b/.claude/skills/triage/scripts/bugsnag-top.sh @@ -0,0 +1,176 @@ +#!/usr/bin/env bash +# bugsnag-top.sh — Fetch the top open Bugsnag error for Flipcash Android. +# +# Emits a single JSON object on stdout with the fields needed by the triage skill. +# Requires BUGSNAG_TOKEN (personal auth token) and BUGSNAG_PROJECT_ID, plus BUGSNAG_ORG_SLUG and BUGSNAG_PROJECT_SLUG for the dashboard link, in the environment, .env, or .env.local. 
+# +# Usage: +# ./bugsnag-top.sh # top open error, production, since latest release +# ./bugsnag-top.sh --all # top open error, production, all versions +# ./bugsnag-top.sh --since 2026-05-01T00:00:00Z # custom since filter +# ./bugsnag-top.sh --severity error # filter by severity + +set -euo pipefail + +# ── Source .env from repo root ───────────────────────────────────────── +SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)" +REPO_ROOT="$(cd "$SCRIPT_DIR/../../../.." && pwd)" + +if [[ -f "$REPO_ROOT/.env" ]]; then + # shellcheck source=/dev/null + source "$REPO_ROOT/.env" +fi +if [[ -f "$REPO_ROOT/.env.local" ]]; then + # shellcheck source=/dev/null + source "$REPO_ROOT/.env.local" +fi + +# ── Configuration ────────────────────────────────────────────────────── +BROWSER_BASE="https://app.bugsnag.com/${BUGSNAG_ORG_SLUG}/${BUGSNAG_PROJECT_SLUG}/errors" +API_BASE="https://api.bugsnag.com" + +if [[ -z "${BUGSNAG_TOKEN:-}" ]]; then + echo "ERROR: BUGSNAG_TOKEN is not set. Add it to .env or export it." >&2 + exit 1 +fi + +if [[ -z "${BUGSNAG_PROJECT_ID:-}" ]]; then + echo "ERROR: BUGSNAG_PROJECT_ID is not set. Add it to .env or export it." 
>&2 + exit 1 +fi + +PROJECT_ID="$BUGSNAG_PROJECT_ID" +# ────────────────────────────────────────────────────────────────────── + +# ── Read release manifest ───────────────────────────────────────────── +MANIFEST="$REPO_ROOT/.well-known/release-manifest.json" +if [[ -f "$MANIFEST" ]]; then + PROD_VERSION=$(jq -r '.tracks.production.versionName // "unknown"' "$MANIFEST") + PROD_CODE=$(jq -r '.tracks.production.versionCode // "unknown"' "$MANIFEST") + INTERNAL_VERSION=$(jq -r '.tracks.internal.versionName // "unknown"' "$MANIFEST") + INTERNAL_CODE=$(jq -r '.tracks.internal.versionCode // "unknown"' "$MANIFEST") + MANIFEST_UPDATED=$(jq -r '.updated // ""' "$MANIFEST") +else + PROD_VERSION="unknown" + PROD_CODE="unknown" + INTERNAL_VERSION="unknown" + INTERNAL_CODE="unknown" + MANIFEST_UPDATED="" +fi + +# ── Argument parsing ────────────────────────────────────────────────── +RELEASE_STAGE="production" +SEVERITY="" +SINCE="" +ALL_VERSIONS=false + +while [[ $# -gt 0 ]]; do + case "$1" in + --internal) RELEASE_STAGE="production"; SINCE=""; ALL_VERSIONS=true; shift ;; + --severity) SEVERITY="$2"; shift 2 ;; + --all) ALL_VERSIONS=true; shift ;; + --since) SINCE="$2"; shift 2 ;; + *) echo "Unknown option: $1" >&2; exit 1 ;; + esac +done + +# Default: filter to events since the latest production release +if [[ "$ALL_VERSIONS" == false && -z "$SINCE" && -n "$MANIFEST_UPDATED" ]]; then + SINCE="$MANIFEST_UPDATED" +fi + +# ── Helpers ─────────────────────────────────────────────────────────── +api() { + local url="$1"; shift + local attempt=0 max_attempts=3 delay=2 + + while (( attempt < max_attempts )); do + local http_code body + body=$(curl -s --globoff -w "\n%{http_code}" \ + -H "Authorization: token $BUGSNAG_TOKEN" \ + -H "Accept: application/json" \ + "$url" "$@" 2>/dev/null) || true + + http_code=$(echo "$body" | tail -1) + body=$(echo "$body" | sed '$d') + + case "$http_code" in + 200) echo "$body"; return 0 ;; + 429) + attempt=$((attempt + 1)) + sleep "$delay" + 
delay=$((delay * 2)) + ;; + *) + echo "ERROR: API returned HTTP $http_code for $url" >&2 + return 1 + ;; + esac + done + + echo "ERROR: Rate-limited after $max_attempts attempts" >&2 + return 1 +} + +# ── Fetch top open error ────────────────────────────────────────────── +FILTERS="filters[error.status][]=open&filters[app.release_stage][]=${RELEASE_STAGE}" +if [[ -n "$SEVERITY" ]]; then + FILTERS="${FILTERS}&filters[event.severity][]=${SEVERITY}" +fi +if [[ -n "$SINCE" ]]; then + FILTERS="${FILTERS}&filters[event.since][]=${SINCE}" +fi + +ERRORS_URL="${API_BASE}/projects/${PROJECT_ID}/errors?${FILTERS}&sort=events&direction=desc&per_page=1" +ERRORS_JSON=$(api "$ERRORS_URL") + +if [[ -z "$ERRORS_JSON" ]] || ! echo "$ERRORS_JSON" | jq -e '.[0]' >/dev/null 2>&1; then + echo "No open errors found for release_stage=${RELEASE_STAGE}" >&2 + exit 1 +fi + +ERROR_ID=$(echo "$ERRORS_JSON" | jq -r '.[0].id') +TITLE=$(echo "$ERRORS_JSON" | jq -r '.[0] | (.error_class // "") + ": " + (.message // "")') +SEVERITY_VAL=$(echo "$ERRORS_JSON" | jq -r '.[0].severity') +EVENTS=$(echo "$ERRORS_JSON" | jq -r '.[0].events') +USERS=$(echo "$ERRORS_JSON" | jq -r '.[0].users') +FIRST_SEEN=$(echo "$ERRORS_JSON" | jq -r '.[0].first_seen') + +# ── Fetch latest event for this error ───────────────────────────────── +EVENTS_URL="${API_BASE}/projects/${PROJECT_ID}/errors/${ERROR_ID}/events?sort=timestamp&direction=desc&per_page=1" +EVENTS_JSON=$(api "$EVENTS_URL") + +EVENT_ID=$(echo "$EVENTS_JSON" | jq -r '.[0].id') +EVENT_URL="${API_BASE}/projects/${PROJECT_ID}/errors/${ERROR_ID}/events/${EVENT_ID}" +RELEASE=$(echo "$EVENTS_JSON" | jq -r '.[0].app.version // "unknown"') + +# ── Emit result ─────────────────────────────────────────────────────── +jq -n \ + --arg error_id "$ERROR_ID" \ + --arg error_url "${BROWSER_BASE}/${ERROR_ID}" \ + --arg event_url "$EVENT_URL" \ + --arg title "$TITLE" \ + --arg severity "$SEVERITY_VAL" \ + --arg events "$EVENTS" \ + --arg users "$USERS" \ + --arg first_seen 
"$FIRST_SEEN" \ + --arg release "$RELEASE" \ + --arg prod_version "$PROD_VERSION" \ + --arg prod_code "$PROD_CODE" \ + --arg internal_version "$INTERNAL_VERSION" \ + --arg internal_code "$INTERNAL_CODE" \ + '{ + error_id: $error_id, + error_url: $error_url, + event_url: $event_url, + title: $title, + severity: $severity, + events: ($events | tonumber), + users: ($users | tonumber), + first_seen: $first_seen, + release: $release, + current_versions: { + production: { versionName: $prod_version, versionCode: ($prod_code | tonumber? // $prod_code) }, + internal: { versionName: $internal_version, versionCode: ($internal_code | tonumber? // $internal_code) } + } + }' diff --git a/.gitignore b/.gitignore index e089ee19f..d8d13418b 100644 --- a/.gitignore +++ b/.gitignore @@ -18,7 +18,11 @@ apps/flipcash/app/dapp_publishing/node_modules/** apps/flipcash/app/dapp_publishing/.asset-manifest.json ## Fastlane fastlane/report.xml + *.env.* +.env +.env.local + .claude/worktrees/ .claude/settings.local.json