Skip to content

Add analytics page telemetry + QA data correctness runbook#244

Merged
gregagi merged 2 commits intomainfrom
forge/analytics-page-telemetry-qa-checklist
Mar 19, 2026
Merged

Add analytics page telemetry + QA data correctness runbook#244
gregagi merged 2 commits intomainfrom
forge/analytics-page-telemetry-qa-checklist

Conversation

@gregagi
Copy link
Collaborator

@gregagi gregagi commented Mar 19, 2026

Summary

  • instrumented Analytics page v1 telemetry in analytics_dashboard_controller.js for:
    • analytics_page_viewed
    • analytics_date_range_changed
    • analytics_refresh_clicked
    • analytics_source_error_shown
  • added these events to canonical taxonomy (core/analytics/event_taxonomy.json) with required properties
  • added docs for QA data correctness checklist + production troubleshooting (docs/analytics-page-data-correctness-checklist.md)
  • updated event docs/matrix and changelog
  • added regression tests for taxonomy coverage + controller telemetry references

Testing

  • ./.venv/bin/pytest -q core/tests/test_posthog_event_coverage.py core/tests/test_analytics_dashboard_telemetry.py core/tests/test_analytics_events.py
  • ⚠️ ./.venv/bin/pytest -q core/tests/test_project_analytics_view.py core/tests/test_project_analytics_api.py (blocked in this runtime: Postgres host db not resolvable)

Summary by CodeRabbit

  • New Features

    • Analytics page now captures usage telemetry including page views, date range changes, refresh actions, and error states.
  • Documentation

    • Added analytics page QA checklist for data correctness validation and troubleshooting guidance.
    • Updated event taxonomy documentation with new analytics page tracking events.
  • Tests

    • Added validation tests for analytics event capture and telemetry coverage.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 19, 2026

Warning

Rate limit exceeded

@gregagi has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 15 minutes and 49 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c8bcdebf-7e46-4d83-804e-dea0e550688e

📥 Commits

Reviewing files that changed from the base of the PR and between 83d6af9 and 87369f7.

📒 Files selected for processing (4)
  • core/analytics/event_taxonomy.json
  • core/tests/test_analytics_dashboard_telemetry.py
  • docs/posthog-event-coverage-matrix.md
  • frontend/src/controllers/analytics_dashboard_controller.js
📝 Walkthrough

Walkthrough

This PR adds PostHog event telemetry instrumentation to the Analytics page. It introduces four new analytics events (page viewed, date range changed, refresh clicked, source error shown) to the event taxonomy, implements event emission in the frontend controller with deduplication logic, and provides QA documentation and test coverage for the new telemetry.

Changes

Cohort / File(s) Summary
Event Taxonomy Definition
core/analytics/event_taxonomy.json
Added four new analytics events (analytics_page_viewed, analytics_date_range_changed, analytics_refresh_clicked, analytics_source_error_shown) with required properties including project ID, date range values, and error/source metadata.
Test Coverage
core/tests/test_analytics_dashboard_telemetry.py, core/tests/test_posthog_event_coverage.py
New test file validates PostHog event capture in the analytics controller; updated coverage test now asserts existence of four new analytics events in the taxonomy.
Documentation Updates
CHANGELOG.md, docs/event-taxonomy.md, docs/posthog-event-coverage-matrix.md, docs/analytics-page-data-correctness-checklist.md
Added changelog entries; updated canonical events list and coverage matrix with new analytics events; added comprehensive QA checklist covering access control, date range behavior, KPI correctness, partial integrations, and PostHog telemetry verification with troubleshooting playbooks.
Frontend Telemetry Implementation
frontend/src/controllers/analytics_dashboard_controller.js
Added instance state for range tracking and error deduplication; modified load() and renderError() signatures to support telemetry parameters; introduced helper methods (toRangeFingerprint(), captureDateRangeChangedIfNeeded(), currentRangeTelemetry(), captureEvent()); integrated event capture throughout page load, date range changes, refresh, and error handling with deduplication logic.

Sequence Diagram

sequenceDiagram
    participant User
    participant Controller as analytics_dashboard_controller.js
    participant API as Analytics API
    participant PostHog

    User->>Controller: Load analytics page
    Controller->>Controller: Initialize state & compute default preset
    Controller->>PostHog: Emit analytics_page_viewed
    Controller->>API: load() with triggerSource
    API-->>Controller: API response
    
    User->>Controller: Change date range (preset or custom)
    Controller->>Controller: Check if range fingerprint changed
    Controller->>PostHog: Emit analytics_date_range_changed (if changed)
    Controller->>API: load() with triggerSource
    
    User->>Controller: Click refresh
    Controller->>PostHog: Emit analytics_refresh_clicked
    Controller->>API: load() with triggerSource
    API-->>Controller: Error or source health data
    
    alt API Error or Source Error
        Controller->>Controller: Attach telemetry metadata to error
        Controller->>PostHog: Emit analytics_source_error_shown (deduplicated per fingerprint)
        Controller->>Controller: renderError() with telemetry context
    end
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related PRs

Poem

🐰✨ A rabbit hops through analytics skies,
PostHog events capture each user's tries,
Page views and date ranges in telemetry flight,
Errors deduped with fingerprint might!
QA checklists thumping, the dashboards now bright! 🎯

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately captures the main objectives: adding analytics page telemetry instrumentation and creating a QA data correctness runbook/checklist.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch forge/analytics-page-telemetry-qa-checklist
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR instruments four PostHog telemetry events on the Analytics page (analytics_page_viewed, analytics_date_range_changed, analytics_refresh_clicked, analytics_source_error_shown), registers them in the canonical event taxonomy, and ships a QA runbook + updated docs. The implementation is well-structured with a fingerprint-based deduplication guard for source errors and a clean captureEvent wrapper that gracefully no-ops when PostHog is unavailable.

Key issues found:

  • Negative range_days values: rangeDays() returns 0 or a negative integer when endDate < startDate (e.g. a user accidentally swaps the dates). This value propagates to all telemetry events that spread currentRangeTelemetry() before the API rejects the request, silently polluting PostHog data.
  • Event ordering in refresh(): analytics_refresh_clicked fires before analytics_date_range_changed in the refresh flow. While both fire before the API call, downstream funnel analysis in PostHog would see refresh_clicked → date_range_changed, which is the reverse of the logical sequence.
  • Taxonomy documentation gap: source_status is sent in all analytics_source_error_shown calls and is explicitly referenced in the production runbook as a diagnostic property, but it is absent from the event's required_properties in event_taxonomy.json.
  • Shallow telemetry test: test_analytics_dashboard_telemetry.py only asserts string presence in the JS source file and would not catch regressions in the fingerprint deduplication logic, event sequencing, or property correctness.

Confidence Score: 3/5

  • Safe to merge with minor telemetry-quality fixes recommended before the data is relied upon for funnel analysis.
  • The core instrumentation is sound and follows existing patterns. The negative range_days issue is a real data-quality bug that would silently emit malformed events to PostHog when a user accidentally enters an inverted date range. The event-ordering quirk and taxonomy gap are minor but worth addressing to keep the documented contract consistent with what is actually captured.
  • frontend/src/controllers/analytics_dashboard_controller.js (negative range_days + event ordering) and core/analytics/event_taxonomy.json (missing source_status in required_properties).

Important Files Changed

Filename Overview
frontend/src/controllers/analytics_dashboard_controller.js Core telemetry instrumentation added; rangeDays can return negative/zero values for inverted date ranges, and analytics_refresh_clicked fires before analytics_date_range_changed in the refresh flow, producing out-of-order events in PostHog.
core/analytics/event_taxonomy.json Four new events added with correct stages and required properties; analytics_source_error_shown is missing source_status from its required_properties despite the controller always sending it and the runbook referencing it.
core/tests/test_analytics_dashboard_telemetry.py New test file; coverage is limited to string-presence checks in the JS source and does not validate runtime behaviour such as fingerprint deduplication or event ordering.
core/tests/test_posthog_event_coverage.py Four new analytics events added to the P1 coverage set; existing test structure unchanged and correct.

Sequence Diagram

sequenceDiagram
    participant User
    participant Controller as analytics_dashboard_controller.js
    participant PostHog as window.posthog
    participant API as /api/projects/.../analytics/aggregation

    User->>Controller: connect() [page load]
    Controller->>PostHog: analytics_page_viewed {project_id, date_range_start, date_range_end, range_days}
    Controller->>API: fetch(start_date, end_date)
    API-->>Controller: payload or error

    alt API success
        Controller->>Controller: render(payload) → renderSourceHealth(rows)
        loop Each source row with last_error (deduplicated by fingerprint)
            Controller->>PostHog: analytics_source_error_shown {project_id, source, source_status, error_message, result_status}
        end
    else API failure
        Controller->>PostHog: analytics_source_error_shown {project_id, source=dashboard_api, source_status, error_message, result_status}
    end

    User->>Controller: applyPreset(event)
    Controller->>PostHog: analytics_date_range_changed {project_id, date_range_start, date_range_end, range_days, change_source=preset_click}
    Controller->>API: fetch(start_date, end_date)

    User->>Controller: refresh(event)
    Controller->>PostHog: analytics_refresh_clicked {project_id, date_range_start, date_range_end, range_days}
    Note over Controller,PostHog: analytics_date_range_changed fires here only if range changed
    Controller->>PostHog: analytics_date_range_changed {change_source=custom_date_refresh}
    Controller->>API: fetch(start_date, end_date)
Loading

Last reviewed commit: "Add analytics page t..."

Comment on lines +393 to +405
rangeDays(startDate, endDate) {
if (!startDate || !endDate) {
return null;
}

const start = new Date(startDate);
const end = new Date(endDate);
if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
return null;
}

return Math.floor((end.getTime() - start.getTime()) / DAY) + 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 rangeDays can return 0 or a negative value

When endDate is earlier than startDate, Math.floor((end - start) / DAY) + 1 yields 0 or a negative integer. This value propagates into every telemetry event that spreads currentRangeTelemetry() (e.g. analytics_page_viewed, analytics_refresh_clicked, analytics_date_range_changed), quietly tainting PostHog with invalid range_days data.

While the API call would also fail for an inverted range, the telemetry fires before the API response, so bad events would be recorded. Consider clamping or nullifying the return value:

Suggested change
rangeDays(startDate, endDate) {
if (!startDate || !endDate) {
return null;
}
const start = new Date(startDate);
const end = new Date(endDate);
if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
return null;
}
return Math.floor((end.getTime() - start.getTime()) / DAY) + 1;
}
rangeDays(startDate, endDate) {
if (!startDate || !endDate) {
return null;
}
const start = new Date(startDate);
const end = new Date(endDate);
if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
return null;
}
const days = Math.floor((end.getTime() - start.getTime()) / DAY) + 1;
return days > 0 ? days : null;
}

Comment on lines 83 to +98
async refresh(event) {
if (event) {
event.preventDefault();
}
this.setPresetUi("custom");
await this.load();

this.captureEvent("analytics_refresh_clicked", {
...this.currentRangeTelemetry(),
trigger_source: "refresh_button",
});

this.captureDateRangeChangedIfNeeded({
change_source: "custom_date_refresh",
});

await this.load({ triggerSource: "refresh_button" });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 analytics_refresh_clicked fires before analytics_date_range_changed

In the refresh() flow, analytics_refresh_clicked is emitted first (line 89), then captureDateRangeChangedIfNeeded emits analytics_date_range_changed second (line 94). The taxonomy description for analytics_date_range_changed says it fires "before a new query load", and technically both events do fire before load(), but the resulting event stream in PostHog will show refresh_clicked → date_range_changed — the reverse of the conceptual order (range changes first, then refresh is triggered).

For cleaner funnel sequencing in analytics, consider emitting date_range_changed before refresh_clicked:

Suggested change
async refresh(event) {
if (event) {
event.preventDefault();
}
this.setPresetUi("custom");
await this.load();
this.captureEvent("analytics_refresh_clicked", {
...this.currentRangeTelemetry(),
trigger_source: "refresh_button",
});
this.captureDateRangeChangedIfNeeded({
change_source: "custom_date_refresh",
});
await this.load({ triggerSource: "refresh_button" });
async refresh(event) {
if (event) {
event.preventDefault();
}
this.setPresetUi("custom");
this.captureDateRangeChangedIfNeeded({
change_source: "custom_date_refresh",
});
this.captureEvent("analytics_refresh_clicked", {
...this.currentRangeTelemetry(),
trigger_source: "refresh_button",
});
await this.load({ triggerSource: "refresh_button" });
}

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +345 to 354
"analytics_source_error_shown": {
"stage": "risk",
"description": "Analytics page surfaced a provider or API error to the user.",
"required_properties": [
"project_id",
"source",
"error_message",
"result_status"
]
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 source_status sent by controller but absent from taxonomy required_properties

The controller sends source_status in both code paths that fire analytics_source_error_shown:

  • renderError(...) — passes source_status (e.g., HTTP status code or "fetch_failed")
  • renderSourceHealth(...) — passes source_status (mapped from row.status)

The production runbook in docs/analytics-page-data-correctness-checklist.md also explicitly lists source_status as a field to inspect when troubleshooting repeated error toasts. It is therefore a de-facto required diagnostic property, but it is not listed in required_properties here. This creates a gap between what the event actually carries and what the taxonomy documents.

Consider adding it:

Suggested change
"analytics_source_error_shown": {
"stage": "risk",
"description": "Analytics page surfaced a provider or API error to the user.",
"required_properties": [
"project_id",
"source",
"error_message",
"result_status"
]
}
"analytics_source_error_shown": {
"stage": "risk",
"description": "Analytics page surfaced a provider or API error to the user.",
"required_properties": [
"project_id",
"source",
"source_status",
"error_message",
"result_status"
]
}

Comment on lines +1 to +16
from pathlib import Path


def test_analytics_dashboard_controller_tracks_v1_analytics_events():
controller_path = Path(__file__).resolve().parents[2] / "frontend" / "src" / "controllers" / "analytics_dashboard_controller.js"
source = controller_path.read_text(encoding="utf-8")

assert "window.posthog.capture" in source
assert '"analytics_page_viewed"' in source
assert '"analytics_date_range_changed"' in source
assert '"analytics_refresh_clicked"' in source
assert '"analytics_source_error_shown"' in source
assert "project_id" in source
assert "date_range_start" in source
assert "date_range_end" in source
assert "range_days" in source
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Telemetry test only performs string-presence checks

The test verifies that event-name strings and a few property names appear somewhere in the controller source file. This means it would pass even if those strings existed in comments, and it would not catch:

  • analytics_date_range_changed firing when the date range has not changed (fingerprint logic regression)
  • analytics_source_error_shown firing multiple times for the same error in a single render cycle (deduplication regression)
  • analytics_page_viewed firing more than once per connect() call

Because these are the behaviours that matter for PostHog data quality (the whole point of this PR), consider adding at least one JS unit test (e.g. with Vitest/Jest) that mounts the controller with a stub window.posthog and exercises the fingerprint guard and deduplication logic. The current approach gives a false sense of coverage while leaving the actual runtime behaviour untested.

@gregagi
Copy link
Collaborator Author

gregagi commented Mar 19, 2026

Addressed Greptile feedback in follow-up commit 87369f7:

  • rangeDays() now returns null for inverted ranges (prevents negative telemetry values)
  • refresh flow now emits analytics_date_range_changed before analytics_refresh_clicked
  • added source_status to analytics_source_error_shown required taxonomy properties
  • strengthened telemetry test coverage with ordering + inverted-range guard + taxonomy assertion

@gregagi gregagi merged commit d7d3ec4 into main Mar 19, 2026
3 checks passed
@gregagi gregagi deleted the forge/analytics-page-telemetry-qa-checklist branch March 19, 2026 04:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant