From a3e3e9ee6515a2aeacd40ea118116221e99817e0 Mon Sep 17 00:00:00 2001 From: Daniel Beer Date: Tue, 3 Mar 2026 23:49:13 +0200 Subject: [PATCH] feat: add 5 new analytics agents Add five new agents to support the full analytics workflow: 1. Data Validation Agent (agents/data-validation/SKILL.md) - Validates data quality in BigQuery tables/queries - Detects NULLs, gaps, invalid values, schema mismatches - Classifies issues by severity (Critical/Warning/Info) - Provides actionable recommendations 2. Context Helper (Events) Agent (agents/context-helper/SKILL.md) - Finds and explains events for features - Searches event registry, code, and Figma - Provides SQL usage examples - Shows related events and flows 3. Presentation Agent (agents/presentation/SKILL.md) - Creates presentation decks from analytics findings - Formats for stakeholder presentations, QBRs - Structures narrative with visuals - Supports multiple formats (Slides, PPT, PDF, Markdown) 4. Release Agent (agents/release/SKILL.md) - Tracks and analyzes feature releases - Monitors adoption, usage patterns, impact - Compares performance vs goals - Provides release reports and ongoing monitoring 5. User Research Agent (agents/user-research/SKILL.md) - Conducts quantitative user research - Creates user personas and segments - Analyzes user journeys and pain points - Identifies power users and churn signals Updated CLAUDE.md and README.md with routing and documentation. 
Co-Authored-By: Claude Sonnet 4.5 --- CLAUDE.md | 5 + README.md | 86 ++++++- agents/context-helper/SKILL.md | 231 +++++++++++++++++++ agents/data-validation/SKILL.md | 249 ++++++++++++++++++++ agents/presentation/SKILL.md | 338 +++++++++++++++++++++++++++ agents/release/SKILL.md | 327 ++++++++++++++++++++++++++ agents/user-research/SKILL.md | 396 ++++++++++++++++++++++++++++++++ 7 files changed, 1627 insertions(+), 5 deletions(-) create mode 100644 agents/context-helper/SKILL.md create mode 100644 agents/data-validation/SKILL.md create mode 100644 agents/presentation/SKILL.md create mode 100644 agents/release/SKILL.md create mode 100644 agents/user-research/SKILL.md diff --git a/CLAUDE.md b/CLAUDE.md index 63eef82..ea65fee 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -12,6 +12,11 @@ When the user makes a request, identify which agent should handle it, read its S |---|---|---| | Create a dashboard, build a dashboard, set up analytics for a feature | **Dashboard Builder** | `agents/dashboard-builder/SKILL.md` | | Create a Linear issue, track work in Linear, file a bug/task | **Linear Issue Manager** | `agents/linear/SKILL.md` | +| Validate data quality, check for NULLs/gaps/invalid values, verify schema | **Data Validation** | `agents/data-validation/SKILL.md` | +| Find events for a feature, explain event meaning, search event registry | **Context Helper (Events)** | `agents/context-helper/SKILL.md` | +| Create presentation, turn findings into slides, format for stakeholders | **Presentation** | `agents/presentation/SKILL.md` | +| Track feature release, analyze launch performance, monitor adoption | **Release** | `agents/release/SKILL.md` | +| Segment users, create personas, analyze user journeys, identify churn patterns | **User Research** | `agents/user-research/SKILL.md` | If the request doesn't clearly match an agent, ask the user which they need. 
diff --git a/README.md b/README.md index 422c3a6..acc21af 100644 --- a/README.md +++ b/README.md @@ -20,9 +20,11 @@ agents/ ← One folder per agent templates/dashboard-spec.md ← Phase 2 output template linear/ ← Creates and manages Linear issues for analytics work SKILL.md ← 4-phase flow: Gather → Confirm → Create → Report - monitoring/ ← (Planned) Anomaly detection and alerts - enterprise-reporter/ ← (Planned) Enterprise account reports - prompt-reviewer/ ← (Planned) Prompt quality review + data-validation/SKILL.md ← Validates data quality and detects issues + context-helper/SKILL.md ← Finds and explains events from registry + presentation/SKILL.md ← Creates presentation decks from findings + release/SKILL.md ← Tracks and analyzes feature releases + user-research/SKILL.md ← Conducts quantitative user research ``` ## How It Works @@ -180,10 +182,84 @@ Creates feature dashboards in Hex Threads with a 4-phase workflow: See `agents/dashboard-builder/SKILL.md` for full workflow. +### Data Validation + +**Status:** NEW | **Trigger:** "Validate data quality", "Check for NULLs/gaps", "Verify schema" + +Validates data quality in BigQuery tables and query results. Detects issues before they affect analysis. + +**What it validates:** +- NULL checks, date gaps, invalid values +- Schema validation, row count checks +- Duplicates, referential integrity +- Business logic violations + +See `agents/data-validation/SKILL.md` for full workflow. + +### Context Helper (Events) + +**Status:** NEW | **Trigger:** "What events does [feature] track?", "Explain event [name]" + +Helps find and explain events for analytics work. Searches event registry, code, and Figma. + +**What it provides:** +- Event discovery by feature or keyword +- Event definitions and context +- SQL usage examples +- Related events and flows + +See `agents/context-helper/SKILL.md` for full workflow. 
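As one illustration of the Data Validation agent's date-gap check described above, here is a minimal Python sketch; the agent itself runs this logic as BigQuery SQL, and the function name here is hypothetical:

```python
from datetime import date, timedelta

def missing_dates(observed: set[date], start: date, end: date) -> list[date]:
    """Dates in [start, end] with no rows — mirrors the agent's date-gap check."""
    expected = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    return [d for d in expected if d not in observed]

# Example: 2024-01-03 is absent from the observed dates, so it is flagged.
gaps = missing_dates(
    {date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 4)},
    date(2024, 1, 1),
    date(2024, 1, 4),
)
```

The SQL version in the agent's SKILL.md does the same thing with `GENERATE_DATE_ARRAY` and a `LEFT JOIN`.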
+ +### Presentation + +**Status:** NEW | **Trigger:** "Create presentation", "Turn findings into slides" + +Creates presentation decks and slides from analytics findings for stakeholder presentations. + +**What it creates:** +- Executive decks, QBR presentations +- Feature analysis, experiment results +- Strategy reviews, board updates +- Formatted with visuals and narratives + +See `agents/presentation/SKILL.md` for full workflow. + +### Release + +**Status:** NEW | **Trigger:** "Track [feature] release", "Analyze launch performance" + +Tracks and analyzes feature releases and product launches. Monitors adoption and impact. + +**What it tracks:** +- Adoption metrics (DAU, penetration rate) +- User segments and cohort analysis +- Retention impact, engagement lift +- Performance vs goals + +See `agents/release/SKILL.md` for full workflow. + +### User Research + +**Status:** NEW | **Trigger:** "Segment users", "Create personas", "User journey analysis" + +Conducts quantitative user research using behavioral data. Identifies segments and opportunities. + +**What it provides:** +- User segmentation and personas +- User journeys and pain points +- Power user patterns +- Churn signals and predictions + +See `agents/user-research/SKILL.md` for full workflow. 
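The Release agent's penetration-rate metric above (Feature DAU / Total DAU) and its performance-vs-goals comparison can be sketched as follows; the function names and percent convention are illustrative assumptions, not the agent's actual implementation:

```python
def penetration_rate(feature_dau: int, total_dau: int) -> float:
    """Feature DAU as a share of total DAU, in percent."""
    if total_dau == 0:
        return 0.0  # avoid division by zero on days with no traffic
    return 100.0 * feature_dau / total_dau

def release_status(penetration_pct: float, goal_pct: float) -> str:
    """Compare actual penetration against the release goal."""
    return "on track" if penetration_pct >= goal_pct else "below goal"
```

For example, 2,500 feature users out of 12,500 total DAU is 20% penetration, which would read as "on track" against a 15% goal.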
+ --- | Agent | Status | What it does | |-------|--------|-------------| -| Dashboard Builder | WIP | Creates feature dashboards in Hex Threads | -| Linear Issue Manager | Ready | Creates and manages Linear issues for analytics work across PA teams | +| Dashboard Builder | WIP | Creates feature dashboards in Hex Threads with automated data quality validation | +| Linear Issue Manager | Ready | Creates and manages Linear issues for analytics work across PA teams | +| Data Validation | NEW | Validates data quality, detects NULLs/gaps/invalid values with severity classification | +| Context Helper (Events) | NEW | Finds and explains events from registry, code, and Figma with SQL examples | +| Presentation | NEW | Creates presentation decks from analytics findings for stakeholder reviews | +| Release | NEW | Tracks feature releases, monitors adoption, analyzes performance vs goals | +| User Research | NEW | Conducts quantitative user research, creates personas, identifies opportunities | diff --git a/agents/context-helper/SKILL.md b/agents/context-helper/SKILL.md new file mode 100644 index 0000000..02174e8 --- /dev/null +++ b/agents/context-helper/SKILL.md @@ -0,0 +1,231 @@ +--- +name: context-helper-events +description: Helps find and explain events for analytics work. Searches event registry, code, and Figma to provide complete context about feature events. +tags: [context, events, discovery, search] +--- + +# Context Helper (Events) + +## When to use + +- "What events does [feature] track?" +- "Find events for [feature]" +- "Explain event [event_name]" +- "What does this event mean?" +- "Search for events related to [keyword]" +- "Which events should I use for [analysis]?" 
+ +## What it provides + +- **Event discovery**: Find all events for a feature or keyword +- **Event definitions**: Explain what each event tracks and when it fires +- **Event context**: When/where event is triggered in the product +- **Usage examples**: SQL patterns for using the event +- **Data availability**: Which tables contain the event +- **Event metadata**: Type (exploration, generation), source (web, backend), status (active, deprecated) + +## Steps + +### 1. Understand User Request + +Identify what the user needs: +- **Feature name**: "What events track video generation?" +- **Event name**: "Explain event 'generate_video'" +- **Use case**: "Events for retention analysis" +- **Keyword**: "Events with 'download' in the name" + +### 2. Search Event Registry + +Read `shared/event-registry.yaml` to find matching events: + +**Search strategies:** +- **By feature**: Look in feature sections (e.g., "Video Generation", "Timeline") +- **By event name**: Search for exact or partial match +- **By keyword**: Search descriptions for keywords (download, click, generate) +- **By type**: Filter by event type (exploration, generation, navigation) + +### 3. Check Additional Sources + +If event not in registry or needs more context: + +**Code search (GitHub):** +```bash +# Search for event tracking calls +gh search code "generate_video" +``` + +**Figma designs:** +- Look for feature flows showing when events fire +- Identify user interactions that trigger events + +**Documentation:** +- Check product spec docs for feature details +- Review engineering docs for event implementation + +### 4. Gather Event Details + +For each matching event, collect: +- **Event name**: Exact string used in tracking +- **Description**: What the event tracks +- **Type**: exploration, generation, navigation, action +- **Source**: web, backend, api +- **When triggered**: User action or system event that fires it +- **Available in**: Which tables (ltxstudio_user_all_actions, etc.) 
+- **Key fields**: Important columns to use (action_name, action_category, model_gen_type) +- **Status**: active, deprecated, planned +- **Related events**: Other events in the same flow + +### 5. Provide Usage Context + +Show how to use the event in analysis: + +**SQL example:** +```sql +-- Count video generations +SELECT COUNT(*) AS video_generations +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = 'generate_video' + AND action_category = 'generations' -- IMPORTANT: Exclude clicks + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +``` + +**Common patterns:** +- **DAU**: `COUNT(DISTINCT lt_id)` for event +- **Event count**: `COUNT(*)` or `COUNT(CASE WHEN ... THEN 1 END)` +- **Funnel analysis**: Multiple events in sequence +- **Segmentation**: Break down by `griffin_tier_name_at_action` + +### 6. Format Response + +**Structure:** +``` +# Events for [Feature/Keyword] + +## Found Events: [X] + +### 1. [event_name] +- **Description**: [What it tracks] +- **Type**: [exploration/generation/navigation] +- **Source**: [web/backend/api] +- **When triggered**: [User action that fires event] +- **Available in**: `ltxstudio_user_all_actions` +- **Status**: ✅ Active / ⚠️ Deprecated / 🔨 Planned + +**Key filters:** +- `action_name = 'event_name'` +- `action_category = 'category'` [if applicable] + +**SQL Example:** +```sql +SELECT COUNT(*) AS event_count +FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` +WHERE action_name = 'event_name' + AND date(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) +``` + +**Related events:** +- [related_event_1]: [description] +- [related_event_2]: [description] + +### 2. [next_event_name] +[Same structure...] 
+ +## Usage Notes + +**Critical filters:** +- For video generations: Must use `action_category = 'generations'` to exclude clicks +- For downloads: Use `action_name = 'download_image'` or `download_video` +- For generations: Check `model_gen_type` for sub-types (t2i, i2i, t2v, i2v, v2v) + +**Common patterns:** +- **Feature DAU**: Users who performed this event +- **Event funnel**: Sequence from exploration → generation → download +- **Adoption rate**: % of DAU who used this event + +## Recommendations + +For your analysis, consider: +1. [Specific event recommendation 1] +2. [Specific event recommendation 2] +3. [Warning or gotcha to avoid] +``` + +### 7. Proactive Suggestions + +**If user asks for feature events:** +- List ALL events for that feature (exploration, generation, navigation) +- Show typical event flow (what happens first, second, etc.) +- Warn about gotchas (e.g., must filter action_category for video gen) + +**If user asks for event explanation:** +- Provide full context (when/why it fires) +- Show related events (what comes before/after) +- Include SQL examples for common use cases + +**If events not found:** +- Suggest similar events that might work +- Offer to search code/Figma for new events +- Create placeholder in event registry if needed + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Always — primary source of truth for events | +| `shared/bq-schema.md` | Table structures and column details | +| `shared/metric-standards.md` | SQL patterns for event-based metrics | + +## Rules + +### Search Best Practices + +- **DO** always check `shared/event-registry.yaml` first +- **DO** search by feature name, event name, and keywords +- **DO** use GitHub code search if event not in registry +- **DO** check Figma for feature flows and user interactions +- **DO** verify event exists in actual data before recommending it + +### Event Documentation + +- **DO** explain what triggers the event (user action or 
system event) +- **DO** specify which table(s) contain the event +- **DO** include critical filters (action_category, model_gen_type) +- **DO** provide SQL examples for common use cases +- **DO** list related events in the same flow +- **DO** warn about gotchas (e.g., "must filter action_category") + +### Response Format + +- **DO** lead with summary: "Found X events for [feature]" +- **DO** organize by event type (exploration, generation, navigation) +- **DO** include status indicators (✅ Active / ⚠️ Deprecated) +- **DO** provide "Usage Notes" section with critical filters +- **DO** include "Recommendations" for specific analysis needs +- **DO NOT** invent events — only report events that exist +- **DO NOT** skip related events — show full context + +## Common Event Searches + +### By Feature: +- "Video generation events" → All events for video gen (generate_video, model_gen_type filters, etc.) +- "Timeline events" → Split clip, delete clip, update clip, link timeline +- "Storyboard events" → entered_storyboard, generate_storyboard, etc. + +### By Use Case: +- "Retention analysis" → Core engagement events (generate_image, generate_video, download) +- "Funnel analysis" → Sequential events (storyboard → generate → download) +- "Adoption analysis" → New feature events with status=active + +### By Type: +- "Generation events" → action_category = 'generations' +- "Navigation events" → event_source like '%navigation%' +- "Feature usage" → exploration category events + +## Event Discovery Workflow + +1. **Start with registry**: Check `shared/event-registry.yaml` +2. **Code search**: If not in registry, search GitHub +3. **Figma check**: Verify with product design flows +4. **Data verification**: Query to confirm event exists in tables +5. **Document findings**: Add to registry if new event found +6. 
**Share context**: Provide full details to user with examples diff --git a/agents/data-validation/SKILL.md b/agents/data-validation/SKILL.md new file mode 100644 index 0000000..e1e3791 --- /dev/null +++ b/agents/data-validation/SKILL.md @@ -0,0 +1,249 @@ +--- +name: data-validation +description: Validate data quality in BigQuery tables and query results. Detects NULLs, gaps, invalid values, schema mismatches, and anomalies. +tags: [validation, data-quality, testing, qa] +--- + +# Data Validation Agent + +## When to use + +- "Validate this query result" +- "Check data quality" +- "Run data quality checks" +- "Verify table data" +- "Validate schema" +- "Check for data issues" + +## What it validates + +- **NULL checks**: Identify unexpected NULL values in key columns +- **Date gaps**: Detect missing dates in time series data +- **Invalid values**: Find negative counts, invalid percentages, out-of-range values +- **Schema validation**: Verify column names and types match expectations +- **Row count checks**: Detect unexpectedly low or zero row counts +- **Duplicates**: Check for duplicate primary keys +- **Referential integrity**: Verify foreign key relationships +- **Business logic**: Check funnel violations, retention increases, impossible metrics + +## Steps + +### 1. Gather Requirements + +Ask the user: +- **What to validate**: Query result, table, or specific columns? +- **Validation type**: Full suite or specific checks (NULLs, gaps, schema)? +- **Expected schema**: What columns should exist? What are acceptable values? +- **Time range**: For time series, what dates should be present? +- **Thresholds**: Minimum row count? Maximum NULL percentage? + +### 2. Read Shared Knowledge + +Before writing validation SQL: +- **`shared/bq-schema.md`** — Expected table schemas and column types +- **`shared/metric-standards.md`** — Valid ranges for metrics (e.g., retention 0-100%) + +### 3. Design Validation Checks + +**Basic Checks:** +```sql +-- 1. 
Row count check +SELECT COUNT(*) AS row_count +FROM table_name +WHERE date_column >= @start_date +-- Flag if row_count = 0 + +-- 2. NULL checks (key columns should not be NULL) +SELECT + COUNTIF(lt_id IS NULL) AS null_lt_id, + COUNTIF(action_ts IS NULL) AS null_action_ts, + COUNTIF(action_name IS NULL) AS null_action_name +FROM table_name +-- Flag if any > 0 + +-- 3. Date gaps check +WITH date_range AS ( + SELECT date FROM UNNEST(GENERATE_DATE_ARRAY(@start_date, @end_date)) AS date +), +actual_dates AS ( + SELECT DISTINCT DATE(action_ts) AS date + FROM table_name + WHERE DATE(action_ts) BETWEEN @start_date AND @end_date +) +SELECT dr.date AS missing_date +FROM date_range dr +LEFT JOIN actual_dates ad ON dr.date = ad.date +WHERE ad.date IS NULL +-- Flag if any dates missing +``` + +**Advanced Checks:** +```sql +-- 4. Invalid values check +SELECT + COUNTIF(dau < 0) AS negative_dau, + COUNTIF(retention_rate < 0 OR retention_rate > 100) AS invalid_retention, + COUNTIF(churn_rate < 0 OR churn_rate > 100) AS invalid_churn +FROM metrics_table +-- Flag if any > 0 + +-- 5. Duplicate check +SELECT + primary_key, + COUNT(*) AS duplicate_count +FROM table_name +GROUP BY primary_key +HAVING COUNT(*) > 1 +-- Flag if any results + +-- 6. Schema validation +SELECT column_name, data_type +FROM `project.dataset.INFORMATION_SCHEMA.COLUMNS` +WHERE table_name = 'target_table' +-- Compare against expected schema +``` + +### 4. Execute Validation Checks + +Run all checks and collect results: +- Store pass/fail status for each check +- Capture details for failed checks (which rows, values, dates) +- Calculate severity (critical, warning, info) + +### 5. 
Analyze Results + +**Classify issues by severity:** + +**🔴 Critical (blocks usage):** +- Zero rows returned +- All values NULL in key columns +- Schema mismatch (missing required columns) +- Primary key duplicates + +**🟡 Warning (investigate):** +- >10% NULL values in non-nullable columns +- Date gaps (1-2 missing days) +- Invalid values in <5% of rows +- Row count significantly lower than expected + +**🟢 Info (monitor):** +- <5% NULL values in optional columns +- Expected schema variations (new columns added) +- Row count within normal variance + +### 6. Format Validation Report + +**Structure:** +``` +# Data Validation Report + +**Table/Query**: [name] +**Date Range**: [start] to [end] +**Run Date**: [timestamp] +**Overall Status**: 🟢 PASS / 🟡 WARNING / 🔴 FAIL + +## Summary +- Total Checks: X +- Passed: Y +- Failed: Z +- Warnings: W + +## Results + +### ✅ Passed Checks (Y) +1. Row count: 1,234,567 rows (expected: >100,000) +2. Primary key uniqueness: No duplicates found +3. Date coverage: All dates present [start] to [end] + +### 🔴 Failed Checks (Z) +1. **NULL values in lt_id**: 1,234 rows (0.1%) + - **Impact**: Cannot identify users for these actions + - **Action**: Investigate source data, backfill if possible + +2. **Negative DAU values**: 3 dates + - **Dates**: 2024-01-15, 2024-01-22, 2024-01-29 + - **Impact**: Metrics calculation incorrect + - **Action**: Fix upstream aggregation logic + +### 🟡 Warnings (W) +1. **Date gap detected**: Missing 2024-01-20 + - **Impact**: Week-over-week comparison affected + - **Action**: Check if data pipeline had outage + +## Recommendations +1. [Action for critical issue 1] +2. [Action for critical issue 2] +3. [Action for warning 1] + +## Query Results (if applicable) +[Show sample of validated data with issues highlighted] +``` + +### 7. 
Deliver Report + +- **Inline**: Show validation results immediately after query execution +- **Dashboard annotation**: Add warnings to dashboard if quality issues found +- **Slack alert**: Post to #data-quality for critical failures +- **Linear issue**: Create ticket for persistent data quality issues + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/bq-schema.md` | Expected schemas and column types | +| `shared/metric-standards.md` | Valid ranges for metrics | + +## Rules + +### Validation Best Practices + +- **DO** run validation checks before reporting results to users +- **DO** check for NULL values in key columns (lt_id, action_ts, action_name) +- **DO** verify date coverage for time series data +- **DO** validate metric ranges (0-100% for percentages, >=0 for counts) +- **DO** flag schema mismatches (missing columns, wrong types) +- **DO** check row counts against expected minimums +- **DO** detect duplicates in primary key columns + +### Severity Guidelines + +- **DO** mark as 🔴 CRITICAL if data cannot be used (zero rows, all NULLs, schema mismatch) +- **DO** mark as 🟡 WARNING if data usable but has issues (>10% NULLs, date gaps, invalid values) +- **DO** mark as 🟢 INFO if data is healthy but has minor variations +- **DO** always provide "Action" recommendations for failed checks +- **DO** include impact assessment ("Cannot identify users", "Metrics incorrect") + +### Communication + +- **DO** show validation summary upfront (✅ X passed, 🔴 Y failed, 🟡 Z warnings) +- **DO** list passed checks to build confidence in data quality +- **DO** provide specific row counts and examples for failed checks +- **DO** include recommendations for fixing issues +- **DO** link to documentation or similar issues if available +- **DO NOT** proceed with analysis if critical validation fails +- **DO NOT** hide data quality issues — always surface them prominently + +## Common Validation Patterns + +### For Time Series Data: +1. 
Check for complete date range (no gaps) +2. Verify minimum row count per date (not just total) +3. Validate metrics are within expected ranges +4. Check for sudden spikes/drops (Z-score > 3) + +### For User Data: +1. Verify lt_id is never NULL +2. Check for valid email formats (if applicable) +3. Validate tier values match known tiers +4. Verify first_active_ts <= last_active_ts + +### For Aggregated Metrics: +1. Check DAU <= WAU <= MAU (hierarchy violation) +2. Verify retention rates are 0-100% +3. Validate churn + retention ≈ 100% (for same cohort) +4. Check funnel metrics decrease at each step + +### For Join Results: +1. Verify no unexpected NULL values from joins +2. Check row count matches expectations (not multiplied or lost) +3. Validate foreign key relationships (all IDs exist in both tables) diff --git a/agents/presentation/SKILL.md b/agents/presentation/SKILL.md new file mode 100644 index 0000000..7fab4d1 --- /dev/null +++ b/agents/presentation/SKILL.md @@ -0,0 +1,338 @@ +--- +name: presentation +description: Creates presentation decks and slides from analytics findings. Formats data insights for stakeholder presentations, QBRs, and executive reviews. +tags: [presentation, slides, deck, stakeholders] +--- + +# Presentation Agent + +## When to use + +- "Create presentation from this analysis" +- "Turn these findings into slides" +- "Format this for stakeholder presentation" +- "Create QBR deck" +- "Make slides for leadership review" +- "Present these insights to the team" + +## What it creates + +- **Executive decks**: High-level insights for leadership +- **QBR presentations**: Quarterly business reviews for enterprise customers +- **Feature analysis**: Deep-dive presentations on feature performance +- **Experiment results**: A/B test or feature launch results +- **Strategy reviews**: Data-driven recommendations for product decisions +- **Board updates**: Key metrics and trends for board meetings + +## Steps + +### 1. 
Gather Requirements + +Ask the user: +- **Purpose**: QBR, feature review, board update, experiment results? +- **Audience**: Leadership, product team, customer, engineering? +- **Input**: Analysis results, dashboard, report, or raw data? +- **Length**: Quick update (5-10 slides) or comprehensive (15-20 slides)? +- **Format**: Google Slides, PowerPoint, PDF, or Markdown? +- **Brand guidelines**: Use company template or standard format? + +### 2. Understand the Analysis + +Review input data/findings: +- What are the key insights? +- What decisions need to be made? +- What are the success metrics? +- What trends or patterns emerged? +- What are the recommendations? + +### 3. Structure the Deck + +**Standard presentation structure:** + +``` +1. Title Slide + - Presentation title + - Date + - Presenter name + +2. Executive Summary (1 slide) + - 3-5 key takeaways + - Bottom line upfront + +3. Agenda (1 slide) + - Section overview + +4. Context (1-2 slides) + - Why this analysis matters + - Scope and timeframe + +5. Key Findings (3-5 slides) + - One insight per slide + - Visual + narrative + +6. Deep Dive (3-5 slides) + - Supporting data + - Segmentation analysis + +7. Trends & Patterns (1-2 slides) + - What's changing + - What to watch + +8. Recommendations (1-2 slides) + - Actionable next steps + - Prioritized by impact + +9. Appendix (optional) + - Detailed data + - Methodology + - Additional charts +``` + +### 4. 
Design Each Slide + +**Slide principles:** +- **One message per slide**: Clear headline that states the insight +- **Visual first**: Chart or graph takes center stage +- **Minimal text**: 3-5 bullet points max, no paragraphs +- **Actionable titles**: "DAU up 25% in Q1" not "Q1 DAU Results" +- **Annotations**: Highlight key data points with callouts +- **Consistent style**: Same colors, fonts, chart styles throughout + +**Chart selection:** +- **Line chart**: Trends over time +- **Bar chart**: Comparisons across categories +- **Pie chart**: Part-to-whole (use sparingly) +- **Table**: Detailed numbers when precision matters +- **Heatmap**: Patterns across two dimensions +- **Funnel**: Conversion flow + +### 5. Format Content for Slides + +**Transform analysis findings into slide format:** + +**From analysis:** +``` +DAU increased from 10,000 to 12,500 (+25%) in Q1 2024. +The growth was driven primarily by Pro tier users (+40%) +while Free tier remained flat. +``` + +**To slide:** +``` +Slide Title: DAU Growth Driven by Pro Tier Users + +[Line chart showing DAU trend by tier] + +Key Insights: +• Total DAU up 25% in Q1 (10K → 12.5K) +• Pro tier users grew 40% (primary driver) +• Free tier stable, opportunity for activation + +Action: Focus retention efforts on Pro tier +``` + +### 6. Add Executive Summary + +Create opening slide with key takeaways: + +**Executive Summary format:** +``` +Key Takeaways: + +1. 💚 [Positive finding with metric] + → [What this means] + +2. 🔴 [Concerning finding with metric] + → [Why it matters] + +3. 🎯 [Opportunity with metric] + → [Recommended action] + +Bottom Line: [One-sentence conclusion] +``` + +### 7. Include Recommendations + +Final slides should be actionable: + +**Recommendations format:** +``` +Recommended Actions: + +High Priority: +1. [Action 1] — [Expected impact] +2. [Action 2] — [Expected impact] + +Medium Priority: +3. 
[Action 3] — [Expected impact] + +Timeline: [When to implement] +Owner: [Who should drive this] +Success metrics: [How to measure] +``` + +### 8. Create Appendix + +Include supporting details: +- Detailed data tables +- Methodology notes +- Additional charts not in main deck +- Definitions and glossary +- FAQ section + +### 9. Deliver Presentation + +**Format options:** +- **Google Slides**: Collaborative, easy to share +- **PowerPoint**: Professional, works offline +- **PDF**: Read-only, preserves formatting +- **Markdown**: Quick internal presentations +- **Hex Notebook**: Interactive with live data + +**Distribution:** +- Email to stakeholders before meeting +- Upload to shared drive or Confluence +- Post in Slack channel +- Add to Linear issue as attachment + +## Presentation Templates + +### Executive QBR (10 slides) +1. Title +2. Executive Summary +3. Business Performance (usage, revenue, engagement) +4. User Growth & Retention +5. Feature Adoption +6. Customer Health (for enterprise) +7. Challenges & Risks +8. Upcoming Initiatives +9. Recommendations +10. Q&A + +### Feature Analysis (8 slides) +1. Title +2. Feature Overview (what, when launched) +3. Adoption Metrics (usage, DAU, penetration) +4. User Feedback (qualitative + quantitative) +5. Performance vs Goals +6. Segmentation Analysis (by tier, geography) +7. Key Learnings +8. Next Steps + +### Experiment Results (6 slides) +1. Title +2. Experiment Design (hypothesis, variants, metrics) +3. Results Overview (winner, key metrics) +4. Statistical Significance +5. User Segment Analysis +6. 
Recommendation (ship, iterate, or kill) + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/metric-standards.md` | Metric definitions for slides | +| `shared/bq-schema.md` | Data sources and methodology | + +## Rules + +### Slide Design + +- **DO** use one message per slide with clear, actionable title +- **DO** lead with visuals (chart or graph) not tables +- **DO** keep text minimal (3-5 bullets max) +- **DO** annotate charts with callouts for key insights +- **DO** use consistent colors and styles throughout +- **DO** include executive summary upfront +- **DO NOT** use paragraphs — bullet points only +- **DO NOT** overcrowd slides — white space is good +- **DO NOT** use more than 3 colors per chart + +### Content + +- **DO** lead with insights, not just data ("DAU up 25%" not "Here's DAU data") +- **DO** provide context (vs baseline, vs goal, vs prior period) +- **DO** segment data to show which users drove changes +- **DO** include "so what?" for each finding (why it matters) +- **DO** end with actionable recommendations +- **DO** include appendix for detailed data +- **DO NOT** show raw query results — format for readability +- **DO NOT** assume audience knows acronyms — define them + +### Storytelling + +- **DO** structure as narrative: context → findings → recommendations +- **DO** highlight wins (💚) and concerns (🔴) clearly +- **DO** show trends and patterns, not just snapshots +- **DO** connect findings to business goals +- **DO** propose specific next steps with owners and timelines +- **DO** frame recommendations as opportunities, not just problems + +### Formatting + +- **DO** use title case for slide headers +- **DO** include slide numbers for reference +- **DO** add source notes for data credibility +- **DO** use emojis sparingly for visual emphasis (💚🔴🎯) +- **DO** preview slides to ensure charts are readable +- **DO NOT** use font smaller than 18pt +- **DO NOT** rely on animations — slides should work as PDF + +## Common 
Presentation Types
+
+### For Leadership:
+- Focus on business impact (revenue, growth, efficiency)
+- High-level trends and patterns
+- Strategic recommendations
+- Minimal technical detail
+
+### For Product Team:
+- Feature-level insights
+- User segmentation analysis
+- Adoption and engagement metrics
+- Tactical next steps
+
+### For Customers (QBR):
+- Account-specific performance
+- Health scores and trends
+- Value delivered (ROI)
+- Roadmap alignment
+
+### For Engineering:
+- Technical metrics (latency, errors, performance)
+- Data quality issues
+- Infrastructure recommendations
+- Implementation details
+
+## Output Format
+
+Provide presentation in requested format:
+
+**Google Slides:**
+```
+# [Slide 1 Title]
+[Chart description or placeholder]
+• Bullet point 1
+• Bullet point 2
+
+---
+
+# [Slide 2 Title]
+...
+```
+
+**Markdown:**
+```markdown
+---
+# Slide 1: Title
+---
+
+# Slide 2: Executive Summary
+
+Key Takeaways:
+1. Finding 1
+2. Finding 2
+3. Finding 3
+
+---
+```
diff --git a/agents/release/SKILL.md b/agents/release/SKILL.md
new file mode 100644
index 0000000..dc938e1
--- /dev/null
+++ b/agents/release/SKILL.md
@@ -0,0 +1,327 @@
+---
+name: release
+description: Tracks and analyzes feature releases and product launches. Monitors adoption, performance, and impact of new releases.
+tags: [release, launch, feature, adoption]
+---
+
+# Release Agent
+
+## When to use
+
+- "Track [feature] release"
+- "Analyze [feature] launch performance"
+- "How is [new feature] performing?"
+- "Monitor [feature] adoption" +- "Create release report for [feature]" +- "Compare release performance" + +## What it tracks + +- **Adoption metrics**: DAU, WAU, penetration rate for new feature +- **Usage patterns**: How often users engage with feature +- **User segments**: Which tiers/cohorts adopt fastest +- **Performance vs goals**: Actual vs projected adoption +- **Cohort analysis**: New users vs existing users adoption +- **Time to adoption**: How quickly users discover and use feature +- **Retention impact**: Does feature improve retention? +- **Feedback signals**: Support tickets, feedback, bugs + +## Steps + +### 1. Gather Release Information + +Ask the user: +- **Feature name**: What was released? +- **Release date**: When did it launch? +- **Target audience**: All users or specific segments (Pro, Enterprise)? +- **Release goals**: What were adoption/usage targets? +- **Event tracking**: Which events track the feature? +- **Comparison**: Compare to other releases or baseline? + +### 2. Read Shared Knowledge + +Before analysis: +- **`shared/event-registry.yaml`** — Find feature events +- **`shared/bq-schema.md`** — Data sources for analysis +- **`shared/metric-standards.md`** — Adoption and retention metrics + +### 3. Define Success Metrics + +**Standard metrics for releases:** + +**Adoption:** +- **Feature DAU**: Daily users of the feature +- **Penetration rate**: Feature DAU / Total DAU +- **Adoption curve**: Cumulative users over time +- **Time to first use**: Days from signup to first feature use + +**Engagement:** +- **Usage frequency**: Actions per user per day/week +- **Feature sessions**: Sessions including feature use +- **Depth of use**: Advanced features used vs basic + +**Impact:** +- **Retention lift**: Retention for users who use feature vs don't +- **Engagement lift**: Overall app engagement for feature users +- **Revenue impact**: Conversion or ARPU for feature users + +### 4. 
Generate SQL for Each Metric
+
+**Adoption tracking:**
+```sql
+-- Daily adoption since launch
+WITH total_dau AS (
+  SELECT
+    DATE(action_ts) AS date,
+    COUNT(DISTINCT lt_id) AS dau
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE DATE(action_ts) >= @release_date
+  GROUP BY date
+),
+feature_dau AS (
+  SELECT
+    DATE(action_ts) AS date,
+    COUNT(DISTINCT lt_id) AS feature_dau
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE action_name IN ('feature_event_1', 'feature_event_2')
+    AND DATE(action_ts) >= @release_date
+  GROUP BY date
+)
+SELECT
+  t.date,
+  COALESCE(f.feature_dau, 0) AS feature_dau,
+  SAFE_DIVIDE(f.feature_dau, t.dau) * 100 AS penetration_rate
+FROM total_dau t
+LEFT JOIN feature_dau f USING (date)
+ORDER BY t.date
+```
+
+**Cohort adoption:**
+```sql
+-- Adoption by user cohort
+WITH cohorts AS (
+  SELECT
+    lt_id,
+    DATE(MIN(first_active_ts)) AS cohort_date
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_users`
+  GROUP BY lt_id
+),
+feature_users AS (
+  SELECT
+    lt_id,
+    MIN(DATE(action_ts)) AS first_use_date
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE action_name IN ('feature_event_1', 'feature_event_2')
+  GROUP BY lt_id
+)
+SELECT
+  CASE
+    WHEN c.cohort_date < @release_date THEN 'Existing Users'
+    ELSE 'New Users'
+  END AS cohort_type,
+  COUNT(DISTINCT f.lt_id) AS feature_users,
+  COUNT(DISTINCT c.lt_id) AS total_users,
+  SAFE_DIVIDE(COUNT(DISTINCT f.lt_id), COUNT(DISTINCT c.lt_id)) * 100 AS adoption_rate
+FROM cohorts c
+LEFT JOIN feature_users f ON c.lt_id = f.lt_id
+GROUP BY cohort_type
+```
+
+**Retention impact:**
+```sql
+-- D7 retention: feature users vs non-users
+WITH users_D0 AS (
+  SELECT DISTINCT lt_id
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE DATE(action_ts) = @cohort_date
+),
+feature_users AS (
+  SELECT DISTINCT lt_id
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE action_name IN ('feature_event_1', 'feature_event_2')
+    AND DATE(action_ts) = @cohort_date
+),
+returned_D7 AS (
+  SELECT DISTINCT lt_id
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE DATE(action_ts) = DATE_ADD(@cohort_date, INTERVAL 7 DAY)
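+  -- NOTE: the filter above counts an exact-day return (active on day 7
+  -- precisely). For a windowed D7 ("returned at any point within 7 days"),
+  -- one option is:
+  --   DATE(action_ts) BETWEEN DATE_ADD(@cohort_date, INTERVAL 1 DAY)
+  --                       AND DATE_ADD(@cohort_date, INTERVAL 7 DAY)
+  -- Either convention works; keep it consistent with shared/metric-standards.md.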
+) +SELECT + CASE WHEN f.lt_id IS NOT NULL THEN 'Feature Users' ELSE 'Non-Feature Users' END AS user_type, + COUNT(DISTINCT u.lt_id) AS cohort_size, + COUNT(DISTINCT r.lt_id) AS retained, + SAFE_DIVIDE(COUNT(DISTINCT r.lt_id), COUNT(DISTINCT u.lt_id)) * 100 AS d7_retention +FROM users_D0 u +LEFT JOIN feature_users f ON u.lt_id = f.lt_id +LEFT JOIN returned_D7 r ON u.lt_id = r.lt_id +GROUP BY user_type +``` + +### 5. Analyze Results + +**Compare vs goals:** +- Is adoption on track? (actual vs projected) +- Which segments adopt fastest? (Pro vs Free, new vs existing) +- Is usage frequency meeting expectations? +- Is feature sticky? (repeated usage) + +**Identify issues:** +- Low adoption? → Discoverability or onboarding problem +- High bounce? → UX or value proposition issue +- Segment gaps? → Pricing, access, or relevance issue +- No retention lift? → Feature not addressing core need + +**Look for patterns:** +- Adoption curve shape (fast ramp vs slow burn) +- Day-of-week patterns (business days vs weekends) +- User journey (what comes before/after feature use) +- Correlation with other features + +### 6. 
Create Release Report + +**Structure:** +``` +# Release Report: [Feature Name] + +**Release Date**: [MM/DD/YYYY] +**Days Since Launch**: [X] +**Status**: 🟢 On Track / 🟡 Monitor / 🔴 Below Target + +## Executive Summary +- [Key finding 1] +- [Key finding 2] +- [Key finding 3] + +## Adoption Metrics +| Metric | Current | Goal | Status | +|--------|---------|------|--------| +| Feature DAU | X | Y | 🟢/🟡/🔴 | +| Penetration Rate | X% | Y% | 🟢/🟡/🔴 | +| Total Adopters | X | Y | 🟢/🟡/🔴 | + +## Adoption Curve +[Line chart showing daily feature DAU since launch] + +Key Insights: +- Adoption ramping [faster/slower] than typical releases +- [X]% of target audience has tried feature +- Average time to first use: [Y] days + +## User Segmentation +| Segment | Adoption Rate | vs Overall | +|---------|---------------|------------| +| Pro | X% | +Y% | +| Standard | X% | +Y% | +| Free | X% | -Y% | + +**Insight**: [Which segments lead/lag and why] + +## Engagement Patterns +- **Usage frequency**: [X] times per user per week +- **Session depth**: [Y]% of sessions include feature +- **Advanced usage**: [Z]% using advanced capabilities + +## Impact Assessment +- **Retention lift**: Feature users have [X]% higher D7 retention +- **Engagement lift**: Feature users [Y]% more active overall +- **Revenue impact**: [If applicable] + +## Feedback & Issues +- **Support tickets**: [X] tickets related to feature +- **Top issues**: [Issue 1], [Issue 2], [Issue 3] +- **User feedback**: [Qualitative summary] + +## Comparison to Other Releases +| Release | D7 Adoption | D30 Penetration | +|---------|-------------|-----------------| +| This Release | X% | Y% | +| [Comparable 1] | X% | Y% | +| [Comparable 2] | X% | Y% | + +## Recommendations + +🎯 **Immediate Actions**: +1. [Action 1] — [Why] +2. [Action 2] — [Why] + +📊 **Monitor Closely**: +- [Metric 1]: [Why] +- [Metric 2]: [Why] + +🔄 **Next Review**: [Date] +``` + +### 7. 
Set Up Ongoing Monitoring + +For continued tracking: +- **Daily**: Feature DAU, penetration rate +- **Weekly**: Cohort adoption, usage frequency, retention impact +- **Monthly**: Long-term trends, cumulative adopters + +**Alert conditions:** +- Adoption drops below target threshold +- Critical bugs affecting usage +- Negative feedback spike +- Competitor launches similar feature + +### 8. Deliver Report + +- **Slack**: Post summary to #product-releases or #[feature]-launch +- **Email**: Send full report to stakeholders +- **Dashboard**: Create live Hex dashboard for ongoing tracking +- **Presentation**: Format as slides for leadership review + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/event-registry.yaml` | Finding feature events | +| `shared/bq-schema.md` | User cohorts and segmentation | +| `shared/metric-standards.md` | Adoption and retention metrics | + +## Rules + +### Tracking Best Practices + +- **DO** start tracking from exact release date (not approximations) +- **DO** define success metrics BEFORE analyzing (avoid cherry-picking) +- **DO** compare to similar releases for context +- **DO** segment by user type (new vs existing, tier, cohort) +- **DO** track both adoption (did they try it) and engagement (do they use it repeatedly) +- **DO** measure impact on core metrics (retention, engagement, revenue) + +### Analysis + +- **DO** use adoption curves to visualize growth over time +- **DO** calculate penetration rate (feature DAU / total DAU) +- **DO** compare feature users vs non-users for impact assessment +- **DO** identify which segments lead adoption (early indicators) +- **DO** track time-to-adoption (how quickly users discover feature) +- **DO** look for drop-off points in feature funnel +- **DO NOT** declare success/failure too early (wait at least 2-4 weeks) +- **DO NOT** ignore segment differences (overall may mask problems) + +### Reporting + +- **DO** lead with status indicator (🟢/🟡/🔴) for quick assessment +- 
**DO** compare actual vs goal for all key metrics +- **DO** include adoption curve visualization +- **DO** show user segmentation breakdown +- **DO** provide actionable recommendations based on findings +- **DO** set next review date for continued monitoring +- **DO NOT** just report numbers — provide interpretation +- **DO NOT** hide bad news — flag issues early for faster response + +## Common Release Patterns + +### Fast Ramp (Good): +- High early adoption +- Strong word-of-mouth +- Clear value proposition +- Good onboarding + +### Slow Burn (Monitor): +- Gradual adoption +- Niche use case +- Requires learning +- May need marketing push + +### Spike & Drop (Concern): +- High initial trial +- Low repeat usage +- UX or value issue +- Need to investigate drop-off + +### Segment Divergence (Investigate): +- Some segments adopt, others don't +- Pricing, access, or relevance issue +- May need targeted interventions diff --git a/agents/user-research/SKILL.md b/agents/user-research/SKILL.md new file mode 100644 index 0000000..a3bb26d --- /dev/null +++ b/agents/user-research/SKILL.md @@ -0,0 +1,396 @@ +--- +name: user-research +description: Conducts quantitative user research using behavioral data. Identifies user segments, personas, pain points, and opportunities. +tags: [research, users, personas, segmentation] +--- + +# User Research Agent + +## When to use + +- "Who are our users?" +- "Create user personas" +- "Segment users by behavior" +- "Find power users" +- "Identify churned user patterns" +- "What do different user types need?" 
+- "User journey analysis" + +## What it provides + +- **User segmentation**: Behavioral clusters and personas +- **User journeys**: Common paths through the product +- **Pain points**: Where users struggle or drop off +- **Power user patterns**: What high-value users do differently +- **Churn signals**: Behaviors that predict churn +- **Feature affinity**: Which users use which features +- **Opportunity identification**: Unmet needs or underserved segments + +## Steps + +### 1. Define Research Question + +Ask the user: +- **Research goal**: Understand segments, find pain points, identify opportunities? +- **User scope**: All users, specific tier, cohort, or behavior? +- **Time frame**: Recent behavior or historical patterns? +- **Output**: Personas, segments, journey maps, recommendations? + +### 2. Read Shared Knowledge + +Before analysis: +- **`shared/bq-schema.md`** — User tables and segmentation queries +- **`shared/metric-standards.md`** — User metrics and segmentation patterns +- **`shared/event-registry.yaml`** — Feature events for behavior analysis + +### 3. Choose Research Approach + +**Segmentation Analysis:** +- Identify distinct user groups by behavior +- Cluster users by usage patterns +- Compare segments on key metrics + +**Journey Analysis:** +- Map common user flows +- Identify entry points and drop-offs +- Find successful vs unsuccessful paths + +**Cohort Analysis:** +- Compare user cohorts over time +- Track cohort retention and engagement +- Identify cohort-specific patterns + +**Power User Analysis:** +- Define power user criteria +- Analyze what they do differently +- Extract lessons for other users + +**Churn Analysis:** +- Identify users at risk of churning +- Find behavioral signals of churn +- Compare churned vs retained users + +### 4. 
Generate SQL for Analysis + +**User segmentation by behavior:** +```sql +WITH user_behavior AS ( + SELECT + lt_id, + COUNT(DISTINCT DATE(action_ts)) AS days_active_30d, + COUNT(CASE WHEN action_name = 'generate_image' THEN 1 END) AS image_gens, + COUNT(CASE WHEN action_name = 'generate_video' AND action_category = 'generations' THEN 1 END) AS video_gens, + COUNT(CASE WHEN action_name LIKE 'download%' THEN 1 END) AS downloads + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE DATE(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + GROUP BY lt_id +) +SELECT + CASE + WHEN days_active_30d >= 20 AND video_gens > 50 THEN 'Power User - Video' + WHEN days_active_30d >= 20 AND image_gens > 100 THEN 'Power User - Image' + WHEN days_active_30d >= 10 THEN 'Regular User' + WHEN days_active_30d >= 3 THEN 'Casual User' + ELSE 'Inactive User' + END AS user_segment, + COUNT(*) AS user_count, + AVG(image_gens) AS avg_image_gens, + AVG(video_gens) AS avg_video_gens, + AVG(downloads) AS avg_downloads +FROM user_behavior +GROUP BY user_segment +ORDER BY user_count DESC +``` + +**User journey analysis:** +```sql +-- Common action sequences +WITH user_actions AS ( + SELECT + lt_id, + action_name, + action_ts, + ROW_NUMBER() OVER (PARTITION BY lt_id ORDER BY action_ts) AS action_seq + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE DATE(action_ts) = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) +) +SELECT + a1.action_name AS first_action, + a2.action_name AS second_action, + a3.action_name AS third_action, + COUNT(DISTINCT a1.lt_id) AS user_count +FROM user_actions a1 +LEFT JOIN user_actions a2 ON a1.lt_id = a2.lt_id AND a2.action_seq = 2 +LEFT JOIN user_actions a3 ON a1.lt_id = a3.lt_id AND a3.action_seq = 3 +WHERE a1.action_seq = 1 +GROUP BY first_action, second_action, third_action +ORDER BY user_count DESC +LIMIT 20 +``` + +**Power user identification:** +```sql +-- Define power users and analyze their behavior +WITH power_users AS ( + SELECT 
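+    -- NOTE: the activity thresholds in the HAVING clause below (20 active
+    -- days, 500 actions in 30 days) are illustrative placeholders; calibrate
+    -- them against your own distribution (e.g., the top decile of each
+    -- measure) before labeling anyone a "power user".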
+ lt_id, + COUNT(DISTINCT DATE(action_ts)) AS days_active, + COUNT(*) AS total_actions + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE DATE(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + GROUP BY lt_id + HAVING days_active >= 20 AND total_actions >= 500 -- Top 5-10% criteria +), +feature_usage AS ( + SELECT + a.lt_id, + SUM(CASE WHEN action_name = 'generate_video' AND action_category = 'generations' THEN 1 ELSE 0 END) AS video_gens, + SUM(CASE WHEN action_name LIKE '%timeline%' THEN 1 ELSE 0 END) AS timeline_actions, + SUM(CASE WHEN action_name LIKE '%storyboard%' THEN 1 ELSE 0 END) AS storyboard_actions + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` a + INNER JOIN power_users p ON a.lt_id = p.lt_id + WHERE DATE(a.action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + GROUP BY a.lt_id +) +SELECT + COUNTIF(video_gens > 0) / COUNT(*) * 100 AS pct_use_video, + COUNTIF(timeline_actions > 0) / COUNT(*) * 100 AS pct_use_timeline, + COUNTIF(storyboard_actions > 0) / COUNT(*) * 100 AS pct_use_storyboard, + AVG(video_gens) AS avg_video_gens +FROM feature_usage +``` + +**Churn prediction:** +```sql +-- Identify users showing churn signals +WITH user_activity AS ( + SELECT + lt_id, + MAX(DATE(action_ts)) AS last_active_date, + COUNT(DISTINCT DATE(action_ts)) AS days_active_30d + FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions` + WHERE DATE(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY) + GROUP BY lt_id +), +user_tier AS ( + SELECT + lt_id, + griffin_tier_name, + is_paid + FROM `ltx-dwh-prod-processed.web.ltxstudio_users` +) +SELECT + t.griffin_tier_name, + CASE + WHEN a.last_active_date IS NULL THEN 'Churned (30d+ inactive)' + WHEN DATE_DIFF(CURRENT_DATE(), a.last_active_date, DAY) >= 14 THEN 'At Risk (14d+ inactive)' + WHEN a.days_active_30d <= 2 THEN 'At Risk (Low engagement)' + ELSE 'Healthy' + END AS churn_risk, + COUNT(*) AS user_count +FROM user_tier t +LEFT JOIN user_activity a ON t.lt_id = 
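+  -- NOTE: users with no events in the 30-day window have no row in
+  -- user_activity, so a.last_active_date is NULL after this LEFT JOIN;
+  -- that is what maps them to 'Churned (30d+ inactive)' in the CASE above.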
a.lt_id +WHERE t.is_paid = TRUE -- Focus on paid users +GROUP BY t.griffin_tier_name, churn_risk +ORDER BY t.griffin_tier_name, churn_risk +``` + +### 5. Analyze Findings + +**For segmentation:** +- How many segments emerge? +- What defines each segment? (behavior, features used, engagement) +- Which segments are most valuable? +- Which segments have highest churn? + +**For journeys:** +- What are common entry points? +- Where do users drop off? +- What paths lead to success (retention, conversion)? +- What paths lead to churn? + +**For power users:** +- What do they do differently? +- Which features do they use more? +- What's their typical workflow? +- How quickly did they become power users? + +**For churn:** +- What behaviors predict churn? +- How early can we detect churn risk? +- Which segments churn most? +- What's different about retained users? + +### 6. Create User Personas + +**Persona structure:** +``` +# Persona: [Name] (e.g., "Power Video Creator") + +## Profile +- **Segment size**: X users (Y% of base) +- **Tier**: Primarily [Pro/Standard/etc.] +- **Cohort**: [New users / Established users] + +## Behavior Patterns +- **Activity**: Active [X] days per month +- **Core actions**: [Primary feature usage] +- **Usage frequency**: [Y] sessions per week +- **Favorite features**: [Feature 1], [Feature 2], [Feature 3] + +## Goals & Needs +- [What they're trying to achieve] +- [Pain points they face] +- [What would make them more successful] + +## Engagement Metrics +- **D7 Retention**: X% +- **Avg Session Length**: Y minutes +- **Feature Adoption**: Z% of features used + +## Opportunities +- [How to better serve this segment] +- [Features they'd benefit from] +- [Potential for growth or upsell] +``` + +### 7. Format Research Report + +**Structure:** +``` +# User Research Report: [Topic] + +**Research Period**: [Date range] +**User Sample**: [X users analyzed] +**Research Goal**: [What we wanted to learn] + +## Executive Summary +Key findings: +1. 
[Finding 1] +2. [Finding 2] +3. [Finding 3] + +## User Segments + +We identified [X] distinct user segments: + +### Segment 1: [Name] +- **Size**: [X users (Y%)] +- **Defining behavior**: [Key characteristics] +- **Value**: [Revenue, retention, engagement] +- **Opportunities**: [How to grow or serve better] + +[Repeat for each segment...] + +## User Journeys + +Common paths through the product: + +### Journey 1: [Journey name] +1. [Entry point] → +2. [Action 2] → +3. [Action 3] → +4. [Outcome] + +**Success rate**: X% +**Drop-off points**: [Where users leave] +**Opportunities**: [How to improve] + +## Pain Points + +Based on behavioral data: +1. **[Pain point 1]**: [Evidence from data] +2. **[Pain point 2]**: [Evidence from data] +3. **[Pain point 3]**: [Evidence from data] + +## Power User Insights + +What high-value users do differently: +- [Behavior 1]: Power users do this X% more +- [Behavior 2]: Power users adopt this feature Y% faster +- [Behavior 3]: Power users have Z% higher retention + +**Lessons**: [How to help other users become power users] + +## Churn Analysis + +Users at risk of churning: +- **At-risk users**: [X users showing churn signals] +- **Churn indicators**: [Behaviors that predict churn] +- **Intervention opportunities**: [When/how to retain them] + +## Recommendations + +### High Priority +1. **[Recommendation 1]** — [Target segment]: [Expected impact] +2. **[Recommendation 2]** — [Target segment]: [Expected impact] + +### Medium Priority +3. **[Recommendation 3]** — [Target segment]: [Expected impact] + +### Research Follow-ups +- [Additional research needed] +- [Questions to explore further] + +## Appendix +- Methodology notes +- Detailed segment definitions +- SQL queries used +``` + +### 8. 
Deliver Research + +- **Product team**: Share personas and journey maps +- **Engineering**: Share technical insights (performance, errors) +- **Design**: Share pain points and UX opportunities +- **Marketing**: Share segment profiles for targeting +- **Customer Success**: Share churn signals and interventions + +## Reference Files + +| File | Read when | +|------|-----------| +| `shared/bq-schema.md` | User segmentation queries and tables | +| `shared/metric-standards.md` | User metrics and segmentation patterns | +| `shared/event-registry.yaml` | Feature events for behavior analysis | + +## Rules + +### Research Best Practices + +- **DO** start with clear research questions +- **DO** use representative sample sizes (not just outliers) +- **DO** look at behavior over time (not just snapshots) +- **DO** segment by multiple dimensions (tier, cohort, geography) +- **DO** validate findings across different time periods +- **DO** combine quantitative data with qualitative context + +### Analysis + +- **DO** identify meaningful segments (not just arbitrary splits) +- **DO** compare segments on key metrics (retention, engagement, revenue) +- **DO** look for causal patterns (what leads to what) +- **DO** identify both successful and unsuccessful user journeys +- **DO** find what power users do differently (actionable insights) +- **DO NOT** confuse correlation with causation +- **DO NOT** over-segment (too many segments = not actionable) + +### Personas + +- **DO** base personas on actual behavioral data (not assumptions) +- **DO** make personas specific and actionable +- **DO** include metrics (retention, engagement, revenue) +- **DO** identify persona needs and pain points +- **DO** show how to serve each persona better +- **DO NOT** create too many personas (3-5 is ideal) +- **DO NOT** create personas that are all the same + +### Reporting + +- **DO** lead with executive summary (key findings upfront) +- **DO** visualize segments and journeys (not just tables) +- **DO** 
provide actionable recommendations for each finding +- **DO** prioritize recommendations by impact and effort +- **DO** suggest follow-up research questions +- **DO NOT** just report data — provide interpretation +- **DO NOT** overwhelm with detail — use appendix for deep dives
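+
+## Example: Segment Comparison Query
+
+As a minimal sketch of the "compare segments on key metrics" rule, the query below computes segment size and a simple recent-return rate in one pass. It reuses the table and `lt_id`/`action_ts` columns from the queries above; the day-active thresholds are the same illustrative ones from Step 4, so adjust both to your data:
+
+```sql
+-- Segment size and recent-return rate in one pass (illustrative thresholds)
+WITH activity AS (
+  SELECT
+    lt_id,
+    COUNT(DISTINCT DATE(action_ts)) AS days_active_30d,
+    MAX(DATE(action_ts)) AS last_active_date
+  FROM `ltx-dwh-prod-processed.web.ltxstudio_user_all_actions`
+  WHERE DATE(action_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
+  GROUP BY lt_id
+)
+SELECT
+  CASE
+    WHEN days_active_30d >= 20 THEN 'Power User'
+    WHEN days_active_30d >= 10 THEN 'Regular User'
+    WHEN days_active_30d >= 3 THEN 'Casual User'
+    ELSE 'Inactive User'
+  END AS user_segment,
+  COUNT(*) AS user_count,
+  SAFE_DIVIDE(
+    COUNTIF(DATE_DIFF(CURRENT_DATE(), last_active_date, DAY) <= 7),
+    COUNT(*)
+  ) * 100 AS pct_active_last_7d
+FROM activity
+GROUP BY user_segment
+ORDER BY user_count DESC
+```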