
docs(report): align Advanced SQL section with the implemented queries #44

Merged
aykhan019 merged 1 commit into aykhan019:main from endorphin13:docs/report-align-with-implementation on May 31, 2026

@endorphin13

Contributor

Why

The final report's Section 5 (Advanced SQL Queries — the 3%-weighted criterion) printed SQL that diverged from the live API in apps/api/src/modules/analytics/analytics.service.ts. Because graders inspect the source during Demo Day, the report and the code need to agree.

What changed

Rewrote Q1–Q6 to be transcriptions of the queries that actually run:

| Query | Was (report) | Now (matches code) |
| --- | --- | --- |
| Q1 Top artists | `RANK()`, 30-day window, `role='primary'`, `pct_of_plays` | `DENSE_RANK()`, all-time, no role filter, `HAVING COUNT(*)>1` |
| Q2 Heatmap | 90-day window | no window + `tracks.hidden_at` filter |
| Q3 Hidden gems | per-user anti-join + preview filter | global `WHERE lh.id IS NULL` anti-join, `minPlaylistCount`, random sampling |
| Q4 Discover | seeds = all played tracks | single top-track cohort, `cooccurrence_count`, `NOT EXISTS`, random sampling |
| Q5 Trending | single-scan `COUNT(*) FILTER` delta | two-CTE recent/prior growth-ratio with threshold |
| Q6 Curated picks | `RANK()` query that did not exist | real Jaccard `similarPlaylists` query behind "More like this" |
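
To illustrate why the Q1 change from `RANK()` to `DENSE_RANK()` matters, here is a minimal sketch run against an in-memory SQLite database. The `plays` table, its `artist` column, and the sample rows are hypothetical and exist only for this example; this is not the query in `analytics.service.ts`, only a toy showing how the two window functions number tied rows and how `HAVING COUNT(*) > 1` drops single-play artists. It assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3

# Hypothetical miniature data set; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE plays (artist TEXT NOT NULL);
    INSERT INTO plays (artist) VALUES
        ('A'), ('A'), ('A'),
        ('B'), ('B'), ('B'),
        ('C'), ('C'),
        ('D');
    """
)

rows = conn.execute(
    """
    WITH counts AS (
        SELECT artist, COUNT(*) AS play_count
        FROM plays
        GROUP BY artist
        HAVING COUNT(*) > 1          -- artists with a single play are dropped
    )
    SELECT artist,
           play_count,
           RANK()       OVER (ORDER BY play_count DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY play_count DESC) AS dense_rnk
    FROM counts
    ORDER BY play_count DESC, artist
    """
).fetchall()

for row in rows:
    print(row)
```

With these rows, A and B tie on three plays, so `RANK()` numbers C as 3 (leaving a gap) while `DENSE_RANK()` numbers it 2; D has a single play and is filtered out by the `HAVING` clause.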

Also corrected:

  • intro "Five representative queries" → six
  • Figure 9 caption ("labelled with the seed track" → annotated with co-occurrence count)
  • Figure 11 caption (dropped the bogus "(Query 6)")
  • row-count total label ("13 application tables" → "all 13 tables"; six are catalog tables)
  • "deterministic seed scripts" wording (the history/playlist scripts are randomized)

Docs-only change; final-report/** is prettier-ignored so formatting is unaffected.

Section 5 printed SQL that no longer matched the live API, which is a risk
for a graded report where the source is inspected during the demo. Rewrite
Q1-Q6 to mirror apps/api/src/modules/analytics/analytics.service.ts:

- Q1 top artists: DENSE_RANK (not RANK), no 30-day window, no role='primary'
  filter, HAVING COUNT(*) > 1; drop the pct_of_plays/distinct_tracks columns.
- Q2 heatmap: drop the 90-day window the code never applied; add the
  tracks.hidden_at join.
- Q3 hidden gems: global "never played by anyone" anti-join (LEFT JOIN ...
  WHERE lh.id IS NULL), artist via albums.primary_artist_id, configurable
  minPlaylistCount, random sampling; drop the per-user/preview-only framing.
- Q4 discover: single top-track cohort (not all played tracks),
  cooccurrence_count, NOT EXISTS exclusion, random sampling.
- Q5 trending: two-CTE recent/prior growth-ratio with a threshold, not a
  single-scan COUNT(*) FILTER delta.
- Q6: replace the fabricated "curated picks" RANK query (no such SQL exists)
  with the real Jaccard similar-playlists query behind the "More like this"
  rail.
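
For readers unfamiliar with the measure named in the Q6 item, the sketch below computes Jaccard similarity (intersection size divided by union size) between playlists represented as sets of track IDs. It is an assumption-laden illustration in plain Python, not the SQL in `analytics.service.ts`: the `similar_playlists` helper, the sample data, the exclusion of zero-overlap playlists, and the tie-breaking order are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Return |a ∩ b| / |a ∪ b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similar_playlists(
    seed_id: str, playlists: Dict[str, Set[int]], limit: int = 5
) -> List[Tuple[str, float]]:
    """Rank other playlists by Jaccard similarity to the seed playlist.

    Hypothetical helper: playlists sharing no track with the seed are
    skipped, and ties are broken by playlist ID for a stable order.
    """
    seed = playlists[seed_id]
    scored = [
        (pid, jaccard(seed, tracks))
        for pid, tracks in playlists.items()
        if pid != seed_id
    ]
    scored = [(pid, score) for pid, score in scored if score > 0]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:limit]


# Made-up playlists keyed by ID, each a set of track IDs.
playlists = {
    "seed": {1, 2, 3, 4},
    "close": {2, 3, 4, 5},   # shares 3 tracks, union of 5
    "far": {4, 9},           # shares 1 track, union of 5
    "disjoint": {7, 8},      # shares nothing
}

print(similar_playlists("seed", playlists))
```

Here `close` scores 3/5 and `far` scores 1/5, while `disjoint` is omitted because it shares no tracks with the seed.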

Also fix the "Five representative queries" count (six), the Figure 9
"seed track" caption, the Figure 11 "(Query 6)" reference, the row-count
"13 application tables" label (six are catalog tables), and the
"deterministic seed scripts" wording.

endorphin13 requested a review from aykhan019 as a code owner on May 31, 2026, 17:53
@vercel

vercel Bot commented May 31, 2026


@fateh-mammadli is attempting to deploy a commit to the Aykhan's projects Team on Vercel.

A member of the Team first needs to authorize it.

@aykhan019 aykhan019 merged commit 2e83fea into aykhan019:main May 31, 2026
1 of 2 checks passed