perf(monitor): reduce per-tuple work in hot path #87
Conversation
Say you have a jobs table with 1M rows, 4 named priority ranges, and 2 queues.

Rather than:

- `GROUP BY (CASE ... END) AS priority, queue, COUNT(*)` (1M rows)

this does:

- [outer] `GROUP BY (CASE ... END) AS priority, queue, SUM(count)` (~8 rows)
- [inner] `GROUP BY priority, queue, COUNT(*)` (1M rows)

The latter _looks_ more complex, but it benefits from pulling `priority` directly out of the index without computing the `CASE ... END` 1 million times. (Instead, it performs the more complex `CASE` evaluation on a much, much smaller set!)

In practice, I am seeing this shave ~2-3 seconds off of `delayed:monitor` queries in a production table with ~50M rows; for tables with far fewer rows, the difference is negligible, and of course your mileage may vary!
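As a rough sketch of the shape of the rewrite (the priority thresholds, range names, and table/column names below are illustrative, not the gem's exact generated SQL):

```sql
-- Before: the CASE expression is evaluated once per row (1M times).
SELECT
  CASE
    WHEN priority < 10 THEN 'urgent'
    WHEN priority < 20 THEN 'high'
    WHEN priority < 30 THEN 'normal'
    ELSE 'low'
  END AS priority_group,
  queue,
  COUNT(*) AS count
FROM delayed_jobs
GROUP BY 1, 2;

-- After: the inner aggregate collapses 1M rows to one row per distinct
-- (priority, queue) pair straight off the index; the CASE then runs only
-- over that tiny intermediate set (~8 rows in this example).
SELECT
  CASE
    WHEN priority < 10 THEN 'urgent'
    WHEN priority < 20 THEN 'high'
    WHEN priority < 30 THEN 'normal'
    ELSE 'low'
  END AS priority_group,
  queue,
  SUM(count) AS count
FROM (
  SELECT priority, queue, COUNT(*) AS count
  FROM delayed_jobs
  GROUP BY priority, queue
) AS per_priority
GROUP BY 1, 2;
```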
Force-pushed from b56ffed to 7c9d2ab
-> Sort (cost=...)
     Output: delayed_jobs.priority, delayed_jobs.queue
     Sort Key: delayed_jobs.priority, delayed_jobs.queue
     -> Index Scan using delayed_jobs_priority on public.delayed_jobs (cost=...)
Several of these cases cover the behavior of these new queries against the "legacy" index that shipped prior to 2.0.0.
Interestingly, this was previously a seq scan, but the planner now decides to use the (legacy) `delayed_jobs_priority` index in the inner loop!
Group Key: delayed_jobs.priority, delayed_jobs.queue
-> Sort (cost=...)
     Output: delayed_jobs.priority, delayed_jobs.queue
     Sort Key: delayed_jobs.priority, delayed_jobs.queue
In a real-world context, I do not see this "Sort", because PG chooses a HashAggregate (which does not require pre-sorted data). Here, I disabled hash aggregates within the test suite purely to force determinism.
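For reference, something along these lines forces the deterministic plan (`enable_hashagg` is a standard PostgreSQL planner setting; the exact test-suite mechanics are my assumption):

```sql
-- Disable hash aggregation for this session so the planner must fall back
-- to GroupAggregate, making EXPLAIN output deterministic across runs.
SET enable_hashagg = off;

EXPLAIN (VERBOSE)
SELECT priority, queue, COUNT(*)
FROM delayed_jobs
GROUP BY priority, queue;
```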
That said, looking up at the plan for `failed_count`, it doesn't actually apply a sort even with a GroupAggregate:
-> GroupAggregate (cost=...)
Output: delayed_jobs.priority, delayed_jobs.queue, count(*)
Group Key: delayed_jobs.priority, delayed_jobs.queue
-> Index Only Scan using idx_delayed_jobs_failed on public.delayed_jobs (cost=...)
Output: delayed_jobs.priority, delayed_jobs.queue
This is because `idx_delayed_jobs_failed` arrives pre-sorted by (priority, queue), whereas `idx_delayed_jobs_live` is sorted by (priority, run_at, queue) and therefore requires another sort before the GroupAggregate. If I were to force this GroupAggregate strategy in production, the Sort would be extremely inefficient and spill to disk, so there is something to be said for rethinking the column order in our indexes, even if PG is smart enough to avoid this for larger tables.
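To make the sort-order point concrete, here is an illustrative sketch of the two index shapes being compared (the real definitions in this repo may differ, e.g. in their predicates or included columns):

```sql
-- Pre-sorted exactly as the GROUP BY wants: (priority, queue).
CREATE INDEX idx_delayed_jobs_failed
  ON delayed_jobs (priority, queue)
  WHERE failed_at IS NOT NULL;

-- run_at sits between priority and queue, so grouping by
-- (priority, queue) needs an extra Sort after the index scan.
CREATE INDEX idx_delayed_jobs_live
  ON delayed_jobs (priority, run_at, queue)
  WHERE failed_at IS NULL;
```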
And FWIW, the reason I did not put queue first in the indexes is that it's possible to run variants of these queries that do not filter by queue, and those are, broadly speaking, unable to make use of such an index ordering (some exceptions exist in newer PG versions, but I wasn't observing them in real-world testing). So you would either need a second index, or would need to decide up front what type of index to build, potentially requiring a rebuild if you later change your worker/monitor queue assignments.
TL;DR though: outside the scope of this particular PR, I am strongly considering adjustments to the `idx_delayed_jobs_live` column order alongside a somewhat more opinionated pickup strategy (to avoid the problem of needing to decide up front whether your workers/monitor will support filtering on queue).