
groupcache: populate hotCache based on reported peer QPS #179

Open
mertovun wants to merge 2 commits into golang:master from mertovun:hotcache-qps


@mertovun


Summary

getFromPeer currently mirrors a remotely fetched value into the hotCache on a fixed 10% of fetches, chosen at random. This was a placeholder (the code carried a TODO(bradfitz): use res.MinuteQps or something smart), and it is a poor hotness signal:

  • it pollutes the hotCache with one-off keys, spending a budget that is deliberately capped at ~1/8 of the cache; and
  • it takes ~10 round trips on average before a genuinely hot key is mirrored, which is exactly the owner-side load the hotCache exists to shed.
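The "~10 round trips" figure follows from treating each fetch as an independent 10% trial, so the number of fetches until the first mirror is geometrically distributed. A minimal sketch of that arithmetic (the function names are illustrative and not part of groupcache):

```go
package main

import (
	"fmt"
	"math"
)

// expectedFetches returns the mean number of independent fetches needed
// before the first one is mirrored, when each fetch is mirrored with
// probability p (the mean of a geometric distribution).
func expectedFetches(p float64) float64 {
	return 1 / p
}

// probNotMirrored returns the chance that a key is still absent from the
// hotCache after n fetches, each mirrored independently with probability p.
func probNotMirrored(p float64, n int) float64 {
	return math.Pow(1-p, float64(n))
}

func main() {
	const p = 0.10 // the fixed mirroring probability described above
	fmt.Printf("mean fetches until mirrored: %.0f\n", expectedFetches(p))
	fmt.Printf("still unmirrored after 10 fetches: %.2f\n", probNotMirrored(p, 10))
}
```

Under this model a hot key still has roughly a 35% chance of being unmirrored after ten fetches, so the owner keeps absorbing that traffic.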

This change makes the value's owner the authority on hotness, because the owner is the only process that can observe global demand for a key it owns.

How

  • The owner tracks a per-key request rate using an exponentially weighted moving average (keyStats, half-life of one minute) and reports it in the GetResponse.MinuteQps field, which the wire format already carried but nobody populated.
  • A non-owning peer mirrors a key into its hotCache only once the owner reports a rate at or above hotQPS.
  • The per-key rate state lives inside the mainCache entry, so it is bounded by cache residency and cleaned up by ordinary eviction; there is no separate map to grow or evict.
  • lru.Peek (first commit) lets the rate be read/updated without disturbing LRU recency.
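To make the rate tracking concrete, here is one way an exponentially decayed per-key rate with a one-minute half-life could look. This is a sketch under assumptions, not the code in this PR: the fields and methods of keyStats shown here, and the simulate helper, are hypothetical. It keeps a decayed request count and scales it by ln 2 / half-life, which converges to the true rate under steady load.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

const halfLife = time.Minute

// keyStats is a hypothetical per-key tracker; the real type may differ.
type keyStats struct {
	count float64   // exponentially decayed request count
	last  time.Time // time of the most recent update
}

// decayTo ages the count forward to now, halving it every halfLife.
func (s *keyStats) decayTo(now time.Time) {
	if s.last.IsZero() {
		s.last = now
		return
	}
	dt := now.Sub(s.last).Seconds()
	if dt <= 0 {
		return // ignore a clock that did not advance
	}
	s.count *= math.Exp2(-dt / halfLife.Seconds())
	s.last = now
}

// record notes one request at time now.
func (s *keyStats) record(now time.Time) {
	s.decayTo(now)
	s.count++
}

// qps ages the state to now and returns the estimated requests per second.
func (s *keyStats) qps(now time.Time) float64 {
	s.decayTo(now)
	return s.count * math.Ln2 / halfLife.Seconds()
}

// simulate feeds evenly spaced requests on a synthetic timeline and returns
// the estimate at the end of the load and one half-life after it stops.
func simulate(perSecond, seconds int) (atEnd, oneHalfLifeLater float64) {
	start := time.Unix(0, 0)
	gap := time.Second / time.Duration(perSecond)
	var s keyStats
	for i := 0; i < perSecond*seconds; i++ {
		s.record(start.Add(time.Duration(i) * gap))
	}
	end := start.Add(time.Duration(seconds) * time.Second)
	return s.qps(end), s.qps(end.Add(halfLife))
}

func main() {
	atEnd, later := simulate(10, 600)
	fmt.Printf("after ten minutes at 10 QPS: %.1f\n", atEnd)
	fmt.Printf("one half-life after load stops: %.1f\n", later)
}
```

With ten minutes of steady 10 QPS the estimate settles close to 10, and it falls to about half of that one minute after the requests stop.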

The test-only *rand.Rand hook (added in #175) is replaced with an injectable clock, which TestPeers and the new stats tests use to drive QPS deterministically. TestPeers now exercises the real path: cold peers cause no mirroring, and peers reporting hotQPS cause peer-owned keys to be mirrored and subsequently served from the hotCache with zero peer hits.
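As an illustration of why an injectable clock makes rate tests deterministic, the sketch below swaps time.Now for a fake that only moves when the test says so. The fakeClock and hitCounter types and the demo helper are hypothetical stand-ins, not the hook added in this PR, and the counter is a plain sliding window rather than the EWMA.

```go
package main

import (
	"fmt"
	"time"
)

// fakeClock is a manually advanced time source for tests.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

// hitCounter reads time only through the injected now function,
// which would be time.Now in production.
type hitCounter struct {
	now    func() time.Time
	stamps []time.Time
}

// hit records one request at the current (injected) time.
func (c *hitCounter) hit() { c.stamps = append(c.stamps, c.now()) }

// lastMinute returns how many hits fall within the past minute.
func (c *hitCounter) lastMinute() int {
	cutoff := c.now().Add(-time.Minute)
	n := 0
	for _, t := range c.stamps {
		if t.After(cutoff) {
			n++
		}
	}
	return n
}

// demo records five hits one second apart, then lets two minutes pass.
func demo() (recent, afterIdle int) {
	clk := &fakeClock{t: time.Unix(0, 0)}
	c := &hitCounter{now: clk.Now}
	for i := 0; i < 5; i++ {
		c.hit()
		clk.Advance(time.Second)
	}
	recent = c.lastMinute()
	clk.Advance(2 * time.Minute)
	afterIdle = c.lastMinute()
	return recent, afterIdle
}

func main() {
	recent, afterIdle := demo()
	fmt.Println(recent, afterIdle) // prints "5 0"
}
```

Because no real time elapses, the same sequence of hits and advances always yields the same rate, with no sleeps and no flakiness.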

Compatibility

No wire-format change; MinuteQps already exists in GetResponse. A peer running older code reports 0 and simply never triggers mirroring, so the change is safe in a mixed-version fleet.
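The mixed-version argument can be sketched as a threshold check on the reported rate. The getResponse type below is a local stand-in that models only the optional MinuteQps field, shouldMirror is a hypothetical helper, and the hotQPS value is illustrative because the PR description does not state it.

```go
package main

import "fmt"

// getResponse is a local stand-in for the GetResponse message; only the
// field relevant to this discussion is modelled.
type getResponse struct {
	MinuteQps *float64 // nil when the owner never populates the field
}

// minuteQps returns the reported rate, or 0 when the field is unset.
func (r *getResponse) minuteQps() float64 {
	if r == nil || r.MinuteQps == nil {
		return 0
	}
	return *r.MinuteQps
}

// hotQPS is an illustrative threshold; it must be positive so that an
// unset field can never qualify as hot.
const hotQPS = 10.0

// shouldMirror reports whether a non-owning peer would copy the value
// into its hotCache under the rule described above.
func shouldMirror(res *getResponse) bool {
	return res.minuteQps() >= hotQPS
}

func main() {
	hot, cold := 25.0, 1.5
	fmt.Println(shouldMirror(&getResponse{MinuteQps: &hot}))  // true
	fmt.Println(shouldMirror(&getResponse{MinuteQps: &cold})) // false
	fmt.Println(shouldMirror(&getResponse{}))                 // false: older owner, field unset
}
```

An owner on older code leaves the field unset, the rate reads as 0, and the peer simply keeps fetching remotely as it would for any cold key.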

Tests

go build ./..., go vet ./..., and go test -race ./... all pass.


I'm happy to sign the Google CLA if the CLA bot flags this.

🤖 Generated with Claude Code

@google-cla

google-cla Bot commented Jun 16, 2026


Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

mertovun added 2 commits June 16, 2026 17:58
Peek looks up a key's value without updating its recency, unlike Get
which moves the entry to the front of the LRU list. This is useful for
inspecting an entry (for example, to read per-key statistics) without
affecting eviction order.
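
The difference between Peek and Get can be shown on a small list-backed LRU. This is a simplified sketch with string keys and int values, not the lru package's actual code; the cache type and the oldest helper are assumptions made for the illustration.

```go
package main

import (
	"container/list"
	"fmt"
)

type entry struct {
	key   string
	value int
}

// cache is a minimal LRU: the front of ll is most recent, the back is
// the next eviction candidate.
type cache struct {
	ll    *list.List
	items map[string]*list.Element
}

func newCache() *cache {
	return &cache{ll: list.New(), items: map[string]*list.Element{}}
}

// Add inserts or updates a key and marks it most recently used.
func (c *cache) Add(key string, value int) {
	if e, ok := c.items[key]; ok {
		c.ll.MoveToFront(e)
		e.Value.(*entry).value = value
		return
	}
	c.items[key] = c.ll.PushFront(&entry{key, value})
}

// Get returns the value and promotes the entry to most recently used.
func (c *cache) Get(key string) (int, bool) {
	if e, ok := c.items[key]; ok {
		c.ll.MoveToFront(e)
		return e.Value.(*entry).value, true
	}
	return 0, false
}

// Peek returns the value without touching the recency order.
func (c *cache) Peek(key string) (int, bool) {
	if e, ok := c.items[key]; ok {
		return e.Value.(*entry).value, true
	}
	return 0, false
}

// oldest returns the key that would be evicted next, or "" if empty.
func (c *cache) oldest() string {
	if e := c.ll.Back(); e != nil {
		return e.Value.(*entry).key
	}
	return ""
}

func main() {
	c := newCache()
	c.Add("a", 1)
	c.Add("b", 2)
	c.Peek("a")
	fmt.Println(c.oldest()) // prints "a": Peek left recency untouched
	c.Get("a")
	fmt.Println(c.oldest()) // prints "b": Get promoted "a"
}
```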

Previously, getFromPeer mirrored a remotely fetched value into the
hotCache on a fixed 10% of fetches, chosen at random. This was a
placeholder (noted in a TODO) and is a poor signal: it pollutes the
hotCache with one-off keys while taking ~10 round trips on average to
mirror a genuinely hot key, defeating the purpose of the hotCache.

Use the value's owner as the authority on hotness instead. The owner
tracks a per-key request rate using an exponentially weighted moving
average and reports it in the GetResponse.MinuteQps field (which the
wire format already carried but nobody populated). A non-owning peer
mirrors a key into its hotCache only once the owner reports a rate at
or above hotQPS.

The per-key rate state lives inside the mainCache entry, so it is
bounded by cache residency and cleaned up by ordinary eviction, with
no separate accounting to evict.

This replaces the test-only *rand.Rand hook (added in golang#175) with an
injectable clock, which TestPeers and the new stats tests use to drive
QPS deterministically.
@mertovun

Author

I signed it!
