Replace caching prethread with simpler request context copy-over by cdrini · Pull Request #12414 · internetarchive/openlibrary

cdrini · 2026-04-20T07:56:04Z

- avoids the complicated closure capture
- avoids cache leaking when new request context variables are introduced
- fixes cache leak with `is_bot` and `is_recognized_bot`
- fixes various other cache leaks where the caching prethread method was used without the corresponding variables being added to the cache key

Closes #

Technical

Testing

Screenshot

Stakeholders


Copilot

Pull request overview

Refactors memcache async-refresh behavior to avoid the previous caching_prethread closure approach by carrying request context explicitly into cache keys and (intended) background refresh threads, preventing cache leakage across language/bot/cookie contexts.

Changes:

- Add `RequestContextVars.copy_for_cache_thread()` and allow `set_context_from_legacy_web_py()` to accept an override context.
- Extend `memcache_memoize` with `cache_request_context` support; update key computation and async refresh plumbing accordingly.
- Update multiple call sites/tests to the new caching approach and new cache helper APIs (`memcache_delete_by_args`, new `memcache_set`/`memcache_get` signatures).
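
To make the leak concrete, here is a minimal sketch of folding request context into a cache key. It is not the code in `openlibrary/core/cache.py`: the `RequestContext` dataclass, the `Memoize` class, this `compute_key` signature, and the `greeting` function are all assumptions invented for illustration, and a plain dict stands in for memcache.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical stand-in for the request context variables."""
    lang: str = "en"
    is_bot: bool = False


def compute_key(prefix, args, req_context=None):
    """Build a cache key; fold the request context in only when one is given."""
    parts = {"args": list(args)}
    if req_context is not None:
        parts["ctx"] = asdict(req_context)
    digest = hashlib.sha256(json.dumps(parts, sort_keys=True).encode()).hexdigest()
    return f"{prefix}-{digest}"


class Memoize:
    """Toy in-process memoizer; self.store plays the role of memcache."""

    def __init__(self, func, prefix, cache_request_context=False):
        self.func = func
        self.prefix = prefix
        self.cache_request_context = cache_request_context
        self.store = {}

    def __call__(self, *args, req_context):
        # Only context-aware memoizers include the context in the key.
        ctx = req_context if self.cache_request_context else None
        key = compute_key(self.prefix, args, ctx)
        if key not in self.store:
            self.store[key] = self.func(*args, req_context=req_context)
        return self.store[key]


def greeting(name, req_context):
    """A function whose result depends on the request language."""
    return ("Bonjour " if req_context.lang == "fr" else "Hello ") + name


en, fr = RequestContext(lang="en"), RequestContext(lang="fr")
leaky = Memoize(greeting, "greet")
safe = Memoize(greeting, "greet", cache_request_context=True)

leaky("Ada", req_context=en)
leaked = leaky("Ada", req_context=fr)   # served from the English entry
safe("Ada", req_context=en)
correct = safe("Ada", req_context=fr)   # separate entry per context
```

In this toy model the French request to `leaky` receives the English value because both requests share one key, which is the kind of cross-context leak the PR description refers to.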

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| openlibrary/utils/request_context.py | Adds cache-thread context copy helper and override support for legacy web.py context setup. |
| openlibrary/core/cache.py | Reworks memcache memoization keys/async refresh to optionally include request context and forward it to refresh threads. |
| openlibrary/tests/core/test_cache.py | Updates tests for new memoize signatures/behavior and default hashing. |
| openlibrary/plugins/upstream/utils.py | Switches cached macro rendering to new `cache_request_context` approach. |
| openlibrary/plugins/upstream/tests/test_utils.py | Removes `caching_prethread` patching now that it’s no longer used. |
| openlibrary/plugins/upstream/mybooks.py | Updates feed caching to use `cache_request_context` instead of `caching_prethread`. |
| openlibrary/plugins/upstream/borrow.py | Updates cache invalidation calls to `memcache_delete_by_args`. |
| openlibrary/plugins/openlibrary/home.py | Removes `caching_prethread` and uses `cache_request_context` for homepage/carousel-related caching. |
| openlibrary/plugins/openlibrary/api.py | Updates OPDS homepage caching to use `cache_request_context` instead of `caching_prethread`. |
| openlibrary/core/lending.py | Updates cache rehydration to use new `compute_key`/`memcache_set` signatures. |
| openlibrary/core/admin.py | Updates caching wrapper for unique login counts to new memoize options. |

Copilot · 2026-04-20T08:40:11Z


+    def copy_for_cache_thread(self):
+        """Create a copy of the context vars for use in a new thread."""
+        # Everything the same, except x_forwarded_for and user_agent are see


Typo in comment: “are see” should be “are set”.

Suggested change

-        # Everything the same, except x_forwarded_for and user_agent are see
+        # Everything the same, except x_forwarded_for and user_agent are set
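
For readers unfamiliar with the pattern, a copy helper of this kind could look roughly like the sketch below. The field list is a hypothetical subset, and the comment in the diff is truncated, so the values that `x_forwarded_for` and `user_agent` are set to are not visible here; resetting them to `None` is purely an assumption of this sketch.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RequestContextVars:
    """Hypothetical subset of fields; the real class is not shown in this excerpt."""
    lang: str
    is_bot: bool
    x_forwarded_for: Optional[str]
    user_agent: Optional[str]

    def copy_for_cache_thread(self) -> "RequestContextVars":
        """Copy every field, overriding only the two per-client ones.

        Using None as the override value is an assumption made for this sketch.
        """
        return replace(self, x_forwarded_for=None, user_agent=None)


original = RequestContextVars(
    lang="fr",
    is_bot=True,
    x_forwarded_for="203.0.113.7",
    user_agent="ExampleBot/1.0",
)
thread_copy = original.copy_for_cache_thread()
```

Because `dataclasses.replace` copies every field that is not named explicitly, a field added to the class later is carried over without touching the helper, which matches the stated goal of not leaking when new request context variables are introduced.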

Copilot · 2026-04-20T08:40:11Z

    Extracts context from the global web.ctx and populates ContextVars.
    """
+    if req_context_override is not None:
+        req_context.set(req_context_override)


When req_context_override is provided, this returns after only setting req_context. For cache refresh threads, this leaves request_context.site unset (so request_context.site.get() can raise LookupError) and doesn’t synchronize legacy web.ctx fields like web.ctx.lang which some templates/helpers still read. Consider calling setup_site() here and mirroring the relevant web.ctx attributes (e.g., lang, is_bot) from the override.

Suggested change

-        req_context.set(req_context_override)
+        req_context.set(req_context_override)
+        setup_site()
+        web.ctx.lang = req_context_override.lang
+        web.ctx.is_bot = req_context_override.is_bot
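
The `LookupError` concern can be reproduced with the standard library alone. The sketch below is not Open Library code: the `site` variable and helper names are hypothetical, and the worker runs inside an explicitly empty `contextvars.Context` so that it models a thread that did not inherit the request's values, regardless of Python version.

```python
import contextvars
import threading

# No default value, like a site variable that has not been initialised.
site = contextvars.ContextVar("site")


def read_site(results, override=None):
    """Optionally restore the variable, then try to read it."""
    if override is not None:
        site.set(override)
    try:
        results.append(site.get())
    except LookupError:
        results.append("LookupError")


def run_in_fresh_thread(override=None):
    """Run read_site in a worker thread that starts from an empty context."""
    results = []
    empty = contextvars.Context()
    worker = threading.Thread(target=empty.run, args=(read_site, results, override))
    worker.start()
    worker.join()
    return results[0]


site.set("request-site")                            # set in the "request" context
without_restore = run_in_fresh_thread()             # variable is unset in the worker
with_restore = run_in_fresh_thread("restored-site")
```

Setting the variable inside the worker, as the suggested change proposes for the site, is what prevents the lookup from failing.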

Copilot · 2026-04-20T08:40:12Z

-                self.prethread()
+            if background_req_context:
+                delegate.fakeload()
+                set_context_from_legacy_web_py(background_req_context)


In the async refresh thread, delegate.fakeload() is called but the legacy web.ctx fields aren’t restored from background_req_context (only req_context is set). Some cached codepaths/templates still read web.ctx.lang (e.g. homepage carousels via get_lang()), so the refresh can recompute the wrong-language value under a context-specific cache key. Suggest explicitly setting web.ctx.lang / web.ctx.is_bot (and initializing request_context.site) from background_req_context before calling self.update() (or have set_context_from_legacy_web_py do that when given an override).

Suggested change

-                set_context_from_legacy_web_py(background_req_context)
+                set_context_from_legacy_web_py(background_req_context)
+                web.ctx.lang = getattr(background_req_context, "lang", None)
+                web.ctx.is_bot = getattr(background_req_context, "is_bot", False)
+                req_context.site = getattr(background_req_context, "site", None)
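
The failure mode described in this comment can be sketched with per-thread storage. This is a toy model under the assumption that `web.ctx` behaves like thread-local state; `ctx`, `get_lang`, `refresh`, and `async_refresh` are hypothetical names, and a dict stands in for memcache.

```python
import threading

ctx = threading.local()  # hypothetical stand-in for web.ctx


def get_lang():
    """Legacy-style helper that reads the language from per-thread state."""
    return getattr(ctx, "lang", "en")


def refresh(cache, key, snapshot, restore_legacy):
    """Recompute a cached value inside the background thread."""
    if restore_legacy:
        # Mirror the captured request context into the legacy per-thread state.
        ctx.lang = snapshot["lang"]
    cache[key] = f"carousel[{get_lang()}]"


def async_refresh(cache, key, snapshot, restore_legacy):
    worker = threading.Thread(
        target=refresh, args=(cache, key, snapshot, restore_legacy)
    )
    worker.start()
    worker.join()


cache = {}
ctx.lang = "fr"                  # the originating request is French
snapshot = {"lang": "fr"}        # context captured for the refresh thread

async_refresh(cache, "home-fr", snapshot, restore_legacy=False)
stale = cache["home-fr"]         # recomputed with the fallback language

async_refresh(cache, "home-fr", snapshot, restore_legacy=True)
fixed = cache["home-fr"]
```

Without the restore step the worker stores a default-language value under the French key, so the key is context-specific but the value is not.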

Copilot · 2026-04-20T08:40:12Z

-        prethread=caching_prethread(),
+        "logins_since",
+        timeout=60 * 60 * 12,  # 12 hours
+        cache_request_context=True,


get_unique_logins_since doesn’t appear to depend on request-specific context (lang/pd/sfw/bot), so setting cache_request_context=True will unnecessarily shard the cache per request context and reduce hit rate. Consider leaving it as the default (False) unless there’s a concrete dependency on req_context/web.ctx during async refresh.

Suggested change

-        cache_request_context=True,
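
The hit-rate cost of unnecessary sharding can be estimated by counting distinct keys. The sketch below uses an invented context space (three languages by two bot states) and a hypothetical `distinct_keys` helper; the real set of context variables is larger.

```python
from itertools import product


def distinct_keys(arg_values, contexts, cache_request_context):
    """Count cache entries created when every context requests every argument."""
    keys = set()
    for arg, context in product(arg_values, contexts):
        keys.add((arg, context) if cache_request_context else (arg,))
    return len(keys)


# Hypothetical context space: language x is_bot.
contexts = list(product(["en", "fr", "de"], [False, True]))

shared = distinct_keys([7], contexts, cache_request_context=False)
sharded = distinct_keys([7], contexts, cache_request_context=True)
```

For a function whose result does not depend on the request, each extra shard is a separate cold entry that has to be computed and stored again.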

cdrini force-pushed the refactor/replace-prethread-with-req-context-copy branch 3 times, most recently from e0b7241 to 01fe6ad, on April 20, 2026 08:02

Replace caching prethread with simpler request context copy-over

c05f1a8

- avoids the complicated closure capture
- avoids cache leaking when new request context variables are introduced
- fixes cache leak with is_bot and is_recognized_bot

cdrini force-pushed the refactor/replace-prethread-with-req-context-copy branch from 01fe6ad to c05f1a8 on April 20, 2026 08:06

cdrini requested a review from Copilot April 20, 2026 08:32

Copilot started reviewing on behalf of cdrini April 20, 2026 08:32

Copilot AI reviewed Apr 20, 2026

cdrini added the On Testing label Apr 20, 2026

mekarpeles mentioned this pull request Apr 30, 2026

Avoid web.cookies() for pd and sfw checks #12491

Closed

cdrini wants to merge 1 commit into internetarchive:master from cdrini:refactor/replace-prethread-with-req-context-copy

