API: add opt-in per-token auth identity cache#1719
Open
MrDirkelz wants to merge 3 commits into
Open
Conversation
Every REST request through AuthGuard re-runs the full identity pipeline (RS256 verify + 3-4 Mango user lookups + a lastLogin write + accessMap projection). The socket handshake amortizes this per connection; the REST path does not, so a busy authenticated client hammers resolveIdentity and the DB. Add IdentityCacheService, a transport-agnostic cache in front of AuthIdentityService.resolveOrDefault, used by both the REST guard and the socket handshake. It caches the resolved membership + identity keyed by a sha256 of the token and re-derives the accessMap live from PermissionSystem on each hit, so an ACL (groupUpdate) change needs no invalidation. Design notes: - TTL = min(token exp, IDENTITY_CACHE_TTL_MS): a hit never serves a token past expiry; a different/expired token misses and is fully verified. - Bounded by a maxEntries cap + lazy/sweep eviction, no timers (BoundedTtlCache, reusing the strikeLimiter pattern). - Invalidate on User memberOf removal (PermissionChange DeleteCmd), AuthProvider/AutoGroupMappings changes, and DB disconnect. Not on groupUpdate (accessMap is re-projected) or plain User updates, so lastLogin writes never evict. - lastLogin now updates once per TTL per instance instead of per request. - Ships disabled (IDENTITY_CACHE_ENABLED=false); enable per environment. - Keeps all auth resolution behind one seam, easing a future SSE transport. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Collaborator
Author
|
@claude review this PR |
Follow-up to the opt-in identity cache, addressing review findings: - Group additions no longer lag: DbService.upsertDoc emits a write-side `permissionChange` event on any non-Group memberOf change (add or remove); IdentityCacheService clears on it for User docs. Pure additions produce no DeleteCmd, so the change feed alone couldn't distinguish them from a lastLogin-style update — the cache keeps ignoring plain User updates, so lastLogin writes never churn it. - Close the user-deletion revocation gap: the Deleted DeleteCmd handler now also covers DocType.User, so a deleted user's still-valid JWT stops resolving from cache immediately instead of after the TTL. - Make the clock consistent: IdentityCacheService takes one injectable `now`, shared with BoundedTtlCache and used in computeTtlMs (was Date.now() directly). - Note in .env.example that membership/deletion invalidate promptly and residual staleness (IdP revocation, provider reassignment) is TTL-bounded. Adds tests for permissionChange (User vs non-User), the Deleted-User revocation, and injected-clock TTL/expiry. lastLogin write cadence is intentionally left unchanged (write-on-miss) and decoupled from invalidation, pending a separate decision. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow-up to the opt-in identity cache, addressing review findings: - Group additions no longer lag: DbService.upsertDoc emits a write-side `permissionChange` event on any non-Group memberOf change (add or remove); IdentityCacheService clears on it for User docs. A pure addition produces no DeleteCmd, so the change feed alone couldn't distinguish it from a lastLogin-style update — the cache keeps ignoring plain User updates, so lastLogin writes never churn it. - Close the user-deletion revocation gap: the Deleted DeleteCmd handler now also covers DocType.User, so a deleted user's still-valid JWT stops resolving from cache immediately instead of after the TTL. - Make the clock consistent: IdentityCacheService takes one injectable `now`, shared with BoundedTtlCache and used in computeTtlMs (was Date.now() directly). Marked `@Optional()` so Nest DI leaves it unresolved and the Date.now default applies — without it the provider fails to instantiate (app boot and every DB-backed test module). - Note in .env.example that membership/deletion invalidate promptly and residual staleness (IdP revocation, provider reassignment) is TTL-bounded. Adds tests for permissionChange (User vs non-User), the Deleted-User revocation, and injected-clock TTL/expiry. lastLogin write cadence is intentionally left unchanged (write-on-miss) and decoupled from invalidation, pending a separate decision. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Every REST request through AuthGuard re-runs the full identity pipeline (RS256 verify + 3-4 Mango user lookups + a lastLogin write + accessMap projection). The socket handshake amortizes this per connection; the REST path does not, so a busy authenticated client hammers resolveIdentity and the DB.
Add IdentityCacheService, a transport-agnostic cache in front of AuthIdentityService.resolveOrDefault, used by both the REST guard and the socket handshake. It caches the resolved membership + identity keyed by a sha256 of the token and re-derives the accessMap live from PermissionSystem on each hit, so an ACL (groupUpdate) change needs no invalidation.
Design notes:
NOTE ON THE FOLLOWING DIAGRAM:
The accessMap is not cached in the current code, this part of the plan is just if we decide we want to deduplicate accessMaps that can be the same like super-admin access
flowchart TB A(["/query arrives with token"]) --> C{"IDENTITY<br>lookup token?"} C -- if found --> D["got mergedGroups<br>skip verify + DB + lastLogin"] C -- else --> E["resolveIdentity:<br>verify, DB lookups, lastLogin<br>store in identity cache under token"] D --> F["key2 = hash sorted mergedGroups"] E --> F F --> G{"Access map caching<br>lookup key2?"} G -- if found --> H["got shared accessMap<br>skip projection"] G -- else --> I["getAccessMap mergedGroups<br>store under key2<br>link key1 → key2"] H --> J["QueryService injects filters<br>run query"] I --> J J --> K(["return results"]) E1["event: User.memberOf change"] --> A1@{ label: "Evict that identity's Layer I entry<br>NOTE: ignore lastLogin-only changes" } E2["event: AutoGroupMappings /<br>AuthProvider change"] --> A2["Full flush of Identity Cache"] E3["event: groupUpdate"] --> A3["Recompute each distinct projection<br>diff vs cached, update in place<br>linked tokens auto-update"] E4["event: DB disconnect"] --> A4["Clear BOTH caches"] A1 --> L1[("Identity Cache<br>token → membership")] A2 --> L1 A4 --> L1 & L2[("Access map cache<br>membership-set → accessMap")] A3 --> L2 A1@{ shape: rect}