Memcache-based Buffer-Object for Global Link-Target-Caching#475

Draft
drfho wants to merge 13 commits into fb_refactor_repo from fb_sharedbuffer

Conversation

@drfho
Contributor

@drfho drfho commented Mar 27, 2026

To implement a shared in-memory buffer for link data in ZMS that persists across requests and users, switching to a Memcached or RAMCache approach could offer a significant performance win.

Why Memcache/RAMCache Improves Performance Over ReqBuff and ZMSIndex

  1. Cross-Request Persistence: Unlike ReqBuff (which is cleared after every HTTP request), a shared buffer allows expensive visibility and metadata lookups (like isVisible, isActive, getPhysicalPath) to be reused across different users and sessions.

  2. Avoids ZODB/Catalog Overhead: While ZMSIndex is fast, it still involves querying a persistent ZCatalog. An in-memory cache (especially Memcached) is several orders of magnitude faster for frequent "key -> blob" lookups.

  3. Handling "Invisible" Content: The ZMSIndex doesn't inherently know about isVisible. By caching the result of these checks in a shared buffer, you avoid re-evaluating complex logic (like translations, commit status, and trashcan checks) on every render.

Implementation Suggestion

We can extend the ReqBuff mixin logic in _cachemanager.py to include a "Global" or "Distributed" mode.

Below is how to modify getLinkObj in _zreferableitem.py:543 to utilize a shared cache manager if configured:

# In Products/zms/_zreferableitem.py

def getLinkObj(self, url, REQUEST=None):
    # ... (existing internal link checking logic) ...
    
    # 1. Try Request Buffer (Fastest for current thread)
    reqBuffId = 'getLinkObj.%s' % url
    ob = self.fetchReqBuff(reqBuffId)
    if ob: return ob

    # 2. Try Shared Cache (Memcache / RAMCache)
    # This assumes a "shared_cache" utility exists or is fetched via get_cache_manager
    shared_cache = self.get_cache_manager('zms_link_cache') 
    if shared_cache:
        cached_data = shared_cache.get(url) 
        if cached_data:
            # Note: You might store a dict of metadata {path, isVisible, meta_id} 
            # rather than the ZODB object itself to avoid persistence issues.
            return self.unrestrictedTraverse(cached_data['path'])

    # 3. Fallback to Catalog/Traverse
    # ... (existing lookup logic) ...
    
    # 4. Store back to Shared Cache
    if ob and shared_cache:
        shared_cache.set(url, {
            'path': '/'.join(ob.getPhysicalPath()),
            'isVisible': ob.isVisible(REQUEST)
        })
    return ob
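The "value object" idea from the comment in step 2 above can be illustrated with a plain Python sketch. The `FakeCache` class is just a dict-backed stand-in for a Memcached client, and all names and paths here are hypothetical, not ZMS API:

```python
# Minimal sketch of the "value object" caching pattern: instead of pickling
# a live ZODB object, we cache a small dict of primitives and re-resolve
# the object by path on read. FakeCache stands in for a memcache client.

class FakeCache:
    """Dict-backed stand-in for a Memcached/RAMCache client."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value

def cache_link_target(cache, url, ob_path, is_visible):
    # Store only primitives (path + precomputed flags), never the object.
    cache.set(url, {'path': ob_path, 'isVisible': is_visible})

def resolve_link_target(cache, url, traverse):
    # 'traverse' plays the role of unrestrictedTraverse.
    entry = cache.get(url)
    if entry is None:
        return None
    return traverse(entry['path'])

# Usage with a toy traversal function:
site = {'/site/page1': '<ZMSObject page1>'}
cache = FakeCache()
cache_link_target(cache, 'uid:1234', '/site/page1', True)
ob = resolve_link_target(cache, 'uid:1234', site.get)
```

The point of the indirection is that a cache miss or a stale path degrades gracefully to the normal catalog lookup instead of deserializing a possibly outdated object.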

Challenges to Consider

Cache Invalidation: the biggest hurdle. When an object is moved, renamed, or its visibility changes (e.g. it is deactivated), the shared cache must be invalidated. We would need to hook into manage_afterAdd, manage_afterClone, and onChangeObjEvt.

Object Pickling: We cannot easily store live ZODB objects in Memcached; instead we should store "Value Objects" (dictionaries containing the path, UID, and visibility status) and resolve the object via unrestrictedTraverse when needed.

Visibility Context: isVisible can depend on the current user's roles or the selected language. Your cache key might need to be f"{url}:{lang}:{user_roles}".
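The context-dependent cache key suggested above could be built like this; the role names and the normalisation scheme are illustrative assumptions, not ZMS conventions:

```python
# Sketch: building a visibility-aware cache key from URL, language and
# user roles. Roles are sorted so that the key is independent of the
# order in which Zope reports them.

def make_link_cache_key(url, lang, user_roles):
    roles = ','.join(sorted(user_roles))
    return f"{url}:{lang}:{roles}"

key = make_link_cache_key('uid:1234', 'ger', ['Reader', 'Editor'])
```

Note the trade-off: the more context goes into the key, the lower the hit rate across users, which works against the cross-request sharing this PR is after.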

Configuration

To activate the shared buffer, you need to set a ZMS configuration property:

  • Key: ZMS.cache.path
  • Value: The physical path to MemCacheZCacheManager (e.g., /my_zms_site/my_ram_cache).
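How `get_cache_manager` might resolve this property can be sketched as follows; `getConfProperty` and `unrestrictedTraverse` mirror the usual Zope/ZMS calls, but this exact wiring is an assumption, shown here against toy stand-in objects:

```python
# Hypothetical sketch of resolving the shared cache manager from the
# 'ZMS.cache.path' configuration property. The toy context below stands
# in for a ZMS node; the real lookup would go through acquisition.

def get_cache_manager(context, conf_key='ZMS.cache.path'):
    path = context.getConfProperty(conf_key, None)
    if not path:
        return None  # shared buffering not configured
    try:
        return context.unrestrictedTraverse(path)
    except (KeyError, AttributeError):
        return None  # configured path does not resolve

class ToyContext:
    """Minimal stand-in for a ZMS node with conf properties and traversal."""
    def __init__(self, conf, objects):
        self._conf, self._objects = conf, objects
    def getConfProperty(self, key, default=None):
        return self._conf.get(key, default)
    def unrestrictedTraverse(self, path):
        return self._objects[path]

ctx = ToyContext({'ZMS.cache.path': '/my_zms_site/my_ram_cache'},
                 {'/my_zms_site/my_ram_cache': 'cache-manager'})
cm = get_cache_manager(ctx)
```

Returning `None` when the property is missing lets getLinkObj fall through to the existing ReqBuff/catalog path, so the feature stays opt-in.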

drfho added 4 commits March 26, 2026 21:38
remoteFiles → get_modelfileset_from_disk
localFiles → get_modelfileset_from_zodb
read_remote_file → get_file_from_disk
readRepository → get_models_from_disk
parse_artefact  → parse_modelfile
read_artefact → read_file_from_disk
init_artefacts → create_modelfileset
@drfho
Contributor Author

drfho commented Mar 27, 2026

A test page with about 70 LinkElement objects, in a local test with about 1000 content nodes, gets about 20% faster (0.90s to 0.65s).


@zmsdev: I added log-writing for the ReqBuff-cache accesses, too.

value = getattr(buff, reqBuffId)
standard.writeLog(None, 'ReqBuff: Fetched key "%s" from request buffer...' % reqBuffId)
return value

Here, ZMSMetaobjManager and MultiLanguageManager are fetched more than 1000 times.
buffer.txt

@drfho drfho marked this pull request as draft March 27, 2026 18:46
@drfho
Contributor Author

drfho commented Mar 27, 2026

2a7cedb

The link data are cached for each ZMS node individually. The endpoint ./getSharedBuffJSON shows the link cache of the current ZMS node for testing purposes.


@drfho
Contributor Author

drfho commented Mar 30, 2026

Performance: Memcache may save 1ms per Link

To avoid link-target redundancy, the backlink object paths are cached for the entire portal. Change: 864a2e5

Discussion @zmsdev: At this dev stage the shared cache contains only the object path of the link target, not the object itself. Traversal is therefore still needed, especially because the resulting values (e.g. URL, visibility etc.) are determined by the object's context. Caching a JSONified version of the object may not be complete!?
Memcache may be faster at getting the object path than the ZCatalog, but with request buffering the traversal is not needed anyway. Maybe there are some more ideas on how to shorten the object traversal!?
Otherwise we need A/B tests with real-life data to get a picture of the efficiency of the approach.

Left: the cached link-target list is created by user1 when requesting link rendering. Right: user2 does not need to create the list again but gets it from memcache.

Left: memcache-derived object paths. Right: zmsindex-derived object paths.

Result: About 70 links produce about 800 getLinkObj calls and about 200 index calls, consuming in sum about 50ms; at the moment only these 50ms can be saved via memcache, provided the target object is still cached by the request buffer. The time consumed by the getLinkObj function is about the same in both scenarios (about 75ms). Because the GUI/frontend of the link-rich page needs about 1000ms, the saving is not much.
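The saving estimate can be checked with back-of-the-envelope arithmetic, using the numbers measured above:

```python
# Sanity check of the saving claimed above, with the measured numbers.
index_time_ms = 50     # time spent in ~200 ZMSIndex calls on the test page
page_time_ms = 1000    # total frontend render time of the link-rich page

# Best case: memcache eliminates all index time.
saving_pct = 100 * index_time_ms / page_time_ms  # -> 5.0
```

So even in the best case the shared cache shaves about 5% off this page, which matches the "not much" conclusion.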

Snippet: Output performance data in a ZMSTextarea/Code-Block

<pre style="white-space:pre-wrap !important">
counter_zmsindex_requests = <dtml-var "REQUEST.get('counter_zmsindex_requests')">
time_consumed_by_zmsindex_requests_in_ms = <dtml-var "REQUEST.get('time_consumed_by_zmsindex_requests_in_ms')">

counter_getlinkobj = <dtml-var "REQUEST.get('counter_getlinkobj')">
time_consumed_by_getlinkobj_in_ms = <dtml-var "REQUEST.get('time_consumed_by_getlinkobj_in_ms')">
time_consumed_by_getlinkobj_datalist = <dtml-var "REQUEST.get('time_consumed_by_getlinkobj_datalist')">
</pre>

BTW: All-time favorite getMetaObj()

ZMSMetaobjManager.getMetaObj() provides the content model and is thus extremely often relevant: the text page above makes about 1400 calls, which consume just 10ms (presumably the ZODB cache is quite efficient).

Variant of ZMSMetaobjManager.getMetaObj() with added measure-blocks

    def getMetaobj(self, id, aq_attrs=[]):
      # ///////////////////////////////////////////////////
      # MEASURING PERFORMANCE:
      # For summing up the time spent in getMetaobj: remember the start time
      # and accumulate the elapsed time in the request parameter
      # 'time_consumed_by_getmetaobj_in_ms' (in ms).
      request = self.REQUEST
      start_getmetaobj = time.time()
      # ///////////////////////////////////////////////////
      ob = standard.nvl( self.__get_metaobj__(id), {'id': id, 'attrs': [], })
      if ob.get('acquired'):
        for k in aq_attrs:
          v = self.get_conf_property('%s.%s'%(id, k), None)
          if v is not None:
            ob[k] = v
      # ///////////////////////////////////////////////////
      # MEASURING PERFORMANCE:
      # Accumulate elapsed time and call count in the request.
      request.set('time_consumed_by_getmetaobj_in_ms', round(float(request.get('time_consumed_by_getmetaobj_in_ms', 0)) + float((time.time() - start_getmetaobj)*1000),2))
      request.set('counter_getmetaobj', int(request.get('counter_getmetaobj', 0)) + 1)
      # ///////////////////////////////////////////////////
      return ob
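Since the same measure-block pattern now appears in getLinkObj and getMetaobj, it could be factored into a reusable decorator. This is a sketch, not ZMS API; the key names mirror the snippet above, and the toy REQUEST below only imitates Zope's request mapping:

```python
import functools
import time

# Sketch: the measuring blocks above, generalised into a decorator that
# accumulates call count and elapsed time in a REQUEST-like mapping.

def measured(metric):
    def wrap(fn):
        @functools.wraps(fn)
        def inner(self, *args, **kw):
            request = self.REQUEST
            start = time.time()
            try:
                return fn(self, *args, **kw)
            finally:
                elapsed_ms = (time.time() - start) * 1000
                time_key = 'time_consumed_by_%s_in_ms' % metric
                request.set(time_key,
                            round(float(request.get(time_key, 0)) + elapsed_ms, 2))
                count_key = 'counter_%s' % metric
                request.set(count_key, int(request.get(count_key, 0)) + 1)
        return inner
    return wrap

# Toy REQUEST and manager class to exercise the decorator:
class ToyRequest(dict):
    def set(self, k, v):
        self[k] = v

class ToyManager:
    def __init__(self):
        self.REQUEST = ToyRequest()
    @measured('getmetaobj')
    def getMetaobj(self, id):
        return {'id': id}

mgr = ToyManager()
for _ in range(3):
    mgr.getMetaobj('ZMSDocument')
```

The try/finally keeps the counters correct even when the wrapped method raises, which the inline measure-blocks above do not guarantee.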

@drfho
Contributor Author

drfho commented Mar 30, 2026

Load-Test: 200 Requests, 12 concurrent, 1 instance

The request addresses a page containing 70 link objects, with two REST API variants get_child_nodes and get_body_content (100 requests each). The cache type does not have a significant effect on latency:

1. ReqBuffer


2. ShareBuffer / Memcache


Load-Test: 200 Requests, 24 concurrent, 1 instance

As expected, latency rises with request concurrency.


@zmsdev zmsdev force-pushed the fb_refactor_repo branch from 090a863 to 5770bd3 Compare March 31, 2026 10:08