[ADH-8124] Add S3 Gateway ozone objects cache #10

Open

tigrulya-exe wants to merge 1 commit into 2.2.0-develop from feature/ADH-8124-2.2.0

Conversation

@tigrulya-exe
Member

Added a caching OM client for the S3 Gateway component. It currently caches metadata for the S3 volume, buckets, and keys (both regular and extended versions) in local per-user Caffeine caches. Cache entries are evicted on the corresponding write operations (delete, rename) or by TTL.
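For context, a minimal sketch of one such per-user cache, assuming plain Caffeine (the loader and all names below are illustrative, not the PR's actual code):

```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;
import java.time.Duration;

public final class KeyCacheSketch {
  public static void main(String[] args) {
    // Bounded size + TTL eviction + stats, mirroring the options listed below.
    LoadingCache<String, String> keyCache = Caffeine.newBuilder()
        .maximumSize(1000)                          // ozone.s3g.object.cache.key.size
        .expireAfterWrite(Duration.ofMillis(3000))  // ozone.s3g.object.cache.key.ttl.ms
        .recordStats()                              // required for the metrics source
        .build(keyName -> loadFromOm(keyName));     // hits OM only on a cache miss

    System.out.println(keyCache.get("vol/bucket/key1")); // miss: loads via the loader
    keyCache.invalidate("vol/bucket/key1");              // explicit eviction on a write op
  }

  private static String loadFromOm(String keyName) {
    return "metadata-for-" + keyName; // stand-in for the real OM lookup
  }
}
```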

Related options:

| Key | Description | Default Value |
| --- | --- | --- |
| ozone.s3g.object.cache.enabled | Enable or disable the S3Gateway object cache. | false |
| ozone.s3g.object.cache.metrics.enabled | Enable or disable metrics for the S3Gateway object cache. | true |
| ozone.s3g.object.cache.volume.size | Maximum number of entries in the S3Gateway volume cache. | 10 |
| ozone.s3g.object.cache.volume.ttl.ms | TTL in milliseconds for entries in the S3Gateway volume cache. | 3000 |
| ozone.s3g.object.cache.bucket.size | Maximum number of entries in the S3Gateway bucket cache. | 100 |
| ozone.s3g.object.cache.bucket.ttl.ms | TTL in milliseconds for entries in the S3Gateway bucket cache. | 3000 |
| ozone.s3g.object.cache.key.size | Maximum number of entries in the S3Gateway key cache. | 1000 |
| ozone.s3g.object.cache.key.ttl.ms | TTL in milliseconds for entries in the S3Gateway key cache. | 3000 |
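For illustration, overriding a few of these options programmatically (a sketch; in a real deployment they would normally be set in ozone-site.xml):

```java
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

public final class EnableObjectCache {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    conf.setBoolean("ozone.s3g.object.cache.enabled", true); // off by default
    conf.setInt("ozone.s3g.object.cache.key.size", 5000);    // grow the key cache
    conf.setLong("ozone.s3g.object.cache.key.ttl.ms", 5000); // keep entries a bit longer
    System.out.println(conf.get("ozone.s3g.object.cache.enabled"));
  }
}
```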

A Hadoop Metrics2 source is also added to collect and publish the following statistics for all S3 object caches (s3VolumeCache, s3BucketCache, s3HeadKeyCache, s3KeyInfoCache):

| Name | Description |
| --- | --- |
| hitCount | Cache hit count |
| hitRate | Cache hit rate |
| missCount | Cache miss count |
| loadSuccessCount | Successful cache loads |
| loadFailureCount | Failed cache loads |
| loadFailureRate | Failed cache load rate |
| evictionCount | Cache eviction count |
| averageLoadPenalty | Average load penalty (ns) |
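These names map one-to-one onto Caffeine's CacheStats accessors; a sketch of how a metrics source could read a snapshot (the CacheStats calls are real Caffeine API, the surrounding class is illustrative):

```java
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.stats.CacheStats;

final class CacheStatsReader {
  // Reads a point-in-time snapshot; recordStats() must be enabled on the cache.
  static void report(String cacheName, Cache<?, ?> cache) {
    CacheStats stats = cache.stats();
    System.out.printf(
        "%s hitCount=%d hitRate=%.2f missCount=%d loadSuccessCount=%d "
            + "loadFailureCount=%d loadFailureRate=%.2f evictionCount=%d "
            + "averageLoadPenalty=%.0fns%n",
        cacheName, stats.hitCount(), stats.hitRate(), stats.missCount(),
        stats.loadSuccessCount(), stats.loadFailureCount(), stats.loadFailureRate(),
        stats.evictionCount(), stats.averageLoadPenalty());
  }
}
```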

Example of metrics when the Prometheus MetricsSink is used:

```
# TYPE s3_object_cache_metrics_miss_count gauge
s3_object_cache_metrics_miss_count{cachename="s3BucketCache",hostname="0dfbab4b336f"} 27
s3_object_cache_metrics_miss_count{cachename="s3HeadKeyCache",hostname="0dfbab4b336f"} 10
s3_object_cache_metrics_miss_count{cachename="s3KeyInfoCache",hostname="0dfbab4b336f"} 5
s3_object_cache_metrics_miss_count{cachename="s3VolumeCache",hostname="0dfbab4b336f"} 30
```


@iamlapa left a comment


Also, BucketEndpoint::handleGetRequest(S3RequestContext context, String bucketName):

```java
ozoneKeyIterator = bucket.listKeys(prefix, prevKey, shallow);
```

This path still ignores the S3 max-keys value when choosing how much to fetch from OM; the iterator falls back to ozone.client.list.cache, which is 1000 by default, so max-keys=1 can still over-fetch badly. We should add an S3-specific list path that passes a backend fetch size like min(maxKeys + 1, configuredCap).
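A sketch of that computation (configuredCap would come from a new, hypothetical S3G setting):

```java
final class ListFetchSize {
  /** How many entries to request from OM per batch for an S3 list call. */
  static int backendFetchSize(int maxKeys, int configuredCap) {
    // maxKeys is the S3 request parameter; +1 lets the endpoint detect truncation
    // (IsTruncated / NextContinuationToken) without a second round trip to OM.
    return Math.min(maxKeys + 1, configuredCap);
  }
}
```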

```java
}
S3ObjectCacheKey that = (S3ObjectCacheKey) o;
// deliberately use only the immutable part of the S3Auth
return Objects.equals(s3Auth.getAccessID(), that.s3Auth.getAccessID());
```


Before we enable this, we need a real auth gate in front of metadata cache hits. Right now the cache key only uses the access ID, so a warmed entry can be reused by a later request with the same access ID but a bad stringToSign/signature, and OM never gets a chance to validate SigV4. Either validate SigV4 locally in S3G with a secure short-lived S3 secret cache, or do an auth-only OM check before serving cached metadata; cached metadata cannot be returned just because the access ID matches.
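A sketch of the second option, an auth-only check gating every cache read (the validator interface and all names are hypothetical):

```java
import java.io.IOException;
import java.util.Map;

final class AuthGatedCache {
  /** Hypothetical auth-only OM round trip: verifies the SigV4 material, loads no metadata. */
  interface OmAuthValidator {
    void validate(String accessId, String stringToSign, String signature) throws IOException;
  }

  private final Map<String, String> keyInfoCache; // stand-in for the Caffeine cache
  private final OmAuthValidator validator;

  AuthGatedCache(Map<String, String> keyInfoCache, OmAuthValidator validator) {
    this.keyInfoCache = keyInfoCache;
    this.validator = validator;
  }

  String getCached(String accessId, String stringToSign, String signature, String key)
      throws IOException {
    validator.validate(accessId, stringToSign, signature); // throws before any cache hit
    return keyInfoCache.get(key);                          // only validated requests get here
  }
}
```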

Member Author

@tigrulya-exe commented May 4, 2026


Tbh, I think the second suggested option would conflict with the main goal of this cache: reducing the number of calls to OM. Regarding the first one: the cache itself is a trade-off between latency and consistency, because it won't be strictly consistent in any case (changes can be made via the Ozone CLI or other clients, and even if we add a subscription mechanism, it will receive events only eventually, not immediately). In some cases this cache can even return non-existent buckets/keys, so in my opinion a small chance of returning an entry for an invalid stringToSign/signature during the short key TTL isn't worth adding extra layers of complexity, like a local S3 secret cache or an additional OM call.

```java
}

@Override
public void deleteKey(String volumeName, String bucketName, String keyName, boolean recursive) throws IOException {
```


This only covers part of the write surface. Before the cache can be enabled, key metadata needs to be evicted for createKey*, rewriteKey*, completeMultipartUpload, putObjectTagging, deleteKey(s), and renameKey(s) for both source and destination. For stream writes, I'd also evict before opening the write and again after successful close/commit, so we never keep serving an old ETag or metadata while a replacement object is being written.
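Roughly the shape I have in mind for the stream-write case (a sketch only; the writer interface and names are hypothetical):

```java
import com.github.benmanes.caffeine.cache.Cache;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Evict-around-write: drop the cached entry before the write opens and after it commits. */
final class EvictOnWrite {
  /** Hypothetical hook into the real client's key-write path. */
  interface KeyWriter {
    OutputStream open(String volume, String bucket, String key) throws IOException;
  }

  private final Cache<String, String> keyCache;
  private final KeyWriter delegate;

  EvictOnWrite(Cache<String, String> keyCache, KeyWriter delegate) {
    this.keyCache = keyCache;
    this.delegate = delegate;
  }

  OutputStream createKey(String volume, String bucket, String key) throws IOException {
    String cacheKey = volume + "/" + bucket + "/" + key;
    keyCache.invalidate(cacheKey); // evict before the write starts
    OutputStream out = delegate.open(volume, bucket, key);
    return new FilterOutputStream(out) {
      @Override public void close() throws IOException {
        super.close();                 // commit happens on close
        keyCache.invalidate(cacheKey); // evict again so no stale ETag survives the write
      }
    };
  }
}
```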

Member Author

@tigrulya-exe commented May 4, 2026


Makes sense, but currently these methods aren't used in S3G, so I decided not to wrap them. However, future releases may use them, or the internal logic may change, so I added implementations for these methods.

```java
}

@Override
public void deleteBucket(String volumeName, String bucketName) throws IOException {
```


Bucket details are cached, but this only evicts on deleteBucket. We also need invalidation after createBucket, setBucketOwner, setBucketVersioning, setBucketStorageType, setBucketQuota, setReplicationConfig, setEncryptionKey, and setAcl; otherwise owner checks, ACL decisions, default replication, encryption, and quota-related behavior can be stale until TTL.
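For example, each of those setters could run through a small invalidate-after-mutation helper (a sketch; the mutation hook and cache-key format are hypothetical):

```java
import com.github.benmanes.caffeine.cache.Cache;
import java.io.IOException;

final class BucketCacheInvalidator {
  /** Hypothetical wrapper around any bucket-mutating delegate call. */
  interface BucketMutation {
    void run() throws IOException;
  }

  static void mutate(Cache<String, ?> bucketCache, String volume, String bucket,
      BucketMutation mutation) throws IOException {
    mutation.run();                                // e.g. delegate.setBucketOwner(...)
    bucketCache.invalidate(volume + "/" + bucket); // drop stale bucket details right away
  }
}
```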

```java
 * limitations under the License.
 */

package org.apache.hadoop.ozone.client;
```


This puts S3 Gateway implementation classes into the org.apache.hadoop.ozone.client package so they can call package-private OzoneClientFactory hooks from the client module. That makes an internal client-module helper an implicit cross-module API. I'd rather expose a small intentional extension point from the client module and keep the S3-specific builder/cache under org.apache.hadoop.ozone.s3.*; otherwise future client refactors or upstream merges can break S3G in a pretty non-obvious way.
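Something like this would be enough (a sketch, not the actual API; the interface name is hypothetical):

```java
// The one public hook the client module would publish; S3G keeps its caching
// wrapper in its own package and just supplies an implementation.
package org.apache.hadoop.ozone.client;

import org.apache.hadoop.ozone.client.protocol.ClientProtocol;

public interface ClientProtocolDecorator {
  /** S3G's implementation would wrap the delegate with the object cache. */
  ClientProtocol decorate(ClientProtocol delegate);
}
```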

Member Author

@tigrulya-exe commented May 4, 2026


Exposed the method in OzoneClientFactory and made it a bit more readable

@tigrulya-exe force-pushed the feature/ADH-8124-2.2.0 branch from f844ae5 to e96d57d on May 4, 2026 20:19
Member Author
@tigrulya-exe commented May 4, 2026

> Also, BucketEndpoint::handleGetRequest(S3RequestContext context, String bucketName):
>
> `ozoneKeyIterator = bucket.listKeys(prefix, prevKey, shallow);`
>
> This path still ignores the S3 max-keys value when choosing how much to fetch from OM; the iterator falls back to ozone.client.list.cache, which is 1000 by default, so max-keys=1 can still over-fetch badly. We should add an S3-specific list path that passes a backend fetch size like min(maxKeys + 1, configuredCap).

Yeah, it was already discussed with Alexander, and we decided to skip this requirement for now and check whether further work is needed in a separate task. Unfortunately, we can't simply pass the max-keys value down to OM due to some internal logic; see HDDS-12882.
