Allow optionally passed string allocator and expand benchmarks #264
jashook wants to merge 14 commits into maxmind:main from
Conversation
cc @horgh
horgh left a comment
Thanks for the PR! I had a few comments. For future changes, it would probably make sense to open an issue to talk about the design and whether we would be likely to accept a change before going through the effort of writing a PR.
```csharp
/// <param name="file">The file.</param>
public Reader(string file) : this(file, FileAccessMode.MemoryMapped)

/// <param name="stringAllocator">Optional allocator method for strings created from byte arrays.</param>
public Reader(string file, AllocatorDelegates.GetString? stringAllocator = null)
```
I think we'd need to figure out how callers would use this without bugs. Design wise I'm not sure it's something we'd want as is.
For example, Reader needs to be thread safe, so the allocator would need to be as well. That seems like something easy to misuse; e.g., the allocator used in the benchmark does not appear to be thread safe.
As well, we might need to have a way to limit the size of the cache, e.g. evict older entries or something.
Potentially ideally we would do something similar to the caching in the Java reader: https://github.com/maxmind/MaxMind-DB-Reader-java?tab=readme-ov-file#caching. That caches based on the data section offset I believe, so we could cache more than only strings.
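The Java reader's offset-keyed approach could be sketched in C# roughly like this. This is purely illustrative: `OffsetCache`, `decode`, and the capacity handling are assumptions, not code from either library.

```csharp
using System;
using System.Collections.Concurrent;

// Sketch of an offset-keyed decode cache in the spirit of the Java reader's
// cache: keys are data-section offsets, values are any decoded object (not
// only strings), and the cache stops accepting entries once it hits capacity.
sealed class OffsetCache
{
    private readonly ConcurrentDictionary<long, object> _map = new();
    private readonly int _capacity;

    public OffsetCache(int capacity) => _capacity = capacity;

    public object GetOrAdd(long offset, Func<long, object> decode)
    {
        if (_map.TryGetValue(offset, out var cached))
            return cached;
        var value = decode(offset);
        if (_map.Count < _capacity)   // crude bound; the eviction policy is an open design question
            _map.TryAdd(offset, value);
        return value;
    }
}
```

Because records reached through pointers share data-section offsets, a cache like this would cover repeated strings as a byproduct of caching all decoded values.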
> I think we'd need to figure out how callers would use this without bugs. Design wise I'm not sure it's something we'd want as is.
A bit of this is me learning what this library does. That said, noted; I can open issues in the future. The issue here is more or less clear: the library allocates quite a bit.
> For example, Reader needs to be thread safe, so the allocator would need to be as well. That seems like something easy to misuse. e.g. the benchmark use seems to not be thread safe.
+1. I moved the cache into the library, left it internal, and added a new constructor that takes a cache capacity.
> Potentially ideally we would do something similar to the caching in the Java reader: https://github.com/maxmind/MaxMind-DB-Reader-java?tab=readme-ov-file#caching. That caches based on the data section offset I believe, so we could cache more than only strings.
I am not particularly married to a specific design here. Please give feedback on the current change, and I have no problem making changes. This has not been time intensive.
Force-pushed from 4139fe6 to 25ed1f4
```csharp
throw new InvalidOperationException($"{dbPathVarName} was not set");
_reader = new Reader(dbPath);
_memMapReader = new Reader(dbPath, FileAccessMode.MemoryMapped);
_memMapCachedReader = new Reader(dbPath, FileAccessMode.MemoryMapped, 4_096);
```
It may make sense to have a separate PR for improvements to the benchmark tool, if they are unrelated to the PR. I realize in a prior version the changes were more relevant as you were exercising new parameters to the reader.
There will still need to be a change to exercise the new parameters, but it will make the PR easier to review. Thank you.
```xml
</ItemGroup>

<ItemGroup>
  <PackageReference Include="System.Memory" Version="4.5.5" />
```
```csharp
/// <param name="file">The MaxMind DB file.</param>
/// <param name="mode">The mode by which to access the DB file.</param>
/// <param name="cacheSize">Cache size for optional internal cache</param>
public Reader(string file, FileAccessMode mode, int cacheSize)
```
This seems like an improvement in terms of not breaking the API. However, I wonder if we would want to provide some way to swap out the cache implementation. I'm not sure, and not sure what that would best look like if so.
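One way swapping the cache could look is a small interface keyed on the same tuple the PR's cache uses. All of these names are hypothetical; none exist in the library today.

```csharp
using System;

// Purely illustrative abstraction: callers could plug in their own policy
// (LRU, no-op, size-bounded) without the Reader knowing the details.
// The key mirrors the (offset, size, type) tuple used in this PR's cache.
public interface IDecodeCache
{
    bool TryGet((long Offset, int Size, Type Type) key, out (object Value, long Offset) entry);
    bool TryAdd((long Offset, int Size, Type Type) key, (object Value, long Offset) entry);
}

// The Reader could then grow an overload such as:
//   public Reader(string file, FileAccessMode mode, IDecodeCache cache)
```

The tradeoff is that any user-supplied implementation inherits the thread-safety obligation discussed above.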
```csharp
/// Cache size
/// </summary>
/// <returns></returns>
public int CacheSize()
```
Maybe we could leave this out unless it's necessary, to reduce adding to the public API for now.
```csharp
    Network? network
)
{
    ValueTuple<object, long> returnValue = DecodeFromCacheOrCreate(offset, size, expectedType, type, injectables, network, static (Buffer database, long offset, int size, Type type, ObjectType objectType, Decoder decoder, InjectableValues? injectableValues, Network? network) =>
```
I wonder if it would be simpler if we only cached by data section offset / pointer. You can see we actually have caching by that already when iterating the tree in FindAll(): MaxMind.Db/Reader.cs, line 297 (commit 31eb3d2).
We cache by that in MaxMind-DB-Reader-java as well.
It may make sense to extend that caching too so we don't have two kinds of caches. I didn't realize we had that cache when you first made your PR.
horgh left a comment
Sorry, I saw a few more things!
```diff
 {
     var type = CtrlData(offset, out var size, out offset);
-    return DecodeByType(expectedType, type, offset, size, out outOffset, injectables, network);
+    return DecodeByTypeFromCacheOrCreate(expectedType, type, offset, size, out outOffset, injectables, network);
```
I think we should also have a test looking up different IPs that resolve to the same offset where we use Inject. It looks like we may cache the first IP's injected value, for example, which doesn't seem desirable. Same issue with Network.
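A sketch of the kind of test being suggested, assuming xunit. The record type `RecordWithInjectedIp`, its `InjectedIp` property, the database path, and the chosen IPs are all placeholders; only `Reader.Find<T>(IPAddress)` and the three-argument constructor from this PR are taken from the surrounding diff.

```csharp
using System.Net;
using MaxMind.Db;
using Xunit;

public class CacheInjectionTests
{
    // Hypothetical test: two IPs resolving to the same data-section offset
    // must not leak each other's injected values through the cache.
    [Fact]
    public void CachedRecordsDoNotShareInjectedValues()
    {
        using var reader = new Reader("test-data/MaxMind-DB-test-decoder.mmdb",
                                      FileAccessMode.MemoryMapped, 4_096);
        var a = reader.Find<RecordWithInjectedIp>(IPAddress.Parse("1.1.1.1"));
        var b = reader.Find<RecordWithInjectedIp>(IPAddress.Parse("1.1.1.2"));

        // If the first decode were cached wholesale, b would report a's IP.
        Assert.NotEqual(a!.InjectedIp, b!.InjectedIp);
    }
}
```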
```csharp
// Else we can add. Below will most likely end up as a tail call. Do not
// mark the method as aggressive inline.
return this._Cache.TryAdd((offset, size, type), item);
```
Would the size be incorrectly bumped if the key ends up existing already?
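One way to avoid that double count is to bump the counter only when the add actually succeeds. A minimal self-contained sketch, where `CountingCache` and `_entryCount` stand in for the PR's internal cache and size accounting:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

sealed class CountingCache
{
    private readonly ConcurrentDictionary<(long, int, Type), (object, long)> _cache = new();
    private int _entryCount;   // hypothetical counter standing in for the size bump

    public bool TryAdd((long, int, Type) key, (object, long) item)
    {
        // Only count the entry if TryAdd actually inserted it; a racing
        // thread (or an earlier decode) may have stored the same key already.
        if (_cache.TryAdd(key, item))
        {
            Interlocked.Increment(ref _entryCount);
            return true;
        }
        return false;
    }

    public int EntryCount => _entryCount;
}
```

`ConcurrentDictionary.TryAdd` returns false when the key exists, so the counter only moves for genuinely new entries.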
```diff
 ReflectionUtil.CheckType(expectedType, typeof(byte[]));

-return _database.Read(offset, size);
+return (byte[])DecodeFromCacheOrCreate(offset, size, expectedType, static (Buffer database, long offset, int size) =>
```
These lower level DecodeFromCacheOrCreate calls seem to cause us to bump size a second time. Once here, and once in the call to DecodeByType.
```csharp
public bool TryGet(long offset, int size, Type type, out ValueTuple<object, long> returnValue)
{
    // Read, attempt to return a cached value
    if (this._Cache.TryGetValue((offset, size, type), out returnValue))
```
Would the cached value be modified if the caller modified the returned object? We would need to at least document how to handle that.
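One defensive option is to copy mutable values such as byte arrays on the way out, so a caller cannot corrupt the cached entry. A self-contained sketch mirroring the `TryGet` shape above (`DefensiveCache` is hypothetical):

```csharp
using System;
using System.Collections.Concurrent;

sealed class DefensiveCache
{
    private readonly ConcurrentDictionary<(long, int, Type), (object, long)> _cache = new();

    public bool TryGet((long, int, Type) key, out (object, long) entry)
    {
        if (_cache.TryGetValue(key, out entry))
        {
            // Byte arrays are mutable: hand back a copy so a caller writing
            // into the result cannot corrupt the entry other threads see.
            if (entry.Item1 is byte[] bytes)
                entry.Item1 = (byte[])bytes.Clone();
            return true;
        }
        return false;
    }

    public bool TryAdd((long, int, Type) key, (object, long) entry) => _cache.TryAdd(key, entry);
}
```

The copy trades some of the cache's allocation savings for safety; the alternative is documenting that returned values must be treated as immutable.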
Split benchmark changes out of #264.
The change also includes an example string-intern method which drops allocations by nearly 50%.
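The string-intern idea can be sketched as follows. This is a hypothetical helper, not the PR's actual method, and the ~50% figure above comes from the PR description, not this sketch:

```csharp
using System.Collections.Concurrent;
using System.Text;

static class StringPool
{
    private static readonly ConcurrentDictionary<string, string> Pool = new();

    // Decode UTF-8 bytes, then return a pooled instance so repeated decodes
    // of the same value (e.g. "United States") share one string object.
    // ConcurrentDictionary keeps the pool safe to share across threads.
    public static string Get(byte[] buffer, int offset, int count)
    {
        var s = Encoding.UTF8.GetString(buffer, offset, count);
        return Pool.GetOrAdd(s, s);
    }
}
```

Note this still allocates the decoded string before the pool lookup; only the long-lived duplicates are eliminated, which is why offset-keyed caching can do better.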