42 changes: 42 additions & 0 deletions docs/adr/0082-lazy-published-cache-primitive.md
@@ -0,0 +1,42 @@
# Unify embedded-data caches onto a lock-free publication primitive

**Date:** 2026-06-28
**Area:** `engine`
**Issue:** [#894](https://github.com/frostney/GocciaScript/issues/894)

The engine lazily loads several immutable embedded-data tables — Unicode property ranges, RegExp case-fold and non-Unicode-uppercase pairs, the IANA time-zone and CLDR resource blobs, and the available-locale list — and then reads them concurrently from the regex/lexer worker-thread pool. [#813](https://github.com/frostney/GocciaScript/issues/813) converted the RegExp case-fold/uppercase accessor from "lock on every read" to a double-checked lock-free publication: the `Loaded` flag is published last behind a `WriteBarrier` and read behind a matching `ReadBarrier`, so a reader that observes `Loaded = True` also observes the fully-written table on weakly-ordered targets (AArch64). That was the first memory-barrier use in the engine and was kept deliberately local.

The sibling caches shared the same lazy-init shape **without** the barrier discipline, each in a way that was either incorrect or costly:

- `Goccia.Temporal.TimeZoneData` and `Goccia.Intl.CLDRData` read and wrote their `Cached…Resource` / `Cached…ResourceLoaded` globals with **no synchronization at all** — both a cold-load data race and a publication gap on weakly-ordered CPUs.
- `Goccia.RegExp.UnicodeData`'s own `TryReadEmbeddedResource` (the UCD blob) and `Goccia.Identifier` / `IntlLocaleResolver` (identifier ID_Start/ID_Continue ranges and the available-locale list) were **correct but always-locked**: they entered a `TRTLCriticalSection` and copied the cached dynamic array out on *every* call, including the lexer's per-non-ASCII-code-point identifier hot path.
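The two pre-existing shapes described in the list above can be sketched roughly as follows. This is a minimal illustration, not the removed code: `CachedBlob`, `CachedBlobLoaded`, `CachedBlobLock`, `LoadBlob` and both accessor names are hypothetical stand-ins for the loose `(data, flag, lock)` globals.

```pascal
program PreUnificationShapes;

{$mode delphi}

uses
  {$ifdef unix}cthreads,{$endif}
  SysUtils;

var
  // Hypothetical stand-ins for the loose (data, flag, lock) globals.
  CachedBlob: TBytes;
  CachedBlobLoaded: Boolean = False;
  CachedBlobLock: TRTLCriticalSection;

procedure LoadBlob(out ABlob: TBytes);
begin
  SetLength(ABlob, 3);
  ABlob[0] := 1;
  ABlob[1] := 2;
  ABlob[2] := 3;
end;

// Shape 1: unsynchronized. Two threads can both run the cold load, and
// without a barrier the flag may become visible before the data does.
function GetBlobUnsynchronized: TBytes;
begin
  if not CachedBlobLoaded then
  begin
    LoadBlob(CachedBlob);
    CachedBlobLoaded := True;
  end;
  Result := CachedBlob;
end;

// Shape 2: correct but always locked. Every call, warm or cold, takes the
// lock and hands out a managed (reference-counted) copy of the array.
function GetBlobAlwaysLocked: TBytes;
begin
  EnterCriticalSection(CachedBlobLock);
  try
    if not CachedBlobLoaded then
    begin
      LoadBlob(CachedBlob);
      CachedBlobLoaded := True;
    end;
    Result := CachedBlob;
  finally
    LeaveCriticalSection(CachedBlobLock);
  end;
end;

begin
  InitCriticalSection(CachedBlobLock);
  try
    WriteLn('locked read length: ', Length(GetBlobAlwaysLocked));
    WriteLn('unsynchronized read length: ', Length(GetBlobUnsynchronized));
  finally
    DoneCriticalSection(CachedBlobLock);
  end;
end.
```

Run on a single thread, both accessors behave identically; the difference only shows up under concurrent readers, which is why the first shape is easy to miss in ordinary testing.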

## Decision

Introduce one generic record, `TLazyPublishedCache<T>` in `source/shared/LazyPublishedCache.pas`, that owns the lazy one-shot load plus barrier-correct lock-free publication for an immutable value of any type `T`. It bundles `Data`, the `Loaded`/`Available` flags and the `Lock` into a single record, and exposes `Init`/`Done` (lock lifecycle) and `Ensure(const AKey; const ALoader): Boolean`. `Ensure` runs the cold load once under the lock, publishes `Loaded` last behind a `WriteBarrier`, and serves the warm path lock-free behind a `ReadBarrier`; load failure is memoized (`Loaded = True, Available = False`) so an absent or corrupt resource is not re-attempted. Each consumer supplies a small unit-level loader `function(const AKey; out AData: T): Boolean` and reads `Cache.Data` in place via a `const` argument, which makes no managed copy.
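As a rough illustration of the shape described above, the primitive might look like the sketch below. The real declaration lives in `source/shared/LazyPublishedCache.pas` and may differ; the nested `TLoader` type name, the `string` key type, the use of `Default(T)` to drop a failed payload, and the demo loader `LoadDemoBytes` are assumptions made for this example.

```pascal
program LazyPublishedCacheSketch;

{$mode delphi}

uses
  {$ifdef unix}cthreads,{$endif}
  SysUtils;

type
  // Hypothetical shape inferred from the ADR text, not the shipped unit.
  TLazyPublishedCache<T> = record
  public
    type
      TLoader = function(const AKey: string; out AData: T): Boolean;
    var
      Data: T;
      Loaded: Boolean;
      Available: Boolean;
      Lock: TRTLCriticalSection;
    procedure Init;
    procedure Done;
    function Ensure(const AKey: string; const ALoader: TLoader): Boolean;
  end;

procedure TLazyPublishedCache<T>.Init;
begin
  Data := Default(T);
  Loaded := False;
  Available := False;
  InitCriticalSection(Lock);
end;

procedure TLazyPublishedCache<T>.Done;
begin
  DoneCriticalSection(Lock);
  Data := Default(T);
end;

function TLazyPublishedCache<T>.Ensure(const AKey: string;
  const ALoader: TLoader): Boolean;
begin
  // Warm path: no lock. The ReadBarrier keeps reads of Data/Available from
  // being ordered before the read of Loaded.
  if Loaded then
  begin
    ReadBarrier;
    Exit(Available);
  end;

  // Cold path: one-shot load under the lock, re-checking the flag.
  EnterCriticalSection(Lock);
  try
    if not Loaded then
    begin
      Available := ALoader(AKey, Data);
      if not Available then
        Data := Default(T); // memoize failure, drop any partial payload
      WriteBarrier;         // Data and Available are written before Loaded
      Loaded := True;       // published last
    end;
    Result := Available;
  finally
    LeaveCriticalSection(Lock);
  end;
end;

var
  GLoads: Integer = 0;
  Cache: TLazyPublishedCache<TBytes>;

// Hypothetical loader used only for this demonstration.
function LoadDemoBytes(const AKey: string; out AData: TBytes): Boolean;
begin
  Inc(GLoads);
  SetLength(AData, 2);
  AData[0] := 10;
  AData[1] := 20;
  Result := True;
end;

begin
  Cache.Init;
  try
    Cache.Ensure('demo', @LoadDemoBytes); // cold load
    Cache.Ensure('demo', @LoadDemoBytes); // warm read, loader not re-run
    WriteLn('loads: ', GLoads, ', length: ', Length(Cache.Data));
  finally
    Cache.Done;
  end;
end.
```

The `ReadBarrier` on the warm path pairs with the `WriteBarrier` on the cold path, so a reader that sees `Loaded = True` also sees the finished `Data`. Keeping both barriers inside `Ensure` is what lets call sites stay free of ordering concerns.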

All eight embedded-data caches across five units now consume it:

| Unit | Caches | Was |
|------|--------|-----|
| `Goccia.Temporal.TimeZoneData` | TZ resource blob | unsynchronized |
| `Goccia.Intl.CLDRData` | CLDR resource blob | unsynchronized |
| `Goccia.RegExp.UnicodeData` | UCD blob; case-fold pairs; non-Unicode-uppercase pairs | lock-only + hand-rolled barrier DCL (#813) |
| `Goccia.Identifier` | ID_Start ranges; ID_Continue ranges | always-locked, copies out |
| `IntlLocaleResolver` | available-locale list | always-locked, copies out |

The barrier discipline now lives in exactly one place instead of being re-asserted (or omitted) per call site, a mismatched `(data, flag, lock)` pairing is a compile-time error, and the warm-path zero-copy read that #813 gave the case-fold tables now also covers the lexer's identifier hot path.

## Rejected alternatives

- **Per-unit duplication of the idiom (status quo extended).** Apply the #813 barrier DCL in-place to each cache. Correct, but re-asserts the ordering at every site, keeps the loose `(array, flag, lock)` globals, and was exactly what this issue set out to remove.
- **Two-phase record API with no callback** (`TryWarm` / `BeginColdLoad` / `Publish`). Avoids the loader function pointer, but only *partly* centralizes the discipline: every one of the eight sites must still call the phases in the correct order, so responsibility for the barrier ordering stays in the callers.
- **A concrete `TBytes` resource-buffer helper plus separate handling for the derived tables.** Removes the most duplicate concrete code (the three near-identical resource readers) but introduces two abstractions and still leaves the pair/range/locale caches without a shared publication primitive — not the single primitive the issue targeted.

## Consequences

- Behavior-preserving: the full JavaScript suite passes in both interpreter and bytecode modes, and the affected RegExp/Intl/Temporal areas pass 100%. The publication ordering is the only semantic change.
- TimeZone and CLDR resource loads now take a one-shot cold-load lock (closing the data race) and memoize load failure (previously they retried the resource lookup on every miss).
- FPC lowers the barriers to `dmb ishld` / `dmb ishst` on AArch64 and to compiler barriers on x86 (TSO), so publication is correct on both without a per-read lock.
- A weak-memory publication race is non-deterministic and not reproducible by a JavaScript test, so the primitive's functional contract (load-once, memoize-failure, data-visible-iff-available, payload-agnostic) is locked in by the Pascal gate `LazyPublishedCache.Test`, matching the [thread-local cleanup](0078-thread-local-cleanup-registry.md) precedent of Pascal gates for concurrency infrastructure.
- Audit (per the issue): `Goccia.Temporal.TimeZone`'s `Cached…` tables are `threadvar` (per-thread, no cross-thread publication) and the ICU `EnsureLoaded` / `WindowsICULoadAttempted` paths are FFI library-binding inits rather than immutable data tables; both are a different shape and stay out of scope.
1 change: 1 addition & 0 deletions docs/adr/README.md
@@ -91,3 +91,4 @@ Durable architecture and implementation decisions for GocciaScript. New ADRs use
 - [0079 — Keep speculatively-scanned tokens across parenthesized-group probes](0079-keep-speculatively-scanned-tokens.md)
 - [0080 — FormatDouble first-hit precision scan](0080-formatdouble-first-hit-precision-scan.md)
 - [0081 — Reject shared value caches as a runtime optimization](0081-reject-value-caches-for-allocation-reduction.md)
+- [0082 — Unify embedded-data caches onto a lock-free publication primitive](0082-lazy-published-cache-primitive.md)
67 changes: 33 additions & 34 deletions source/shared/IntlLocaleResolver.pas
@@ -24,13 +24,12 @@ implementation

   BCP47,
   IntlICU,
+  LazyPublishedCache,

   Goccia.Intl.CLDRData;

 var
-  AvailableLocalesCache: IntlTypes.TStringArray;
-  AvailableLocalesLoaded: Boolean;
-  AvailableLocalesLock: TRTLCriticalSection;
+  AvailableLocalesCache: TLazyPublishedCache<IntlTypes.TStringArray>;

 const
   CLDR_REGIONAL_AVAILABLE_LOCALES: array[0..8] of string = (
@@ -98,41 +97,43 @@ function LocaleWithoutUnicodeExtension(const ALocale: string): string; overload;
   Result := LocaleWithoutUnicodeExtension(Parsed);
 end;

-// ECMA-402 ES2026 supportedLocalesOf constructors use [[AvailableLocales]].
-function AvailableLocaleList: IntlTypes.TStringArray;
+function LoadAvailableLocales(const AKey: string;
+  out ALocales: IntlTypes.TStringArray): Boolean;
 var
   Available: IntlTypes.TStringArray;
   Canonical: string;
   I, Count: Integer;
 begin
-  EnterCriticalSection(AvailableLocalesLock);
-  try
-    if not AvailableLocalesLoaded then
+  SetLength(ALocales, 0);
+  if TryICUGetAvailableLocales(Available) then
+  begin
+    Count := 0;
+    SetLength(ALocales, Length(Available));
+    for I := 0 to High(Available) do
     begin
-      SetLength(AvailableLocalesCache, 0);
-      if TryICUGetAvailableLocales(Available) then
-      begin
-        Count := 0;
-        SetLength(AvailableLocalesCache, Length(Available));
-        for I := 0 to High(Available) do
-        begin
-          Canonical := CanonicalizeUnicodeLocaleId(Available[I]);
-          AppendUniqueLocale(AvailableLocalesCache, Count, Canonical);
-        end;
-        for I := Low(CLDR_REGIONAL_AVAILABLE_LOCALES) to
-          High(CLDR_REGIONAL_AVAILABLE_LOCALES) do
-          AppendUniqueLocale(AvailableLocalesCache, Count,
-            CLDR_REGIONAL_AVAILABLE_LOCALES[I]);
-        AppendUniqueLocale(AvailableLocalesCache, Count, DefaultLocale);
-        SetLength(AvailableLocalesCache, Count);
-      end;
-      AvailableLocalesLoaded := True;
+      Canonical := CanonicalizeUnicodeLocaleId(Available[I]);
+      AppendUniqueLocale(ALocales, Count, Canonical);
     end;
-
-    Result := AvailableLocalesCache;
-  finally
-    LeaveCriticalSection(AvailableLocalesLock);
+    for I := Low(CLDR_REGIONAL_AVAILABLE_LOCALES) to
+      High(CLDR_REGIONAL_AVAILABLE_LOCALES) do
+      AppendUniqueLocale(ALocales, Count,
+        CLDR_REGIONAL_AVAILABLE_LOCALES[I]);
+    AppendUniqueLocale(ALocales, Count, DefaultLocale);
+    SetLength(ALocales, Count);
   end;
+  // The available-locale list is authoritative even when empty: if ICU is
+  // unavailable the engine resolves against an empty set and does not retry, so
+  // the load always publishes a usable value (preserving the pre-#894
+  // memoize-on-first-call behavior, where the loaded flag was set
+  // unconditionally and the cached list returned as-is).
+  Result := True;
+end;
+
+// ECMA-402 ES2026 supportedLocalesOf constructors use [[AvailableLocales]].
+function AvailableLocaleList: IntlTypes.TStringArray;
+begin
+  AvailableLocalesCache.Ensure('', @LoadAvailableLocales);
+  Result := AvailableLocalesCache.Data;
 end;

 function SplitBySeparator(const AValue: string; const ASeparator: Char): IntlTypes.TStringArray;
@@ -590,11 +591,9 @@ function DefaultLocale: string;
 end;

 initialization
-  InitCriticalSection(AvailableLocalesLock);
-  AvailableLocalesLoaded := False;
+  AvailableLocalesCache.Init;

 finalization
-  DoneCriticalSection(AvailableLocalesLock);
-  SetLength(AvailableLocalesCache, 0);
+  AvailableLocalesCache.Done;

 end.
162 changes: 162 additions & 0 deletions source/shared/LazyPublishedCache.Test.pas
@@ -0,0 +1,162 @@
program LazyPublishedCache.Test;

{$I Shared.inc}

uses
SysUtils,

LazyPublishedCache,
TestingPascalLibrary;

type
TIntArray = array of Integer;

var
GBytesLoads: Integer;
GIntLoads: Integer;
GFailLoads: Integer;
GLastKey: string;

function LoadBytesOk(const AKey: string; out AData: TBytes): Boolean;
begin
Inc(GBytesLoads);
GLastKey := AKey;
SetLength(AData, 3);
AData[0] := 10;
AData[1] := 20;
AData[2] := 30;
Result := True;
end;

function LoadIntsOk(const AKey: string; out AData: TIntArray): Boolean;
begin
Inc(GIntLoads);
SetLength(AData, 2);
AData[0] := 100;
AData[1] := 200;
Result := True;
end;

function LoadFails(const AKey: string; out AData: TBytes): Boolean;
begin
Inc(GFailLoads);
SetLength(AData, 0);
Result := False;
end;

function LoadPartialThenFails(const AKey: string; out AData: TBytes): Boolean;
begin
// Writes a payload into the out slot, then reports failure — mimics a
// resource read that sizes its buffer before a failing ReadBuffer.
SetLength(AData, 4);
AData[0] := 1;
Result := False;
end;

type
TLazyPublishedCacheTests = class(TTestSuite)
private
procedure TestLoadsOnceAndPublishes;
procedure TestMemoizesFailureWithoutRetrying;
procedure TestFailedLoadDropsPartialData;
procedure TestPassesKeyToLoader;
procedure TestWorksForAnyPayloadType;
public
procedure SetupTests; override;
end;

procedure TLazyPublishedCacheTests.SetupTests;
begin
Test('Loads once and publishes data for warm reads', TestLoadsOnceAndPublishes);
Test('Memoizes load failure and does not retry', TestMemoizesFailureWithoutRetrying);
Test('Drops partial data when the loader fails', TestFailedLoadDropsPartialData);
Test('Passes the key through to the loader', TestPassesKeyToLoader);
Test('Works for any payload type', TestWorksForAnyPayloadType);
end;

procedure TLazyPublishedCacheTests.TestLoadsOnceAndPublishes;
var
Cache: TLazyPublishedCache<TBytes>;
begin
GBytesLoads := 0;
Cache.Init;
try
Expect<Boolean>(Cache.Ensure('k', @LoadBytesOk)).ToBe(True);
// Warm read: still available, loader is not invoked again.
Expect<Boolean>(Cache.Ensure('k', @LoadBytesOk)).ToBe(True);
Expect<Integer>(GBytesLoads).ToBe(1);
Expect<Integer>(Length(Cache.Data)).ToBe(3);
Expect<Integer>(Cache.Data[1]).ToBe(20);
finally
Cache.Done;
end;
end;

procedure TLazyPublishedCacheTests.TestMemoizesFailureWithoutRetrying;
var
Cache: TLazyPublishedCache<TBytes>;
begin
GFailLoads := 0;
Cache.Init;
try
Expect<Boolean>(Cache.Ensure('k', @LoadFails)).ToBe(False);
// Failure is memoized, so the loader is not re-attempted on the next call.
Expect<Boolean>(Cache.Ensure('k', @LoadFails)).ToBe(False);
Expect<Integer>(GFailLoads).ToBe(1);
Expect<Integer>(Length(Cache.Data)).ToBe(0);
finally
Cache.Done;
end;
end;

procedure TLazyPublishedCacheTests.TestFailedLoadDropsPartialData;
var
Cache: TLazyPublishedCache<TBytes>;
begin
Cache.Init;
try
Expect<Boolean>(Cache.Ensure('k', @LoadPartialThenFails)).ToBe(False);
// The payload the loader wrote into Data before failing must not survive.
Expect<Integer>(Length(Cache.Data)).ToBe(0);
finally
Cache.Done;
end;
end;

procedure TLazyPublishedCacheTests.TestPassesKeyToLoader;
var
Cache: TLazyPublishedCache<TBytes>;
begin
GBytesLoads := 0;
GLastKey := '';
Cache.Init;
try
Cache.Ensure('case-fold-key', @LoadBytesOk);
Expect<string>(GLastKey).ToBe('case-fold-key');
finally
Cache.Done;
end;
end;

procedure TLazyPublishedCacheTests.TestWorksForAnyPayloadType;
var
Cache: TLazyPublishedCache<TIntArray>;
begin
GIntLoads := 0;
Cache.Init;
try
Expect<Boolean>(Cache.Ensure('k', @LoadIntsOk)).ToBe(True);
Expect<Boolean>(Cache.Ensure('k', @LoadIntsOk)).ToBe(True);
Expect<Integer>(GIntLoads).ToBe(1);
Expect<Integer>(Cache.Data[0]).ToBe(100);
Expect<Integer>(Cache.Data[1]).ToBe(200);
finally
Cache.Done;
end;
end;

begin
TestRunnerProgram.AddSuite(TLazyPublishedCacheTests.Create('LazyPublishedCache'));
TestRunnerProgram.Run;
ExitCode := TestResultToExitCode;
end.