diff --git a/docs/adr/0082-lazy-published-cache-primitive.md b/docs/adr/0082-lazy-published-cache-primitive.md
new file mode 100644
index 00000000..2b0d106d
--- /dev/null
+++ b/docs/adr/0082-lazy-published-cache-primitive.md
@@ -0,0 +1,42 @@
+# Unify embedded-data caches onto a lock-free publication primitive
+
+**Date:** 2026-06-28
+**Area:** `engine`
+**Issue:** [#894](https://github.com/frostney/GocciaScript/issues/894)
+
+The engine lazily loads several immutable embedded-data tables — Unicode property ranges, RegExp case-fold and non-Unicode-uppercase pairs, the IANA time-zone and CLDR resource blobs, and the available-locale list — and then reads them concurrently from the regex/lexer worker-thread pool. [#813](https://github.com/frostney/GocciaScript/issues/813) converted the RegExp case-fold/uppercase accessor from "lock on every read" to a double-checked lock-free publication: the `Loaded` flag is published last behind a `WriteBarrier` and read behind a matching `ReadBarrier`, so a reader that observes `Loaded = True` also observes the fully-written table on weakly-ordered targets (AArch64). That was the first memory-barrier use in the engine and was kept deliberately local.
+
+The sibling caches shared the same lazy-init shape **without** the barrier discipline, in three different and individually-incorrect or costly ways:
+
+- `Goccia.Temporal.TimeZoneData` and `Goccia.Intl.CLDRData` read and wrote their `Cached…Resource` / `Cached…ResourceLoaded` globals with **no synchronization at all** — both a cold-load data race and a publication gap on weakly-ordered CPUs.
+- `Goccia.RegExp.UnicodeData`'s own `TryReadEmbeddedResource` (the UCD blob) and `Goccia.Identifier` / `IntlLocaleResolver` (identifier ID_Start/ID_Continue ranges and the available-locale list) were **correct but always-locked**: they entered a `TRTLCriticalSection` and copied the cached dynamic array out on *every* call, including the lexer's per-non-ASCII-code-point identifier hot path.
+
+## Decision
+
+Introduce one generic record, `TLazyPublishedCache<T>` in `source/shared/LazyPublishedCache.pas`, that owns the lazy one-shot load plus barrier-correct lock-free publication for an immutable value of any type `T`. It bundles `Data`, the `Loaded`/`Available` flags and the `Lock` into a single record, and exposes `Init`/`Done` (lock lifecycle) and `Ensure(const AKey; const ALoader): Boolean`. `Ensure` runs the cold load once under the lock, publishes `Loaded` last behind a `WriteBarrier`, and serves the warm path lock-free behind a `ReadBarrier`; load failure is memoized (`Loaded = True, Available = False`) so an absent or corrupt resource is not re-attempted. Each consumer supplies a small unit-level loader `function(const AKey; out AData: T): Boolean` and reads `Cache.Data` in place via a `const` argument, which makes no managed copy.
+
+All eight embedded-data caches across five units now consume it:
+
+| Unit | Caches | Was |
+|------|--------|-----|
+| `Goccia.Temporal.TimeZoneData` | TZ resource blob | unsynchronized |
+| `Goccia.Intl.CLDRData` | CLDR resource blob | unsynchronized |
+| `Goccia.RegExp.UnicodeData` | UCD blob; case-fold pairs; non-Unicode-uppercase pairs | lock-only + hand-rolled barrier DCL (#813) |
+| `Goccia.Identifier` | ID_Start ranges; ID_Continue ranges | always-locked, copies out |
+| `IntlLocaleResolver` | available-locale list | always-locked, copies out |
+
+The barrier discipline now lives in exactly one place instead of being re-asserted (or omitted) per call site, a mismatched `(data, flag, lock)` pairing is a compile-time error, and the warm-path zero-copy read that #813 gave the case-fold tables now also covers the lexer's identifier hot path.
+
+## Rejected alternatives
+
+- **Per-unit duplication of the idiom (status quo extended).** Apply the #813 barrier DCL in-place to each cache. Correct, but re-asserts the ordering at every site, keeps the loose `(array, flag, lock)` globals, and was exactly what this issue set out to remove.
+- **Two-phase record API with no callback** (`TryWarm` / `BeginColdLoad` / `Publish`). Avoids the loader function pointer, but only *partly* centralizes the discipline: every one of the eight sites must still order the phases correctly, so the barrier rope stays in the callers.
+- **A concrete `TBytes` resource-buffer helper plus separate handling for the derived tables.** Removes the most duplicate concrete code (the three near-identical resource readers) but introduces two abstractions and still leaves the pair/range/locale caches without a shared publication primitive — not the single primitive the issue targeted.
+
+## Consequences
+
+- Behavior-preserving: the full JavaScript suite passes in both interpreter and bytecode modes, and the affected RegExp/Intl/Temporal areas pass 100%. The publication ordering is the only semantic change.
+- TimeZone and CLDR resource loads now take a one-shot cold-load lock (closing the data race) and memoize load failure (previously they retried the resource lookup on every miss).
+- FPC lowers the barriers to `dmb ishld` / `dmb ishst` on AArch64 and to compiler barriers on x86 (TSO), so publication is correct on both without a per-read lock.
+- A weak-memory publication race is non-deterministic and not reproducible by a JavaScript test, so the primitive's functional contract (load-once, memoize-failure, data-visible-iff-available, payload-agnostic) is locked in by the Pascal gate `LazyPublishedCache.Test`, matching the [thread-local cleanup](0078-thread-local-cleanup-registry.md) precedent of Pascal gates for concurrency infrastructure.
+- Audit (per the issue): `Goccia.Temporal.TimeZone`'s `Cached…` tables are `threadvar` (per-thread, no cross-thread publication) and the ICU `EnsureLoaded` / `WindowsICULoadAttempted` paths are FFI library-binding inits rather than immutable data tables; both are a different shape and stay out of scope.
diff --git a/docs/adr/README.md b/docs/adr/README.md
index fad08841..a0cb3084 100644
--- a/docs/adr/README.md
+++ b/docs/adr/README.md
@@ -91,3 +91,4 @@ Durable architecture and implementation decisions for GocciaScript. New ADRs use
 - [0079 — Keep speculatively-scanned tokens across parenthesized-group probes](0079-keep-speculatively-scanned-tokens.md)
 - [0080 — FormatDouble first-hit precision scan](0080-formatdouble-first-hit-precision-scan.md)
 - [0081 — Reject shared value caches as a runtime optimization](0081-reject-value-caches-for-allocation-reduction.md)
+- [0082 — Unify embedded-data caches onto a lock-free publication primitive](0082-lazy-published-cache-primitive.md)
diff --git a/source/shared/IntlLocaleResolver.pas b/source/shared/IntlLocaleResolver.pas
index 67cf8909..437ecc1d 100644
--- a/source/shared/IntlLocaleResolver.pas
+++ b/source/shared/IntlLocaleResolver.pas
@@ -24,13 +24,12 @@ implementation
   BCP47,
   IntlICU,
+  LazyPublishedCache,
   Goccia.Intl.CLDRData;
 var
-  AvailableLocalesCache: IntlTypes.TStringArray;
-  AvailableLocalesLoaded: Boolean;
-  AvailableLocalesLock: TRTLCriticalSection;
+  AvailableLocalesCache: TLazyPublishedCache<IntlTypes.TStringArray>;
 const
   CLDR_REGIONAL_AVAILABLE_LOCALES: array[0..8] of string = (
@@ -98,41 +97,43 @@ function LocaleWithoutUnicodeExtension(const ALocale: string): string; overload;
   Result := LocaleWithoutUnicodeExtension(Parsed);
 end;
-// ECMA-402 ES2026 supportedLocalesOf constructors use [[AvailableLocales]].
-function AvailableLocaleList: IntlTypes.TStringArray;
+function LoadAvailableLocales(const AKey: string;
+  out ALocales: IntlTypes.TStringArray): Boolean;
 var
   Available: IntlTypes.TStringArray;
   Canonical: string;
   I, Count: Integer;
 begin
-  EnterCriticalSection(AvailableLocalesLock);
-  try
-    if not AvailableLocalesLoaded then
+  SetLength(ALocales, 0);
+  if TryICUGetAvailableLocales(Available) then
+  begin
+    Count := 0;
+    SetLength(ALocales, Length(Available));
+    for I := 0 to High(Available) do
     begin
-      SetLength(AvailableLocalesCache, 0);
-      if TryICUGetAvailableLocales(Available) then
-      begin
-        Count := 0;
-        SetLength(AvailableLocalesCache, Length(Available));
-        for I := 0 to High(Available) do
-        begin
-          Canonical := CanonicalizeUnicodeLocaleId(Available[I]);
-          AppendUniqueLocale(AvailableLocalesCache, Count, Canonical);
-        end;
-        for I := Low(CLDR_REGIONAL_AVAILABLE_LOCALES) to
-          High(CLDR_REGIONAL_AVAILABLE_LOCALES) do
-          AppendUniqueLocale(AvailableLocalesCache, Count,
-            CLDR_REGIONAL_AVAILABLE_LOCALES[I]);
-        AppendUniqueLocale(AvailableLocalesCache, Count, DefaultLocale);
-        SetLength(AvailableLocalesCache, Count);
-      end;
-      AvailableLocalesLoaded := True;
+      Canonical := CanonicalizeUnicodeLocaleId(Available[I]);
+      AppendUniqueLocale(ALocales, Count, Canonical);
     end;
-
-    Result := AvailableLocalesCache;
-  finally
-    LeaveCriticalSection(AvailableLocalesLock);
+    for I := Low(CLDR_REGIONAL_AVAILABLE_LOCALES) to
+      High(CLDR_REGIONAL_AVAILABLE_LOCALES) do
+      AppendUniqueLocale(ALocales, Count,
+        CLDR_REGIONAL_AVAILABLE_LOCALES[I]);
+    AppendUniqueLocale(ALocales, Count, DefaultLocale);
+    SetLength(ALocales, Count);
   end;
+  // The available-locale list is authoritative even when empty: if ICU is
+  // unavailable the engine resolves against an empty set and does not retry, so
+  // the load always publishes a usable value (preserving the pre-#894
+  // memoize-on-first-call behavior, where the loaded flag was set
+  // unconditionally and the cached list returned as-is).
+  Result := True;
+end;
+
+// ECMA-402 ES2026 supportedLocalesOf constructors use [[AvailableLocales]].
+function AvailableLocaleList: IntlTypes.TStringArray;
+begin
+  AvailableLocalesCache.Ensure('', @LoadAvailableLocales);
+  Result := AvailableLocalesCache.Data;
 end;
 function SplitBySeparator(const AValue: string; const ASeparator: Char): IntlTypes.TStringArray;
@@ -590,11 +591,9 @@ function DefaultLocale: string;
 end;
 initialization
-  InitCriticalSection(AvailableLocalesLock);
-  AvailableLocalesLoaded := False;
+  AvailableLocalesCache.Init;
 finalization
-  DoneCriticalSection(AvailableLocalesLock);
-  SetLength(AvailableLocalesCache, 0);
+  AvailableLocalesCache.Done;
 end.
diff --git a/source/shared/LazyPublishedCache.Test.pas b/source/shared/LazyPublishedCache.Test.pas
new file mode 100644
index 00000000..61e89443
--- /dev/null
+++ b/source/shared/LazyPublishedCache.Test.pas
@@ -0,0 +1,162 @@
+program LazyPublishedCache.Test;
+
+{$I Shared.inc}
+
+uses
+  SysUtils,
+
+  LazyPublishedCache,
+  TestingPascalLibrary;
+
+type
+  TIntArray = array of Integer;
+
+var
+  GBytesLoads: Integer;
+  GIntLoads: Integer;
+  GFailLoads: Integer;
+  GLastKey: string;
+
+function LoadBytesOk(const AKey: string; out AData: TBytes): Boolean;
+begin
+  Inc(GBytesLoads);
+  GLastKey := AKey;
+  SetLength(AData, 3);
+  AData[0] := 10;
+  AData[1] := 20;
+  AData[2] := 30;
+  Result := True;
+end;
+
+function LoadIntsOk(const AKey: string; out AData: TIntArray): Boolean;
+begin
+  Inc(GIntLoads);
+  SetLength(AData, 2);
+  AData[0] := 100;
+  AData[1] := 200;
+  Result := True;
+end;
+
+function LoadFails(const AKey: string; out AData: TBytes): Boolean;
+begin
+  Inc(GFailLoads);
+  SetLength(AData, 0);
+  Result := False;
+end;
+
+function LoadPartialThenFails(const AKey: string; out AData: TBytes): Boolean;
+begin
+  // Writes a payload into the out slot, then reports failure — mimics a
+  // resource read that sizes its buffer before a failing ReadBuffer.
+  SetLength(AData, 4);
+  AData[0] := 1;
+  Result := False;
+end;
+
+type
+  TLazyPublishedCacheTests = class(TTestSuite)
+  private
+    procedure TestLoadsOnceAndPublishes;
+    procedure TestMemoizesFailureWithoutRetrying;
+    procedure TestFailedLoadDropsPartialData;
+    procedure TestPassesKeyToLoader;
+    procedure TestWorksForAnyPayloadType;
+  public
+    procedure SetupTests; override;
+  end;
+
+procedure TLazyPublishedCacheTests.SetupTests;
+begin
+  Test('Loads once and publishes data for warm reads', TestLoadsOnceAndPublishes);
+  Test('Memoizes load failure and does not retry', TestMemoizesFailureWithoutRetrying);
+  Test('Drops partial data when the loader fails', TestFailedLoadDropsPartialData);
+  Test('Passes the key through to the loader', TestPassesKeyToLoader);
+  Test('Works for any payload type', TestWorksForAnyPayloadType);
+end;
+
+procedure TLazyPublishedCacheTests.TestLoadsOnceAndPublishes;
+var
+  Cache: TLazyPublishedCache<TBytes>;
+begin
+  GBytesLoads := 0;
+  Cache.Init;
+  try
+    Expect(Cache.Ensure('k', @LoadBytesOk)).ToBe(True);
+    // Warm read: still available, loader is not invoked again.
+    Expect(Cache.Ensure('k', @LoadBytesOk)).ToBe(True);
+    Expect(GBytesLoads).ToBe(1);
+    Expect(Length(Cache.Data)).ToBe(3);
+    Expect(Cache.Data[1]).ToBe(20);
+  finally
+    Cache.Done;
+  end;
+end;
+
+procedure TLazyPublishedCacheTests.TestMemoizesFailureWithoutRetrying;
+var
+  Cache: TLazyPublishedCache<TBytes>;
+begin
+  GFailLoads := 0;
+  Cache.Init;
+  try
+    Expect(Cache.Ensure('k', @LoadFails)).ToBe(False);
+    // Failure is memoized, so the loader is not re-attempted on the next call.
+    Expect(Cache.Ensure('k', @LoadFails)).ToBe(False);
+    Expect(GFailLoads).ToBe(1);
+    Expect(Length(Cache.Data)).ToBe(0);
+  finally
+    Cache.Done;
+  end;
+end;
+
+procedure TLazyPublishedCacheTests.TestFailedLoadDropsPartialData;
+var
+  Cache: TLazyPublishedCache<TBytes>;
+begin
+  Cache.Init;
+  try
+    Expect(Cache.Ensure('k', @LoadPartialThenFails)).ToBe(False);
+    // The payload the loader wrote into Data before failing must not survive.
+    Expect(Length(Cache.Data)).ToBe(0);
+  finally
+    Cache.Done;
+  end;
+end;
+
+procedure TLazyPublishedCacheTests.TestPassesKeyToLoader;
+var
+  Cache: TLazyPublishedCache<TBytes>;
+begin
+  GBytesLoads := 0;
+  GLastKey := '';
+  Cache.Init;
+  try
+    Cache.Ensure('case-fold-key', @LoadBytesOk);
+    Expect(GLastKey).ToBe('case-fold-key');
+  finally
+    Cache.Done;
+  end;
+end;
+
+procedure TLazyPublishedCacheTests.TestWorksForAnyPayloadType;
+var
+  Cache: TLazyPublishedCache<TIntArray>;
+begin
+  GIntLoads := 0;
+  Cache.Init;
+  try
+    Expect(Cache.Ensure('k', @LoadIntsOk)).ToBe(True);
+    Expect(Cache.Ensure('k', @LoadIntsOk)).ToBe(True);
+    Expect(GIntLoads).ToBe(1);
+    Expect(Cache.Data[0]).ToBe(100);
+    Expect(Cache.Data[1]).ToBe(200);
+  finally
+    Cache.Done;
+  end;
+end;
+
+begin
+  TestRunnerProgram.AddSuite(TLazyPublishedCacheTests.Create('LazyPublishedCache'));
+  TestRunnerProgram.Run;
+  ExitCode := TestResultToExitCode;
+end.
diff --git a/source/shared/LazyPublishedCache.pas b/source/shared/LazyPublishedCache.pas
new file mode 100644
index 00000000..02c7cdff
--- /dev/null
+++ b/source/shared/LazyPublishedCache.pas
@@ -0,0 +1,103 @@
+unit LazyPublishedCache;
+
+{$I Shared.inc}
+
+interface
+
+type
+  { Lazy one-shot load plus barrier-correct lock-free publication of an
+    immutable value of any type T.
+
+    The engine caches several immutable embedded-data tables (Unicode
+    property ranges, case-fold and uppercase pairs, IANA time-zone and CLDR
+    resource blobs, available-locale lists) that are loaded once and then read
+    concurrently from a worker-thread pool. Each cache wants the same shape: a
+    cold one-shot load under mutual exclusion, then a warm read path that never
+    enters the critical section.
+
+    The Loaded flag is published LAST. The cold path writes Data and Available
+    first, issues a WriteBarrier, then sets Loaded := True; the warm path reads
+    Loaded and, on a hit, issues a matching ReadBarrier before reading
+    Available/Data. A reader that observes Loaded = True therefore also observes
+    the fully written immutable value on weakly-ordered targets (e.g. AArch64,
+    where FPC lowers the barriers to dmb ishld/ishst); on strongly-ordered
+    targets (x86 TSO) the barriers still act as compiler barriers. This is the
+    double-checked publication idiom introduced for the RegExp case-fold tables
+    in #813, generalized so every cache shares one barrier-correct
+    implementation instead of re-asserting the ordering per call site.
+
+    Load failure is memoized as well (Loaded = True, Available = False), so an
+    absent or corrupt resource is not re-attempted on every call. Bundling
+    Data, the flags and the Lock into one record makes a mismatched
+    (data, flag, lock) pairing a compile-time error rather than a latent
+    convention. Callers that read hot tables in place pass Data as a const
+    argument (TLazyPublishedCache<T>.Data), which makes no managed copy. }
+  TLazyPublishedCache<T> = record
+  public type
+    { Cold-load callback. Receives the entry/property/resource key the cache
+      was created for, fills AData, and returns whether the load produced a
+      usable value. Invoked at most once, under the cache lock. }
+    TLoader = function(const AKey: string; out AData: T): Boolean;
+  var
+    Data: T;
+    Loaded: Boolean;
+    Available: Boolean;
+    Lock: TRTLCriticalSection;
+    { Prepare the lock. Call once from the owning unit's initialization. }
+    procedure Init;
+    { Release the lock. Call once from the owning unit's finalization. }
+    procedure Done;
+    { Ensure the value is loaded and published, then return whether it is
+      available. The warm path takes no lock. }
+    function Ensure(const AKey: string; const ALoader: TLoader): Boolean;
+  end;
+
+implementation
+
+procedure TLazyPublishedCache<T>.Init;
+begin
+  InitCriticalSection(Lock);
+  Loaded := False;
+  Available := False;
+end;
+
+procedure TLazyPublishedCache<T>.Done;
+begin
+  DoneCriticalSection(Lock);
+end;
+
+function TLazyPublishedCache<T>.Ensure(const AKey: string;
+  const ALoader: TLoader): Boolean;
+begin
+  if Loaded then
+  begin
+    ReadBarrier;
+    Result := Available;
+    Exit;
+  end;
+
+  EnterCriticalSection(Lock);
+  try
+    if Loaded then
+    begin
+      Result := Available;
+      Exit;
+    end;
+
+    Available := ALoader(AKey, Data);
+    // A loader that partially fills Data and then fails (e.g. a resource read
+    // that sizes its buffer before a failing ReadBuffer) would otherwise leave
+    // that payload resident for the process lifetime, since Loaded stays True
+    // and the slot is never reloaded. Drop it so a failed load publishes an
+    // empty value rather than a stale partial one.
+    if not Available then
+      Data := Default(T);
+    WriteBarrier;
+    Loaded := True;
+    Result := Available;
+  finally
+    LeaveCriticalSection(Lock);
+  end;
+end;
+
+end.
diff --git a/source/units/Goccia.Identifier.pas b/source/units/Goccia.Identifier.pas
index 14bbf8e5..b0c620bf 100644
--- a/source/units/Goccia.Identifier.pas
+++ b/source/units/Goccia.Identifier.pas
@@ -12,17 +12,11 @@ implementation
 uses
   SysUtils,
+  LazyPublishedCache,
   UnicodeICU,
   Goccia.RegExp.UnicodeData;
-type
-  TIdentifierRangeCache = record
-    Ranges: TUnicodePropertyRangeArray;
-    Loaded: Boolean;
-    Available: Boolean;
-  end;
-
 const
   IDENTIFIER_START_PROPERTY = 'ID_Start';
   IDENTIFIER_PART_PROPERTY = 'ID_Continue';
@@ -30,9 +24,8 @@ TIdentifierRangeCache = record
   ZERO_WIDTH_JOINER_CODE_POINT = $200D;
 var
-  IdentifierStartCache: TIdentifierRangeCache;
-  IdentifierPartCache: TIdentifierRangeCache;
-  IdentifierRangeLock: TRTLCriticalSection;
+  IdentifierStartCache: TLazyPublishedCache<TUnicodePropertyRangeArray>;
+  IdentifierPartCache: TLazyPublishedCache<TUnicodePropertyRangeArray>;
 function IsASCIIIdentifierStartCodePoint(ACodePoint: Cardinal): Boolean;
 begin
@@ -55,39 +48,6 @@ function TryLoadIdentifierRanges(const APropertyName: string;
   Result := TryICUGetUnicodePropertyRanges(APropertyName, '', ARanges);
 end;
-function TryGetCachedIdentifierRanges(const APropertyName: string;
-  var ACache: TIdentifierRangeCache;
-  out ARanges: TUnicodePropertyRangeArray): Boolean;
-var
-  LoadedRanges: TUnicodePropertyRangeArray;
-begin
-  EnterCriticalSection(IdentifierRangeLock);
-  try
-    if ACache.Loaded then
-    begin
-      ARanges := ACache.Ranges;
-      Result := ACache.Available;
-      Exit;
-    end;
-
-    ACache.Loaded := True;
-    ACache.Available := False;
-    SetLength(ACache.Ranges, 0);
-    SetLength(LoadedRanges, 0);
-
-    if TryLoadIdentifierRanges(APropertyName, LoadedRanges) then
-    begin
-      ACache.Ranges := LoadedRanges;
-      ACache.Available := True;
-    end;
-
-    ARanges := ACache.Ranges;
-    Result := ACache.Available;
-  finally
-    LeaveCriticalSection(IdentifierRangeLock);
-  end;
-end;
-
 function RangeContainsCodePoint(const ARanges: TUnicodePropertyRangeArray;
   ACodePoint: Cardinal): Boolean;
 var
@@ -110,8 +70,6 @@ function RangeContainsCodePoint(const ARanges: TUnicodePropertyRangeArray;
 // ES2026 §12.7 IdentifierStartChar
 function IsIdentifierStartCodePoint(ACodePoint: Cardinal): Boolean;
-var
-  Ranges: TUnicodePropertyRangeArray;
 begin
   if ACodePoint > $10FFFF then
     Exit(False);
@@ -119,16 +77,14 @@ function IsIdentifierStartCodePoint(ACodePoint: Cardinal): Boolean;
     Exit(True);
   if ACodePoint <= $7F then
     Exit(False);
-  if not TryGetCachedIdentifierRanges(IDENTIFIER_START_PROPERTY,
-    IdentifierStartCache, Ranges) then
+  if not IdentifierStartCache.Ensure(IDENTIFIER_START_PROPERTY,
+    @TryLoadIdentifierRanges) then
     Exit(False);
-  Result := RangeContainsCodePoint(Ranges, ACodePoint);
+  Result := RangeContainsCodePoint(IdentifierStartCache.Data, ACodePoint);
 end;
 // ES2026 §12.7 IdentifierPartChar
 function IsIdentifierPartCodePoint(ACodePoint: Cardinal): Boolean;
-var
-  Ranges: TUnicodePropertyRangeArray;
 begin
   if ACodePoint > $10FFFF then
     Exit(False);
@@ -139,20 +95,18 @@ function IsIdentifierPartCodePoint(ACodePoint: Cardinal): Boolean;
     Exit(True);
   if ACodePoint <= $7F then
     Exit(False);
-  if not TryGetCachedIdentifierRanges(IDENTIFIER_PART_PROPERTY,
-    IdentifierPartCache, Ranges) then
+  if not IdentifierPartCache.Ensure(IDENTIFIER_PART_PROPERTY,
+    @TryLoadIdentifierRanges) then
     Exit(False);
-  Result := RangeContainsCodePoint(Ranges, ACodePoint);
+  Result := RangeContainsCodePoint(IdentifierPartCache.Data, ACodePoint);
 end;
 initialization
-  InitCriticalSection(IdentifierRangeLock);
-  IdentifierStartCache.Loaded := False;
-  IdentifierStartCache.Available := False;
-  IdentifierPartCache.Loaded := False;
-  IdentifierPartCache.Available := False;
+  IdentifierStartCache.Init;
+  IdentifierPartCache.Init;
 finalization
-  DoneCriticalSection(IdentifierRangeLock);
+  IdentifierPartCache.Done;
+  IdentifierStartCache.Done;
 end.
diff --git a/source/units/Goccia.Intl.CLDRData.pas b/source/units/Goccia.Intl.CLDRData.pas
index 190100d4..ebb48c55 100644
--- a/source/units/Goccia.Intl.CLDRData.pas
+++ b/source/units/Goccia.Intl.CLDRData.pas
@@ -47,7 +47,8 @@ implementation
   SysUtils,
   EmbeddedResourceReader,
-  Generated.IntlData;
+  Generated.IntlData,
+  LazyPublishedCache;
 const
   CLDR_RCDATA_RESOURCE_TYPE = MAKEINTRESOURCE(10);
@@ -55,26 +56,18 @@ implementation
     (Ord('G'), Ord('O'), Ord('C'), Ord('C'), Ord('I'), Ord('A'), Ord('C'), Ord('L'));
 var
-  CachedCLDRResource: TBytes;
-  CachedCLDRResourceLoaded: Boolean;
+  CLDRResourceCache: TLazyPublishedCache<TBytes>;
-function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+function LoadCLDRResource(const AKey: string; out ABuffer: TBytes): Boolean;
 var
   Stream: TResourceStream;
   BufferSize: Integer;
 begin
-  if CachedCLDRResourceLoaded then
-  begin
-    ABuffer := CachedCLDRResource;
-    Result := Length(ABuffer) > 0;
-    Exit;
-  end;
-
   Result := False;
   SetLength(ABuffer, 0);
   Stream := nil;
   try
-    Stream := TResourceStream.Create(HInstance, GeneratedIntlDataResourceName,
+    Stream := TResourceStream.Create(HInstance, AKey,
       CLDR_RCDATA_RESOURCE_TYPE);
     if Stream.Size > High(Integer) then
     begin
@@ -88,14 +81,21 @@ function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
     Stream.ReadBuffer(ABuffer[0], BufferSize);
     Stream.Free;
     Stream := nil;
-    CachedCLDRResource := ABuffer;
-    CachedCLDRResourceLoaded := True;
     Result := True;
   except
     Stream.Free;
   end;
 end;
+function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+begin
+  if CLDRResourceCache.Ensure(GeneratedIntlDataResourceName, @LoadCLDRResource) then
+    ABuffer := CLDRResourceCache.Data
+  else
+    SetLength(ABuffer, 0);
+  Result := Length(ABuffer) > 0;
+end;
+
 function TryGetSectionData(const ASectionName: string; out AResource: TBytes;
   out ADataOffset, ADataLength: Integer): Boolean;
 var
@@ -770,6 +770,12 @@ function TryGetLocaleWeekInfo(const ARegion: string; out AFirstDay,
     TryParseInteger(Copy(Value, ThirdColon + 1, Length(Value) - ThirdColon), AMinimalDays);
 end;
+initialization
+  CLDRResourceCache.Init;
+
+finalization
+  CLDRResourceCache.Done;
+
 {$ELSE}
 function TryGetLikelySubtags(const ATag: string; out AMaximized: string): Boolean;
diff --git a/source/units/Goccia.RegExp.UnicodeData.pas b/source/units/Goccia.RegExp.UnicodeData.pas
index 6f96b3bd..e96a8530 100644
--- a/source/units/Goccia.RegExp.UnicodeData.pas
+++ b/source/units/Goccia.RegExp.UnicodeData.pas
@@ -28,7 +28,8 @@ implementation
   SysUtils,
   EmbeddedResourceReader,
-  Generated.UnicodeData;
+  Generated.UnicodeData,
+  LazyPublishedCache;
 type
   TCodePointPair = record
@@ -111,61 +112,47 @@ function TryExtractCaseFoldPairs(const ABuffer: TBytes;
 end;
 var
-  CachedUCDResource: TBytes;
-  CachedUCDResourceLoaded: Boolean;
-  CachedUCDResourceLock: TRTLCriticalSection;
-  CachedCaseFoldPairs: TCodePointPairArray;
-  CachedCaseFoldPairsLoaded: Boolean;
-  CachedCaseFoldPairsAvailable: Boolean;
-  CachedCaseFoldPairsLock: TRTLCriticalSection;
-  CachedNonUnicodeUppercasePairs: TCodePointPairArray;
-  CachedNonUnicodeUppercasePairsLoaded: Boolean;
-  CachedNonUnicodeUppercasePairsAvailable: Boolean;
-  CachedNonUnicodeUppercasePairsLock: TRTLCriticalSection;
+  UCDResourceCache: TLazyPublishedCache<TBytes>;
+  CaseFoldPairsCache: TLazyPublishedCache<TCodePointPairArray>;
+  NonUnicodeUppercasePairsCache: TLazyPublishedCache<TCodePointPairArray>;
-function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+function LoadUCDResource(const AKey: string; out ABuffer: TBytes): Boolean;
 var
   Stream: TResourceStream;
   BufferSize: Integer;
 begin
-  EnterCriticalSection(CachedUCDResourceLock);
+  Result := False;
+  SetLength(ABuffer, 0);
+  Stream := nil;
   try
-    if CachedUCDResourceLoaded then
+    Stream := TResourceStream.Create(HInstance, AKey, UCD_RCDATA_RESOURCE_TYPE);
+    if Stream.Size > High(Integer) then
     begin
-      ABuffer := CachedUCDResource;
-      Result := Length(ABuffer) > 0;
+      Stream.Free;
       Exit;
     end;
-    Result := False;
-    SetLength(ABuffer, 0);
+    BufferSize := Integer(Stream.Size);
+    SetLength(ABuffer, BufferSize);
+    if BufferSize > 0 then
+      Stream.ReadBuffer(ABuffer[0], BufferSize);
+    Stream.Free;
     Stream := nil;
-    try
-      Stream := TResourceStream.Create(HInstance, GeneratedUnicodeDataResourceName,
-        UCD_RCDATA_RESOURCE_TYPE);
-      if Stream.Size > High(Integer) then
-      begin
-        Stream.Free;
-        Exit;
-      end;
-
-      BufferSize := Integer(Stream.Size);
-      SetLength(ABuffer, BufferSize);
-      if BufferSize > 0 then
-        Stream.ReadBuffer(ABuffer[0], BufferSize);
-      Stream.Free;
-      Stream := nil;
-      CachedUCDResource := ABuffer;
-      CachedUCDResourceLoaded := True;
-      Result := True;
-    except
-      Stream.Free;
-    end;
-  finally
-    LeaveCriticalSection(CachedUCDResourceLock);
+    Result := True;
+  except
+    Stream.Free;
   end;
 end;
+function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+begin
+  if UCDResourceCache.Ensure(GeneratedUnicodeDataResourceName, @LoadUCDResource) then
+    ABuffer := UCDResourceCache.Data
+  else
+    SetLength(ABuffer, 0);
+  Result := Length(ABuffer) > 0;
+end;
+
 function TryGetEmbeddedPropertyRanges(const AKey: string;
   out ARanges: TUnicodePropertyRangeArray): Boolean;
 var
@@ -188,70 +175,43 @@ function TryGetEmbeddedPropertyRanges(const AKey: string;
   Result := TryExtractRanges(Resource, Container, Entry, ARanges);
 end;
-{ Lazily load an immutable case-fold/uppercase pair table and publish it for
-  lock-free reads. The table is consulted on every code-point comparison of an
-  icase backreference and of a unicode+icase \b assertion, so after the cold
-  one-shot load the warm path must not enter the critical section (issue #813).
-  ACachedPairsLoaded is published LAST, after the table data and
-  ACachedPairsAvailable, with a write barrier on the cold path and a matching
-  read barrier on the warm path; a reader that observes ACachedPairsLoaded =
-  True therefore also observes the fully-written table on weakly-ordered
-  targets (e.g. AArch64). Load failure is memoized too (Loaded = True,
-  Available = False) so an absent or corrupt resource is not re-attempted on
-  every call. Callers binary-search the module-level array in place (passed as
-  a const argument, which makes no managed copy), avoiding per-comparison
-  dynamic-array refcount churn. }
-function EnsureEmbeddedPairsReady(const AKey: string;
-  var ACachedPairs: TCodePointPairArray;
-  var ACachedPairsLoaded, ACachedPairsAvailable: Boolean;
-  var ACachedPairsLock: TRTLCriticalSection): Boolean;
+{ Lazily load the immutable case-fold/uppercase pair table named AKey. The cold
+  load (resource read, container/entry lookup, pair extraction) runs once; the
+  table is then consulted on every code-point comparison of an icase
+  backreference and of a unicode+icase \b assertion, so the warm path must not
+  take a lock. The barrier-correct lazy-publication discipline lives in
+  TLazyPublishedCache (introduced for these tables in #813, unified across the
+  engine's embedded-data caches in #894): the loaded flag is published last
+  behind a write barrier and read behind a matching read barrier, so a
+  concurrent regex worker that observes the table as loaded also observes its
+  fully-written contents on weakly-ordered targets. Load failure is memoized so
+  an absent or corrupt resource is not re-attempted. Callers binary-search the
+  cache's Data array in place (passed as a const argument, which makes no
+  managed copy), avoiding per-comparison dynamic-array refcount churn. }
+function LoadEmbeddedPairs(const AKey: string;
+  out APairs: TCodePointPairArray): Boolean;
 var
   Resource: TBytes;
   Container: TEmbeddedResourceContainer;
   Entry: TEmbeddedResourceEntry;
 begin
-  if ACachedPairsLoaded then
-  begin
-    ReadBarrier;
-    Result := ACachedPairsAvailable;
-    Exit;
-  end;
-
-  EnterCriticalSection(ACachedPairsLock);
-  try
-    if ACachedPairsLoaded then
-    begin
-      Result := ACachedPairsAvailable;
-      Exit;
-    end;
-
-    SetLength(ACachedPairs, 0);
-    ACachedPairsAvailable :=
-      TryReadEmbeddedResource(Resource) and
-      TryReadEmbeddedResourceContainer(Resource, UCD_MAGIC, Container) and
-      TryFindEmbeddedResourceEntry(Resource, AKey, Container, Entry) and
-      TryExtractCaseFoldPairs(Resource, Container, Entry, ACachedPairs);
-
-    WriteBarrier;
-    ACachedPairsLoaded := True;
-    Result := ACachedPairsAvailable;
-  finally
-    LeaveCriticalSection(ACachedPairsLock);
-  end;
+  SetLength(APairs, 0);
+  Result :=
+    TryReadEmbeddedResource(Resource) and
+    TryReadEmbeddedResourceContainer(Resource, UCD_MAGIC, Container) and
+    TryFindEmbeddedResourceEntry(Resource, AKey, Container, Entry) and
+    TryExtractCaseFoldPairs(Resource, Container, Entry, APairs);
 end;
 function EnsureCaseFoldPairsReady: Boolean;
 begin
-  Result := EnsureEmbeddedPairsReady(CASE_FOLDING_ENTRY_KEY, CachedCaseFoldPairs,
-    CachedCaseFoldPairsLoaded, CachedCaseFoldPairsAvailable,
-    CachedCaseFoldPairsLock);
+  Result := CaseFoldPairsCache.Ensure(CASE_FOLDING_ENTRY_KEY, @LoadEmbeddedPairs);
 end;
 function EnsureNonUnicodeUppercasePairsReady: Boolean;
 begin
-  Result := EnsureEmbeddedPairsReady(NON_UNICODE_UPPERCASE_ENTRY_KEY,
-    CachedNonUnicodeUppercasePairs, CachedNonUnicodeUppercasePairsLoaded,
-    CachedNonUnicodeUppercasePairsAvailable, CachedNonUnicodeUppercasePairsLock);
+  Result := NonUnicodeUppercasePairsCache.Ensure(NON_UNICODE_UPPERCASE_ENTRY_KEY,
+    @LoadEmbeddedPairs);
 end;
 function TryFindPairTarget(const APairs: TCodePointPairArray;
@@ -290,7 +250,7 @@ function TryGetUnicodeSimpleCaseFold(ACodePoint: Cardinal;
   out AFoldedCodePoint: Cardinal): Boolean;
 begin
   if EnsureCaseFoldPairsReady then
-    Result := TryFindPairTarget(CachedCaseFoldPairs, ACodePoint,
+    Result := TryFindPairTarget(CaseFoldPairsCache.Data, ACodePoint,
       AFoldedCodePoint)
   else
   begin
@@ -393,7 +353,7 @@ function TryGetRegExpNonUnicodeUppercase(ACodePoint: Cardinal;
   out AUpperCodePoint: Cardinal): Boolean;
 begin
   if EnsureNonUnicodeUppercasePairsReady then
-    Result := TryFindPairTarget(CachedNonUnicodeUppercasePairs, ACodePoint,
+    Result := TryFindPairTarget(NonUnicodeUppercasePairsCache.Data, ACodePoint,
       AUpperCodePoint)
   else
   begin
@@ -430,7 +390,7 @@ procedure ExpandUnicodeSimpleCaseFolding(
   var ARanges: TUnicodePropertyRangeArray);
 begin
   if EnsureCaseFoldPairsReady then
-    ExpandCaseEquivalence(ARanges, CachedCaseFoldPairs);
+    ExpandCaseEquivalence(ARanges, CaseFoldPairsCache.Data);
 end;
 procedure ReduceUnicodeSimpleCaseFoldClosed(
@@ -448,17 +408,17 @@ procedure ReduceUnicodeSimpleCaseFoldClosed(
   for I := 0 to High(ARanges) do
     OriginalRanges[I] := ARanges[I];
-  for I := 0 to High(CachedCaseFoldPairs) do
+  for I := 0 to High(CaseFoldPairsCache.Data) do
   begin
-    Target := CachedCaseFoldPairs[I].Target;
+    Target := CaseFoldPairsCache.Data[I].Target;
     HasInside := RangeContainsCodePoint(OriginalRanges, Target);
     HasOutside := not HasInside;
-    for J := 0 to High(CachedCaseFoldPairs) do
-      if CachedCaseFoldPairs[J].Target = Target then
+    for J := 0 to High(CaseFoldPairsCache.Data) do
+      if CaseFoldPairsCache.Data[J].Target = Target then
       begin
         if RangeContainsCodePoint(OriginalRanges,
-          CachedCaseFoldPairs[J].Source) then
+          CaseFoldPairsCache.Data[J].Source) then
           HasInside := True
         else
           HasOutside := True;
@@ -468,9 +428,9 @@ procedure ReduceUnicodeSimpleCaseFoldClosed(
       Continue;
     RemoveFoldRange(ARanges, Target);
-    for J := 0 to High(CachedCaseFoldPairs) do
-      if CachedCaseFoldPairs[J].Target = Target then
-        RemoveFoldRange(ARanges, CachedCaseFoldPairs[J].Source);
+    for J := 0 to High(CaseFoldPairsCache.Data) do
+      if CaseFoldPairsCache.Data[J].Target = Target then
+        RemoveFoldRange(ARanges, CaseFoldPairsCache.Data[J].Source);
   end;
 end;

@@ -478,18 +438,18 @@ procedure ExpandRegExpNonUnicodeCaseFolding(
   var ARanges: TUnicodePropertyRangeArray);
 begin
   if EnsureNonUnicodeUppercasePairsReady then
-    ExpandCaseEquivalence(ARanges, CachedNonUnicodeUppercasePairs);
+    ExpandCaseEquivalence(ARanges, NonUnicodeUppercasePairsCache.Data);
 end;

 initialization
-  InitCriticalSection(CachedUCDResourceLock);
-  InitCriticalSection(CachedCaseFoldPairsLock);
-  InitCriticalSection(CachedNonUnicodeUppercasePairsLock);
+  UCDResourceCache.Init;
+  CaseFoldPairsCache.Init;
+  NonUnicodeUppercasePairsCache.Init;

 finalization
-  DoneCriticalSection(CachedNonUnicodeUppercasePairsLock);
-  DoneCriticalSection(CachedCaseFoldPairsLock);
-  DoneCriticalSection(CachedUCDResourceLock);
+  NonUnicodeUppercasePairsCache.Done;
+  CaseFoldPairsCache.Done;
+  UCDResourceCache.Done;

 {$ELSE}

diff --git a/source/units/Goccia.Temporal.TimeZoneData.pas b/source/units/Goccia.Temporal.TimeZoneData.pas
index d19928cb..a007ef5a 100644
--- a/source/units/Goccia.Temporal.TimeZoneData.pas
+++ b/source/units/Goccia.Temporal.TimeZoneData.pas
@@ -22,7 +22,8 @@ implementation
   Classes,

   EmbeddedResourceReader,
-  Generated.TimeZoneData;
+  Generated.TimeZoneData,
+  LazyPublishedCache;

 const
   TIME_ZONE_RCDATA_RESOURCE_TYPE = MAKEINTRESOURCE(10);
@@ -30,26 +31,18 @@ implementation
     (Ord('G'), Ord('O'), Ord('C'), Ord('C'), Ord('I'), Ord('A'), Ord('T'), Ord('Z'));

 var
-  CachedTZResource: TBytes;
-  CachedTZResourceLoaded: Boolean;
+  TZResourceCache: TLazyPublishedCache;

-function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+function LoadTZResource(const AKey: string; out ABuffer: TBytes): Boolean;
 var
   Stream: TResourceStream;
   BufferSize: Integer;
 begin
-  if CachedTZResourceLoaded then
-  begin
-    ABuffer := CachedTZResource;
-    Result := Length(ABuffer) > 0;
-    Exit;
-  end;
-
   Result := False;
   SetLength(ABuffer, 0);
   Stream := nil;
   try
-    Stream := TResourceStream.Create(HInstance, GeneratedTimeZoneDataResourceName,
+    Stream := TResourceStream.Create(HInstance, AKey,
       TIME_ZONE_RCDATA_RESOURCE_TYPE);
     if Stream.Size > High(Integer) then
     begin
@@ -63,14 +56,21 @@ function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
     Stream.ReadBuffer(ABuffer[0], BufferSize);
     Stream.Free;
     Stream := nil;
-    CachedTZResource := ABuffer;
-    CachedTZResourceLoaded := True;
     Result := True;
   except
     Stream.Free;
   end;
 end;

+function TryReadEmbeddedResource(out ABuffer: TBytes): Boolean;
+begin
+  if TZResourceCache.Ensure(GeneratedTimeZoneDataResourceName, @LoadTZResource) then
+    ABuffer := TZResourceCache.Data
+  else
+    SetLength(ABuffer, 0);
+  Result := Length(ABuffer) > 0;
+end;
+
 function TryGetEmbeddedTimeZoneFile(const ATimeZone: string; out ABytes: TBytes): Boolean;
 var
   Resource: TBytes;
@@ -179,6 +179,12 @@ function TryCanonicalizeEmbeddedTimeZoneFileName(const ATimeZone: string;
   end;
 end;

+initialization
+  TZResourceCache.Init;
+
+finalization
+  TZResourceCache.Done;
+
 {$ELSE}

 function TryGetEmbeddedTimeZoneFile(const ATimeZone: string; out ABytes: TBytes): Boolean;