fix(utils): download tokenizer models locally via HTTP to avoid TF GCS C++ segfault on macOS by prince-shakyaa · Pull Request #661 · google-deepmind/gemma

prince-shakyaa · 2026-05-23T23:14:37Z

Description

Fixes #660
Fixes an issue where gemma does not download and cache gs:// tokenizer paths locally, so they are loaded through the TensorFlow C++ GCS client. On macOS Apple Silicon, that client crashes at Python interpreter exit with: libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed

By explicitly intercepting gs:// paths and downloading them via standard HTTP (urllib.request) directly into the local ~/.gemma/tokenizer/ cache, we completely bypass the C++ GCS client bug.

Additionally, this adds true local caching for tokenizer models. Previously the code checked the cache but never wrote to it on a miss, so the full model was fetched over the network on every run.

Changes

Modified gemma/gm/utils/_file_cache.py::maybe_get_from_cache to translate gs:// paths to https://storage.googleapis.com/ and download them natively to the cache directory before returning the local path.
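
The change described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the names gcs_to_https, cached_local_path, and the injectable fetch parameter are hypothetical, and it assumes the object is publicly readable and that its name needs no URL escaping.

```python
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path

GCS_PREFIX = "gs://"
HTTPS_PREFIX = "https://storage.googleapis.com/"


def gcs_to_https(path: str) -> str:
    """Map gs://bucket/object to the matching storage.googleapis.com URL."""
    if not path.startswith(GCS_PREFIX):
        raise ValueError(f"not a gs:// path: {path!r}")
    return HTTPS_PREFIX + path[len(GCS_PREFIX):]


def _http_fetch(url: str, dest: Path) -> None:
    """Stream the response body to dest using only the standard library."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)


def cached_local_path(path: str, cache_dir, fetch=_http_fetch) -> str:
    """Return a local path for `path`, downloading gs:// files on a cache miss.

    Hypothetical helper: non-gs:// paths are returned unchanged.
    """
    if not path.startswith(GCS_PREFIX):
        return path

    cache_dir = Path(cache_dir).expanduser()
    target = cache_dir / os.path.basename(path)
    if target.exists():
        return str(target)  # Cache hit: no network access.

    cache_dir.mkdir(parents=True, exist_ok=True)
    # Download to a temporary file first so an interrupted download
    # never leaves a truncated model in the cache.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".part")
    os.close(fd)
    try:
        fetch(gcs_to_https(path), Path(tmp_name))
        os.replace(tmp_name, target)
    finally:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
    return str(target)
```

With the cache directory from the description, a call would look like cached_local_path("gs://example-bucket/tokenizer.model", "~/.gemma/tokenizer/"), where the bucket and file name are placeholders. The first call downloads the file; later calls return the cached copy without touching the network or the TensorFlow GCS client.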

prince-shakyaa · 2026-05-23T23:36:49Z

While testing locally on macOS Apple Silicon, pytest workers were repeatedly segfaulting during teardown. I traced this to the TensorFlow C++ GCS client failing to shut down cleanly when loading the tokenizer directly from gs://.
This PR fixes the crash (and adds true local caching) by downloading the model via standard HTTP into ~/.gemma/tokenizer/ before loading it.

Let me know if you need me to make any changes.
Thank you.

Commit 57cabe0: fix(utils): download tokenizer models locally via HTTP to avoid TF GCS C++ segfault on macOS

prince-shakyaa wants to merge 1 commit into google-deepmind:main from prince-shakyaa:fix/macos-tokenizer-cache-download
