jedisct1/hf-mount-encrypted

 
 


hf-mount with client-side encryption

Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.

This fork includes transparent client-side encryption.

Bucket creation:

hf buckets create my-bucket

Generate a 32-byte secret key in a file named key.bin and point the mount at it:

# 32 random bytes (or 64 hex characters)
head -c 32 /dev/urandom > key.bin
hf-mount start --encryption-key-file key.bin bucket myuser/my-bucket /tmp/data

See "Client-side encryption" below for more information.

Commands will pick up your HF_TOKEN from the environment, or you can pass it explicitly with --hf-token.

Then use your local folders as usual:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("/tmp/gpt-oss")  # reads on demand, no download step

hf-mount exposes Hugging Face Buckets and Hub repos as a local filesystem via FUSE or NFS. Files are fetched lazily on first read, so only the bytes your code actually touches ever hit the network.
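
As an illustration of lazy reads, the sketch below inspects a safetensors file through the mount by reading only its header. It relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header); the file path in the usage note is hypothetical, and the mount may fetch somewhat more than the requested bytes because of prefetching.

```python
import json
import struct


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The file starts with an unsigned little-endian 64-bit header length.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Only the header bytes are requested; tensor payloads are never read here.
        return json.loads(f.read(header_len))
```

For example, `read_safetensors_header("/tmp/gpt-oss/model.safetensors")` (hypothetical file name) would list tensor names, dtypes, and shapes without pulling the weights.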

Two backends are available:

  • NFS (recommended) -- works everywhere, no root, no kernel extension
  • FUSE -- tighter kernel integration, requires root or macFUSE on macOS

Agentic storage: Agents don't require complex APIs or SDKs, they thrive on the filesystem: ls, cat, find, grep, and the power of composable UNIX pipelines.

Install

Homebrew (macOS, Linux)

brew install hf-mount

On macOS, this installs the NFS backend only (hf-mount, hf-mount-nfs). For the FUSE backend on macOS, download the binary manually or build from source — macFUSE is closed-source and not distributable through homebrew-core.

Manual download

Binaries are available on GitHub Releases:

| Platform | Daemon | NFS | FUSE |
| --- | --- | --- | --- |
| Linux x86_64 | hf-mount-x86_64-linux | hf-mount-nfs-x86_64-linux | hf-mount-fuse-x86_64-linux |
| Linux aarch64 | hf-mount-aarch64-linux | hf-mount-nfs-aarch64-linux | hf-mount-fuse-aarch64-linux |
| macOS Apple Silicon | hf-mount-arm64-apple-darwin | hf-mount-nfs-arm64-apple-darwin | hf-mount-fuse-arm64-apple-darwin |

System dependencies (FUSE only)

The NFS backend has no system dependencies. For FUSE:

Linux: sudo apt-get install -y fuse3 (pre-built binaries only need the runtime; building from source also requires libfuse3-dev)

macOS: install macFUSE (brew install macfuse, requires reboot on first install)

Build from source

Requires Rust 1.89+.

# NFS only (no system deps, works everywhere)
cargo build --release --features nfs

# FUSE (requires macFUSE on macOS, fuse3 on Linux)
cargo build --release --features fuse

# All backends
cargo build --release --features fuse,nfs

Binaries: target/release/hf-mount, target/release/hf-mount-nfs, target/release/hf-mount-fuse

Best for / Not for

Best for:

  • Loading models and datasets without downloading the full repo
  • Browsing repo contents (ls, cat, find) without cloning
  • Read-heavy ML workloads (training, inference, evaluation)
  • Environments where disk space is limited

Not for:

  • General-purpose networked filesystem (no multi-writer support, no cross-node file locking)
  • Latency-sensitive random I/O (first reads require network round-trips)
  • Workloads that need strong consistency (with default settings, files can be stale for up to 10 s on FUSE and up to the 30 s poll interval on NFS)
  • Heavy concurrent writes from multiple mounts (last writer wins, no conflict detection)
  • Editing files with text editors in default (streaming) mode (use --advanced-writes)

Advisory file locks (flock, fcntl POSIX record locks) are supported locally on a single mount on both backends — enough for Python filelock, huggingface_hub, datasets, and similar cache-coordination use cases within one machine. They are not coordinated across multiple clients.
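
A minimal sketch of the single-machine coordination described above, using Python's standard fcntl.flock. The helper name and lock path are hypothetical, and the lock file must live on a writable mount. The lock only serializes processes that share the same mount.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory flock on lock_path for the duration of the block."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until no other holder remains
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)
```

Usage: wrap cache population in `with exclusive_lock("/tmp/data/.cache.lock"): ...` so that two local processes do not fill the same cache entry at once.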

See Consistency model for details.

Usage

Mount a repo (read-only)

# Public model (no token needed)
hf-mount start repo openai/gpt-oss-20b /tmp/model

# Private model
hf-mount start --hf-token $HF_TOKEN repo myorg/my-private-model /tmp/model

# Dataset
hf-mount start repo datasets/open-index/hacker-news /tmp/hn

# Specific revision
hf-mount start repo openai-community/gpt2 /tmp/gpt2 --revision v1.0

# Subfolder only
hf-mount start repo openai-community/gpt2/onnx /tmp/onnx

Mount a Bucket (read-write)

Buckets are S3-like object storage on the Hub, designed for large-scale mutable data (training checkpoints, logs, artifacts) without git version control.

hf-mount start --hf-token $HF_TOKEN bucket myuser/my-bucket /tmp/data

# Read-only
hf-mount start --hf-token $HF_TOKEN --read-only bucket myuser/my-bucket /tmp/data

# Subfolder only
hf-mount start --hf-token $HF_TOKEN bucket myuser/my-bucket/checkpoints /tmp/ckpts

Manage mounts

hf-mount status                  # list running mounts
hf-mount stop /tmp/data          # stop and unmount

Logs are written to ~/.hf-mount/logs/. PID files are stored in ~/.hf-mount/pids/.

FUSE backend

By default, hf-mount uses NFS. Pass --fuse for tighter kernel integration (page cache invalidation, per-file metadata revalidation). Requires fuse3 on Linux or macFUSE on macOS.

hf-mount start --fuse --hf-token $HF_TOKEN bucket myuser/my-bucket /mnt/data

Foreground mode

For scripts, containers, or debugging, use the backend binaries directly (they run in the foreground):

hf-mount-nfs repo gpt2 /tmp/gpt2
hf-mount-fuse --hf-token $HF_TOKEN bucket myuser/my-bucket /mnt/data

macOS: launch as a daemon with launchd

To have hf-mount start automatically on login, create a LaunchAgent:

label=co.huggingface.hf-mount

mkdir -p ~/Library/LaunchAgents

cat > ~/Library/LaunchAgents/$label.plist <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>$label</string>
    <key>ProgramArguments</key>
    <array>
        <string>$HOME/.local/bin/hf-mount-nfs</string>
        <string>repo</string>
        <string>openai/gpt-oss-20b</string>
        <string>/tmp/gpt-oss</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
    <key>KeepAlive</key>
    <true/>
    <key>StandardOutPath</key>
    <string>/tmp/hf-mount.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/hf-mount.log</string>
</dict>
</plist>
EOF

launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/$label.plist

To stop: launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/$label.plist

Unmount

umount /tmp/data                 # NFS or FUSE (macOS)
fusermount -u /tmp/data          # FUSE (Linux)
hf-mount stop /tmp/data          # daemon mounts

Graceful shutdown (SIGTERM / CSI sidecar)

On SIGTERM the sidecar bounds the dirty-data flush (--flush-shutdown-timeout-ms) and disarms the kernel-cache invalidators before draining. This is deliberate: a FUSE_NOTIFY_INVAL_INODE writev issued during teardown can block uninterruptibly in the kernel (waiting on a folio under writeback the exiting daemon can no longer complete), leaving a D-state thread that even exit_group can't reap — an unkillable pod. Disarming the invalidator avoids creating that wedge; the CSI driver aborting the FUSE connection on NodeUnpublishVolume is the kernel-level backstop for any already-in-flight notify.

The same FUSE_NOTIFY_INVAL_INODE writev can also wedge at runtime (not just on shutdown): when the poll loop detects a remote change to a file the app currently has open, a full page-cache invalidation blocks in-kernel on a folio lock held by the app's in-flight read(), which is itself waiting for the daemon — a deadlock. Two guards prevent this: invalidations targeting an inode with open handles drop attributes only (a negative-offset notify the kernel never lets touch pages), and the blocking writev runs on the runtime's blocking pool rather than a core worker, so it can never starve FUSE request servicing. The trade-off: a file with a long-lived open handle won't see remote content updates refreshed in its page cache until the handle closes. On close-and-reopen the kernel revalidates the (attr-only-invalidated) attributes and, via the negotiated FUSE_AUTO_INVAL_DATA, drops the stale pages itself when it sees the new mtime/size — so a stale read only persists if a remote content change preserves both mtime and size on a file that was open at the moment of the change.

Options

| Flag | Default | Description |
| --- | --- | --- |
| --hf-token | $HF_TOKEN | HF API token (required for private repos/buckets) |
| --hub-endpoint | https://huggingface.co | Hub API endpoint |
| --cache-dir | /tmp/hf-mount-cache | Local cache directory |
| --cache-size | 10000000000 (~10 GB) | Max on-disk chunk cache size in bytes |
| --cache-mode | chunk | Disk cache layer: chunk (xet-core xorb-range cache) or file (whole-file cache keyed by xet hash, avoids chunk-range fragmentation on warm reloads). Mutually exclusive; file disables the chunk cache. |
| --max-staging-size | 0 (unlimited) | Max bytes for advanced-writes staging files before flushed files are garbage-collected (LRU by last-touched). 0 disables GC, so staging files persist as a read-after-write cache. Does not yet cover the HTTP download cache for non-Xet repo files. |
| --read-only | false | Mount read-only (always on for repos) |
| --advanced-writes | false | Enable staging files + async flush (random writes, seek, overwrite) |
| --poll-interval-secs | 30 | Remote change polling interval (0 to disable) |
| --poll-listing-concurrency | 4 | Max concurrent tree-listing requests per poll round. Main knob to throttle load on the Hub /api endpoint; lower it in shared environments where many mounts poll in parallel. |
| --max-threads | 16 | Maximum FUSE worker threads (Linux only) |
| --metadata-ttl-ms | 10000 | How long file metadata is cached before re-checking (ms) |
| --metadata-ttl-minimal | false | Re-check on every access (maximum freshness, lower throughput) |
| --flush-debounce-ms | 2000 | Advanced writes: flush debounce delay (ms) |
| --flush-max-batch-window-ms | 30000 | Advanced writes: max flush batch window (ms) |
| --flush-shutdown-timeout-ms | 45000 | Advanced writes: max time the SIGTERM flush drain may run before abandoning unflushed data to guarantee exit. Must be < the pod's terminationGracePeriodSeconds, or a slow Hub/CAS backend keeps the FUSE connection alive past grace and strands the pod. |
| --no-disk-cache | false | Disable local chunk cache (every read fetches from HF) |
| --direct-io | false | Bypass the kernel page cache (FOPEN_DIRECT_IO); every read goes through the FUSE handler. For benchmarking; not recommended in production (disables efficient mmap caching). |
| --no-filter-os-files | false | Stop filtering OS junk files (.DS_Store, Thumbs.db, etc.) |
| --uid / --gid | current user | Override UID/GID for mounted files |
| --fuse-owner-only | false | Restrict mount access to the mounting user only (FUSE only; by default all users can access, which requires user_allow_other in /etc/fuse.conf) |
| --token-file | | Path to a token file (re-read on each request for credential rotation) |
| --inode-soft-limit | 0 | Soft cap on the in-memory inode table (0 disables). See "Bounding inode memory" below. |
| --lru-sweep-interval-ms | 5000 | Background LRU sweep interval in milliseconds. Only meaningful when --inode-soft-limit > 0. |
| --overlay | false | Treat the mount point as a writable local layer over the remote source. Local files persist on disk; writes are never pushed to the remote. See "Overlay mode" below. |
| --encryption-key-file | | Path to a 32-byte master key (raw or 64 hex chars). Encrypts contents and names client-side. Requires --features encrypt; implies --advanced-writes. See "Client-side encryption". |
| --encryption-algorithm | aegis-128x2 | Content encryption algorithm (only aegis-128x2 is supported today). |
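
The constraint on --flush-shutdown-timeout-ms is plain unit arithmetic, sketched below. The function name is hypothetical. Note that Kubernetes defaults terminationGracePeriodSeconds to 30 s, so the 45000 ms default only fits if the pod sets a longer grace period or the timeout is lowered; whether a given deployment already adjusts either value is not covered here.

```python
def flush_timeout_fits_grace(flush_shutdown_timeout_ms, termination_grace_period_s):
    """True when the SIGTERM flush drain is guaranteed to end before the grace period."""
    # The flag is in milliseconds, the pod setting in seconds.
    return flush_shutdown_timeout_ms < termination_grace_period_s * 1000
```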

Bounding inode memory

Under workloads that enumerate large trees (a find, a documentation scraper, du -sh), the in-memory inode table can grow without bound: every path the kernel ever looked up stays resident. With --inode-soft-limit N set, two evictors cooperate to keep the table near N:

  1. Insert-time evictor (synchronous): before adding a new entry when len() >= N + 256, drop the oldest-touched file/symlink/leaf-directory entries.
    • Polite mode: only entries the kernel has already released (forget-ed). Safe, no FUSE races.
    • Force mode: when above 2 × N and polite found nothing, drop entries even if the kernel still caches the dentry. A racing kernel op sees ENOENT and re-looks up. Dirty files, locally-created dirs/symlinks, and inodes with live file handles are never dropped — the force path preserves all user data.
  2. Background LRU sweep (every --lru-sweep-interval-ms): for inodes the kernel has cached but our table doesn't want, send FUSE_NOTIFY_INVAL_ENTRY so the kernel drops its dentry and sends us forget. Bounded to 1024 invalidations per sweep with EAGAIN backoff so we don't flood the notify channel.

Tuning: pick N below what a full-tree enumeration of your bucket would produce. For hf-doc-build/doc-dev with ~20k files, --inode-soft-limit 10000 keeps sidecar RSS ~250 MiB with a 1 GiB cgroup cap.
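
The insert-time evictor can be modelled roughly as follows. This is a simplified sketch, not the actual implementation: the Inode fields, the single "pinned" flag (standing in for dirty files, locally created dirs/symlinks, and open handles), and the choice to evict down to N are assumptions, and the file/symlink/leaf-directory restriction is not modelled.

```python
from dataclasses import dataclass

HEADROOM = 256  # eviction starts once len() >= N + 256, as described above


@dataclass
class Inode:
    last_touched: int        # logical clock; smaller means older
    kernel_forgot: bool      # True once the kernel has released the entry
    pinned: bool = False     # assumption: dirty, locally created, or has open handles


def evict_before_insert(table, soft_limit):
    """Drop oldest-touched entries from table (inode number -> Inode); return evicted keys."""
    if len(table) < soft_limit + HEADROOM:
        return []
    oldest_first = sorted(table, key=lambda ino: table[ino].last_touched)
    excess = len(table) - soft_limit
    # Polite mode: only entries the kernel has already released.
    victims = [i for i in oldest_first
               if table[i].kernel_forgot and not table[i].pinned][:excess]
    # Force mode: above 2 x N and polite mode found nothing; pinned entries still survive.
    if not victims and len(table) > 2 * soft_limit:
        victims = [i for i in oldest_first if not table[i].pinned][:excess]
    for ino in victims:
        del table[ino]
    return victims
```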

Overlay mode

--overlay makes the mount point itself a writable local layer on top of the remote source. Reads return whatever the remote has, plus anything you've put on local disk under the mount point. Writes go only to local disk — the remote is never touched. Local files survive an unmount/remount.

Useful when several machines or processes need to share a read-only remote view but each layer their own files on top — for example, a shared compilation cache where producer machines populate a bucket with compiled artifacts (torch.compile, vLLM, JAX/XLA, AWS Neuron) and every consumer mounts the same bucket with --overlay. Cache hits are served from the bucket without recompiling; cache misses compile locally and stay on the local disk, never pushed back to the bucket.

# Producer (writes compiled artifacts to the bucket — regular bucket mount)
hf-mount start bucket myorg/torch-compile-cache "$TORCHINDUCTOR_CACHE_DIR"

# Consumer (reads from the bucket, compiles locally on miss)
hf-mount start --overlay bucket myorg/torch-compile-cache "$TORCHINDUCTOR_CACHE_DIR"

What you can do:

  • Read every file from the remote source.
  • Read every file already present in the local layer; when a name exists in both, the local copy wins.
  • Create new files and directories — they land in the local layer.
  • Modify, rename, delete, or chmod any file or directory that lives in the local layer.

What you can't do:

  • Modify, rename, delete, or chmod a file that exists only on the remote. These operations fail with a permission error. To diverge from a remote file, copy it under a new name through the mount; the copy is a regular local file you own.
  • Shadow an existing remote name with a new local file once the mount is active. If you need a local file at a name that already exists on the remote, drop it in the mount-point directory before starting the mount — pre-existing files at the mount point stay visible and take precedence.
  • Place symlinks in the local layer and expect them to show up. Symlinks are hidden from the merged view so the mount can't be tricked into reading or writing outside the mount point.
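
The rules above can be summarized with a small model of the merged view. It is only a sketch: the function names are hypothetical, entries are plain names rather than real inodes, and Python's PermissionError stands in for whatever error code the mount actually returns.

```python
def merged_listing(remote_names, local_entries):
    """Merge a remote listing with the local layer.

    remote_names: set of names present on the remote.
    local_entries: dict of name -> kind ("file", "dir", or "symlink").
    Returns a dict of name -> "local" or "remote".
    """
    view = {name: "remote" for name in remote_names}
    for name, kind in local_entries.items():
        if kind == "symlink":
            continue              # local symlinks are hidden from the merged view
        view[name] = "local"      # on a name collision the local copy wins
    return view


def check_mutation(name, view):
    """Reject modify/rename/delete/chmod on names that exist only on the remote."""
    if view.get(name) == "remote":
        raise PermissionError(f"{name} exists only on the remote")
```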

Logging

RUST_LOG=hf_mount=debug hf-mount-fuse repo gpt2 /mnt/gpt2

Features

  • FUSE & NFS backends -- FUSE for standard Linux/macOS, NFS for environments without /dev/fuse
  • Lazy loading -- files are fetched on demand, not eagerly downloaded
  • Subfolder mounting -- mount only a subdirectory (e.g. user/model/ckpt/v2)
  • Streaming writes (default) -- append-only, in-memory, synchronous upload on close
  • Advanced writes (--advanced-writes) -- staging files on disk, random writes + seek, async debounced flush
  • Remote sync -- background polling detects remote changes and updates the local view
  • POSIX metadata -- chmod, chown, timestamps, symlinks (in-memory only, lost on unmount)
  • Overlay mode (--overlay) -- mount point doubles as a writable local layer; remote stays read-only
  • Client-side encryption (--features encrypt) -- file contents and names encrypted locally; the Hub only ever sees ciphertext

Client-side encryption

Files can be encrypted on the client before upload and decrypted transparently on read, so the Hub only ever stores ciphertext. This covers both file contents and the names they are stored under.

Generate a 32-byte master key and point the mount at it:

# 32 random bytes (or 64 hex characters)
head -c 32 /dev/urandom > /path/to/key.bin

hf-mount start --encryption-key-file /path/to/key.bin bucket myuser/encrypted-bucket /tmp/data
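
If you prefer to create or sanity-check the key file from Python, something like the following would do. Both helper names are hypothetical; the 0600 permissions and the tolerance for a trailing newline in the hex form are assumptions, not documented hf-mount behaviour.

```python
import os
import secrets


def write_key_file(path):
    """Create a new 32-byte raw key file, readable and writable by the owner only."""
    # O_EXCL refuses to overwrite an existing key by accident.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(secrets.token_bytes(32))


def load_key(path):
    """Parse a key file holding either 32 raw bytes or 64 hex characters."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) == 32:
        return data
    text = data.strip()  # assumption: surrounding whitespace in the hex form is ignored
    if len(text) == 64:
        return bytes.fromhex(text.decode("ascii"))
    raise ValueError("expected 32 raw bytes or 64 hex characters")
```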

All encrypted objects are stored under a single .enc directory at the bucket root, so a raw (keyless) view of the bucket shows only that one entry:

bucket root
└── .enc/                          literal name, never encrypted
    ├── <E("dir")>/                HCTR2 + base91 components
    │   └── <E("dir/file")>
    └── <E("top-level-file")>

Unencrypted files remain at the bucket root as before. The .enc directory itself is not a ciphertext component and is never fed to the path cipher, so encrypted trees can still be moved between buckets without re-encryption.

What is and isn't hidden

Encrypted:

  • File contents -- AEGIS-128X2, authenticated, so tampering or truncation is caught on read.
  • File and directory names -- every path component, each bound to the plaintext path above it, so the server can't move a name into a different directory undetected.

Visible to the server:

  • The directory tree shape -- how deep it is and how many entries each directory holds.
  • Timestamps.
  • Approximate sizes. Content size is rounded up to whole 64 KiB chunks, so any two files of up to 64 KiB are indistinguishable by size; a name's length leaks only its 32-byte padding bucket.
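
The size leakage can be estimated with the rounding rules stated above. This sketch models only the rounding; it ignores any per-file header, authentication tag, or padding-marker overhead, and the behaviour exactly at bucket boundaries is an assumption.

```python
CHUNK = 64 * 1024   # content is rounded up to whole 64 KiB chunks
NAME_BUCKET = 32    # name length is visible only at 32-byte granularity


def visible_chunks(plaintext_size):
    """Number of 64 KiB chunks an observer could infer for a file of this size."""
    return -(-plaintext_size // CHUNK)  # ceiling division


def visible_name_bucket(name):
    """Padded length bucket (in bytes) an observer could infer for this name."""
    return -(-len(name.encode("utf-8")) // NAME_BUCKET) * NAME_BUCKET
```

Under this model a 1-byte file and a 64 KiB file both show up as one chunk, while a file one byte larger than 64 KiB shows up as two.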

Encrypted and plaintext objects can coexist in one bucket: a name that doesn't decrypt is skipped, so you only ever see the files that belong to your key.

Plaintext never touches local disk -- the on-disk staging file is a ciphertext container from the moment it is created.

Mixed-content buckets

A single bucket can contain unencrypted files, encrypted files, and files encrypted with different keys — all at once. The layout keeps them partitioned: plaintext files sit at the bucket root as before, while all ciphertext is gathered under .enc/. They never intermingle in one listing.

When mounted with an encryption key, you see both unencrypted files (at the root) and encrypted files that decrypt with your key (inside .enc/). Names inside .enc that don't decrypt are skipped, so multiple keys can share a bucket and each key sees only its own files.

Limitations

  • File names and encrypted file data are not bound to a specific user or bucket. This is intentional and allows buckets to be renamed and synchronized.
  • Similarly, unlike file names, encrypted file data is not bound to paths. This allows files to be moved atomically and copied while taking advantage of XET deduplication.
  • File names are limited to about 194 bytes.

Consistency model

hf-mount provides eventual consistency with remote changes. There is no push notification from the Hub; all freshness relies on client-side polling.

Reads

On FUSE, files can be stale for up to --metadata-ttl-ms (default 10 s) after a remote update; on NFS, which has no per-file revalidation, for up to the poll interval (default 30 s). Two mechanisms detect changes:

  1. Metadata revalidation (FUSE only) -- when the per-file TTL expires, the next access checks the Hub. If the file changed, cached data is invalidated.
  2. Background polling (default every 30 s) -- lists the full tree and detects additions, modifications, and deletions.
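
Because freshness depends on polling, a consumer that waits for a file written elsewhere has to poll as well. A minimal sketch, with a hypothetical helper name and a 60 s default timeout chosen only so that it exceeds the default 30 s poll interval plus the 10 s TTL:

```python
import os
import time


def wait_for_file(path, timeout_s=60.0, interval_s=1.0):
    """Poll until path exists or timeout_s elapses; return True if it appeared."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```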

Writes

| | Streaming (default) | Advanced (--advanced-writes) |
| --- | --- | --- |
| Write pattern | Append-only (sequential) | Random writes, seek, overwrite |
| Storage | In-memory buffer | Local staging file on disk |
| Modify existing files | Overwrite only (O_TRUNC) | Yes (downloads file first) |
| Durability | On close | Async, debounced (2 s / 30 s max) |
| Disk space needed | None | Full file size per open file |

Streaming mode buffers writes in memory and uploads on close(). A crash before close means data loss.

Note: Streaming mode does not support text editors (vim, nano, emacs). Editors that use unlink+create save patterns will be blocked (EPERM) to prevent data loss. Use --advanced-writes for interactive editing.

Advanced mode downloads the full file to local disk before allowing edits. After close(), dirty files are flushed asynchronously. A crash before flush completes means data loss.
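
A write pattern that fits streaming mode looks like the sketch below: open with truncation, write front to back, and close. The helper name is hypothetical.

```python
def write_sequentially(path, chunks):
    """Write chunks front to back with no seeks; return the number of bytes written."""
    total = 0
    # "wb" opens with O_TRUNC, the only way streaming mode modifies an existing file.
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)  # append-only: never seek backwards
            total += len(chunk)
    # Leaving the with-block calls close(), which is when streaming mode uploads.
    return total
```

Anything that needs seek() or in-place edits (for example, formats that rewrite a header after the body) belongs on a mount started with --advanced-writes.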

FUSE vs NFS

| | FUSE | NFS |
| --- | --- | --- |
| Metadata revalidation | Per-file, within TTL | No (NFS uses file handles) |
| Page cache invalidation | Supported | Not supported by NFS protocol |
| Staleness window | ~10 s | Up to poll interval (30 s) |
| Write mode | Streaming by default | Advanced always |

How it works

hf-mount sits between your application and the Hugging Face Hub. It presents a standard filesystem interface (FUSE or NFS) and translates file operations into Hub API calls and storage fetches.

Reads go through an adaptive prefetch buffer that starts small and grows with sequential access. Writes are uploaded to HF storage and committed via the Hub API. A background poll loop keeps the local view in sync with remote changes.
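
To make "starts small and grows with sequential access" concrete, here is one possible shape for such a buffer. It is purely illustrative: the class name, the doubling policy, the reset on seek, and the window sizes are all assumptions, not hf-mount's actual parameters.

```python
class PrefetchWindow:
    """Toy adaptive prefetch policy: grow on sequential reads, reset on seeks."""

    def __init__(self, initial=64 * 1024, maximum=8 * 1024 * 1024):
        self.initial = initial
        self.maximum = maximum
        self.size = initial
        self.next_offset = None  # where the next sequential read would start

    def on_read(self, offset, length):
        """Return how many bytes to fetch for a read of length bytes at offset."""
        if offset == self.next_offset:
            self.size = min(self.size * 2, self.maximum)  # sequential: grow
        else:
            self.size = self.initial                      # first read or seek: start small
        self.next_offset = offset + length
        return max(length, self.size)
```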

Built on xet-core for content-addressed storage and efficient file transfers, and fuser for the FUSE implementation.

Kubernetes

Use the hf-csi-driver to mount Buckets and repos as Kubernetes volumes. The CSI driver runs hf-mount inside a DaemonSet and exposes mounts to pods via the Container Storage Interface.

helm install hf-csi oci://ghcr.io/huggingface/charts/hf-csi-driver

See the hf-csi-driver README for setup and examples.

Testing

# Unit tests (no network, no token)
cargo test --lib --features fuse,nfs

# Integration tests (require HF_TOKEN and FUSE)
HF_TOKEN=... cargo test --release --features fuse,nfs --test fuse_ops -- --test-threads=1 --nocapture
HF_TOKEN=... cargo test --release --features fuse,nfs --test nfs_ops -- --test-threads=1 --nocapture

# Encrypted round-trip over NFS against a live bucket (requires HF_TOKEN)
HF_TOKEN=... cargo test --release --features nfs,encrypt --test encryption_ops -- --test-threads=1 --nocapture

# Repo mount test (public repo, no token needed)
cargo test --release --features nfs --test repo_ops -- --test-threads=1 --nocapture

# Benchmarks
HF_TOKEN=... cargo test --release --features fuse,nfs --test bench -- --nocapture

Troubleshooting

I'm getting "Operation not permitted" on macOS when opening or listing files in a mounted bucket from VS Code

Grant VS Code "Full Disk Access" (System Settings > Privacy & Security > Full Disk Access).

License

Apache-2.0
