Mount Hugging Face Buckets and repos as local filesystems. No download, no copy, no waiting.
This fork includes transparent client-side encryption.
Bucket creation:
hf buckets create my-bucket
Generate a 32-byte secret key in a file named key.bin and point the mount at it:
# 32 random bytes (or 64 hex characters)
head -c 32 /dev/urandom > key.bin
hf-mount start --encryption-key-file key.bin bucket myuser/my-bucket /tmp/data
See "Client-side encryption" below for more information.
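As a rough illustration of the two key formats mentioned here (32 raw bytes, or 64 hex characters), the Python sketch below writes a key file and parses it back. The helper names write_key_file and load_master_key are hypothetical, and tolerating a trailing newline in the hex form is an assumption; hf-mount's own parser may behave differently.

```python
import os
import secrets

KEY_LEN = 32  # bytes, as stated in this README


def write_key_file(path: str, hex_encoded: bool = False) -> bytes:
    """Generate a random 32-byte key and write it raw or as 64 hex characters."""
    key = secrets.token_bytes(KEY_LEN)
    data = key.hex().encode("ascii") if hex_encoded else key
    # Create the file with owner-only permissions so other users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return key


def load_master_key(path: str) -> bytes:
    """Accept either 32 raw bytes or 64 hex characters (assumed parsing rules)."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) == KEY_LEN:
        return data
    text = data.strip()  # assumption: a trailing newline is tolerated
    if len(text) == 2 * KEY_LEN:
        return bytes.fromhex(text.decode("ascii"))
    raise ValueError(f"expected 32 raw bytes or 64 hex chars, got {len(data)} bytes")
```

Keeping the key file readable only by its owner is a general precaution rather than something hf-mount is documented to enforce.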
Commands will pick up your HF_TOKEN from the environment, or you can pass it explicitly with --hf-token.
Then use your local folders as usual:
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("/tmp/gpt-oss") # reads on demand, no download step
hf-mount exposes Hugging Face Buckets and Hub repos as a local filesystem via FUSE or NFS. Files are fetched lazily on first read, so only the bytes your code actually touches ever hit the network.
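Because reads are lazy, code that touches only a small part of a large file should only need that part. As a sketch, the helper below (a hypothetical name, not part of hf-mount) reads just the JSON header of a safetensors file, which starts with an 8-byte little-endian length followed by that many bytes of JSON.

```python
import json
import struct


def read_safetensors_header(path: str) -> dict:
    """Read only the JSON header of a safetensors file, not the tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        # Next header_len bytes: JSON mapping tensor names to dtype/shape/offsets.
        return json.loads(f.read(header_len))
```

Called on a file under a mount point, this goes through the mount like any other read; how many bytes are actually fetched from the network depends on hf-mount's prefetch behavior.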
Two backends are available:
- NFS (recommended) -- works everywhere, no root, no kernel extension
- FUSE -- tighter kernel integration, requires root or macFUSE on macOS
Agentic storage: agents don't require complex APIs or SDKs; they thrive on the filesystem: ls, cat, find, grep, and the power of composable UNIX pipelines.
brew install hf-mount
On macOS, this installs the NFS backend only (hf-mount, hf-mount-nfs). For the FUSE backend on macOS, download the binary manually or build from source: macFUSE is closed-source and not distributable through homebrew-core.
Binaries are available on GitHub Releases:
| Platform | Daemon | NFS | FUSE |
|---|---|---|---|
| Linux x86_64 | hf-mount-x86_64-linux | hf-mount-nfs-x86_64-linux | hf-mount-fuse-x86_64-linux |
| Linux aarch64 | hf-mount-aarch64-linux | hf-mount-nfs-aarch64-linux | hf-mount-fuse-aarch64-linux |
| macOS Apple Silicon | hf-mount-arm64-apple-darwin | hf-mount-nfs-arm64-apple-darwin | hf-mount-fuse-arm64-apple-darwin |
The NFS backend has no system dependencies. For FUSE:
Linux: sudo apt-get install -y fuse3 (pre-built binaries only need the runtime; building from source also requires libfuse3-dev)
macOS: install macFUSE (brew install macfuse, requires reboot on first install)
Requires Rust 1.89+.
# NFS only (no system deps, works everywhere)
cargo build --release --features nfs
# FUSE (requires macFUSE on macOS, fuse3 on Linux)
cargo build --release --features fuse
# All backends
cargo build --release --features fuse,nfs
Binaries: target/release/hf-mount, target/release/hf-mount-nfs, target/release/hf-mount-fuse
Best for:
- Loading models and datasets without downloading the full repo
- Browsing repo contents (ls, cat, find) without cloning
- Read-heavy ML workloads (training, inference, evaluation)
- Environments where disk space is limited
Not for:
- General-purpose networked filesystem (no multi-writer support, no cross-node file locking)
- Latency-sensitive random I/O (first reads require network round-trips)
- Workloads that need strong consistency (files can be stale for up to 10 s)
- Heavy concurrent writes from multiple mounts (last writer wins, no conflict detection)
- Editing files with text editors in default (streaming) mode (use --advanced-writes)
Advisory file locks (flock, fcntl POSIX record locks) are supported locally on a single mount on both backends — enough for Python filelock, huggingface_hub, datasets, and similar cache-coordination use cases within one machine. They are not coordinated across multiple clients.
See Consistency model for details.
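As an illustration of the single-machine lock coordination described above, the sketch below takes an exclusive advisory flock around a critical section using Python's standard fcntl module. The helper name exclusive_lock is hypothetical, and whether a lock file can be created at a given path depends on the mount's write mode.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path: str, blocking: bool = True):
    """Hold an exclusive advisory flock on `path` for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        flags = fcntl.LOCK_EX | (0 if blocking else fcntl.LOCK_NB)
        fcntl.flock(fd, flags)  # raises BlockingIOError if non-blocking and held
        yield fd
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

Two processes on the same machine that wrap their cache updates in exclusive_lock on the same path would serialize; processes on different machines mounting the same bucket would not, since locks are not coordinated across clients.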
# Public model (no token needed)
hf-mount start repo openai/gpt-oss-20b /tmp/model
# Private model
hf-mount start --hf-token $HF_TOKEN repo myorg/my-private-model /tmp/model
# Dataset
hf-mount start repo datasets/open-index/hacker-news /tmp/hn
# Specific revision
hf-mount start repo openai-community/gpt2 /tmp/gpt2 --revision v1.0
# Subfolder only
hf-mount start repo openai-community/gpt2/onnx /tmp/onnx
Buckets are S3-like object storage on the Hub, designed for large-scale mutable data (training checkpoints, logs, artifacts) without git version control.
hf-mount start --hf-token $HF_TOKEN bucket myuser/my-bucket /tmp/data
# Read-only
hf-mount start --hf-token $HF_TOKEN --read-only bucket myuser/my-bucket /tmp/data
# Subfolder only
hf-mount start --hf-token $HF_TOKEN bucket myuser/my-bucket/checkpoints /tmp/ckpts
hf-mount status # list running mounts
hf-mount stop /tmp/data # stop and unmount
Logs are written to ~/.hf-mount/logs/. PID files are stored in ~/.hf-mount/pids/.
By default, hf-mount uses NFS. Pass --fuse for tighter kernel integration (page cache invalidation, per-file metadata revalidation). Requires fuse3 on Linux or macFUSE on macOS.
hf-mount start --fuse --hf-token $HF_TOKEN bucket myuser/my-bucket /mnt/data
For scripts, containers, or debugging, use the backend binaries directly (they run in the foreground):
hf-mount-nfs repo gpt2 /tmp/gpt2
hf-mount-fuse --hf-token $HF_TOKEN bucket myuser/my-bucket /mnt/data
To have hf-mount start automatically on login, create a LaunchAgent:
label=co.huggingface.hf-mount
mkdir -p ~/Library/LaunchAgents
cat > ~/Library/LaunchAgents/$label.plist <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>$label</string>
<key>ProgramArguments</key>
<array>
<string>$HOME/.local/bin/hf-mount-nfs</string>
<string>repo</string>
<string>openai/gpt-oss-20b</string>
<string>/tmp/gpt-oss</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>StandardOutPath</key>
<string>/tmp/hf-mount.log</string>
<key>StandardErrorPath</key>
<string>/tmp/hf-mount.log</string>
</dict>
</plist>
EOF
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/$label.plist
To stop: launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/$label.plist
umount /tmp/data # NFS or FUSE (macOS)
fusermount -u /tmp/data # FUSE (Linux)
hf-mount stop /tmp/data # daemon mounts
On SIGTERM the sidecar bounds the dirty-data flush (--flush-shutdown-timeout-ms) and disarms the kernel-cache invalidators before draining. This is deliberate: a FUSE_NOTIFY_INVAL_INODE writev issued during teardown can block uninterruptibly in the kernel (waiting on a folio under writeback the exiting daemon can no longer complete), leaving a D-state thread that even exit_group can't reap, which results in an unkillable pod. Disarming the invalidator avoids creating that wedge; the CSI driver aborting the FUSE connection on NodeUnpublishVolume is the kernel-level backstop for any already-in-flight notify.
The same FUSE_NOTIFY_INVAL_INODE writev can also wedge at runtime (not just on shutdown): when the poll loop detects a remote change to a file the app currently has open, a full page-cache invalidation blocks in-kernel on a folio lock held by the app's in-flight read(), which is itself waiting for the daemon — a deadlock. Two guards prevent this: invalidations targeting an inode with open handles drop attributes only (a negative-offset notify the kernel never lets touch pages), and the blocking writev runs on the runtime's blocking pool rather than a core worker, so it can never starve FUSE request servicing. The trade-off: a file with a long-lived open handle won't see remote content updates refreshed in its page cache until the handle closes. On close-and-reopen the kernel revalidates the (attr-only-invalidated) attributes and, via the negotiated FUSE_AUTO_INVAL_DATA, drops the stale pages itself when it sees the new mtime/size — so a stale read only persists if a remote content change preserves both mtime and size on a file that was open at the moment of the change.
| Flag | Default | Description |
|---|---|---|
| --hf-token | $HF_TOKEN | HF API token (required for private repos/buckets) |
| --hub-endpoint | https://huggingface.co | Hub API endpoint |
| --cache-dir | /tmp/hf-mount-cache | Local cache directory |
| --cache-size | 10000000000 (~10 GB) | Max on-disk chunk cache size in bytes |
| --cache-mode | chunk | Disk cache layer: chunk (xet-core xorb-range cache) or file (whole-file cache keyed by xet hash, avoids chunk-range fragmentation on warm reloads). Mutually exclusive; file disables the chunk cache. |
| --max-staging-size | 0 (unlimited) | Max bytes for advanced-writes staging files before flushed files are garbage-collected (LRU by last-touched). 0 disables GC, so staging files persist as a read-after-write cache. Does not yet cover the HTTP download cache for non-Xet repo files. |
| --read-only | false | Mount read-only (always on for repos) |
| --advanced-writes | false | Enable staging files + async flush (random writes, seek, overwrite) |
| --poll-interval-secs | 30 | Remote change polling interval (0 to disable) |
| --poll-listing-concurrency | 4 | Max concurrent tree-listing requests per poll round. Main knob to throttle load on the Hub /api endpoint; lower it in shared environments where many mounts poll in parallel. |
| --max-threads | 16 | Maximum FUSE worker threads (Linux only) |
| --metadata-ttl-ms | 10000 | How long file metadata is cached before re-checking (ms) |
| --metadata-ttl-minimal | false | Re-check on every access (maximum freshness, lower throughput) |
| --flush-debounce-ms | 2000 | Advanced writes: flush debounce delay (ms) |
| --flush-max-batch-window-ms | 30000 | Advanced writes: max flush batch window (ms) |
| --flush-shutdown-timeout-ms | 45000 | Advanced writes: max time the SIGTERM flush drain may run before abandoning unflushed data to guarantee exit. Must be < the pod's terminationGracePeriodSeconds, or a slow Hub/CAS backend keeps the FUSE connection alive past grace and strands the pod. |
| --no-disk-cache | false | Disable local chunk cache (every read fetches from HF) |
| --direct-io | false | Bypass the kernel page cache (FOPEN_DIRECT_IO); every read goes through the FUSE handler. For benchmarking; not recommended in production (disables efficient mmap caching). |
| --no-filter-os-files | false | Stop filtering OS junk files (.DS_Store, Thumbs.db, etc.) |
| --uid / --gid | current user | Override UID/GID for mounted files |
| --fuse-owner-only | false | Restrict mount access to the mounting user only (FUSE only; by default all users can access, which requires user_allow_other in /etc/fuse.conf) |
| --token-file | | Path to a token file (re-read on each request for credential rotation) |
| --inode-soft-limit | 0 | Soft cap on the in-memory inode table (0 disables). See "Bounding inode memory" below. |
| --lru-sweep-interval-ms | 5000 | Background LRU sweep interval in milliseconds. Only meaningful when --inode-soft-limit > 0. |
| --overlay | false | Treat the mount point as a writable local layer over the remote source. Local files persist on disk; writes are never pushed to the remote. See "Overlay mode" below. |
| --encryption-key-file | | Path to a 32-byte master key (raw or 64 hex chars). Encrypts contents and names client-side. Requires --features encrypt; implies --advanced-writes. See "Client-side encryption". |
| --encryption-algorithm | aegis-128x2 | Content encryption algorithm (only aegis-128x2 is supported today). |
Under workloads that enumerate large trees (a find, a documentation scraper, du -sh), the in-memory inode table can grow without bound: every path the kernel ever looked up stays resident. With --inode-soft-limit N set, two evictors cooperate to keep the table near N:
- Insert-time evictor (synchronous): before adding a new entry when len() >= N + 256, drop the oldest-touched file/symlink/leaf-directory entries.
  - Polite mode: only entries the kernel has already released (forget-ed). Safe, no FUSE races.
  - Force mode: when above 2 × N and polite found nothing, drop entries even if the kernel still caches the dentry. A racing kernel op sees ENOENT and re-looks up. Dirty files, locally-created dirs/symlinks, and inodes with live file handles are never dropped, so the force path preserves all user data.
- Background LRU sweep (every --lru-sweep-interval-ms): for inodes the kernel has cached but our table doesn't want, send FUSE_NOTIFY_INVAL_ENTRY so the kernel drops its dentry and sends us forget. Bounded to 1024 invalidations per sweep with EAGAIN backoff so we don't flood the notify channel.
Tuning: pick N below what a full-tree enumeration of your bucket would produce. For hf-doc-build/doc-dev with ~20k files, --inode-soft-limit 10000 keeps sidecar RSS ~250 MiB with a 1 GiB cgroup cap.
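The insert-time evictor described above can be modeled in a few lines of Python. This is a simplified sketch, not hf-mount's Rust implementation: the class and field names are hypothetical, entry kinds (file, symlink, leaf directory) are not modeled, and evicting back down to N entries per pass is an assumption.

```python
from collections import OrderedDict
from dataclasses import dataclass

SLACK = 256  # eviction starts at N + 256 entries, per the description above


@dataclass
class Inode:
    kernel_released: bool = True  # the kernel has sent forget for this entry
    pinned: bool = False          # dirty, locally created, or has open handles


class InodeTable:
    def __init__(self, soft_limit: int):
        self.n = soft_limit
        self.entries = OrderedDict()  # ino -> Inode, oldest-touched first

    def touch(self, ino: int) -> None:
        self.entries.move_to_end(ino)

    def _evict(self, force: bool) -> int:
        excess = len(self.entries) - self.n  # assumption: shrink back to N
        victims = []
        for ino, entry in self.entries.items():
            if len(victims) >= excess:
                break
            # Pinned entries are never dropped; polite mode also skips
            # entries the kernel still references.
            if not entry.pinned and (force or entry.kernel_released):
                victims.append(ino)
        for ino in victims:
            del self.entries[ino]
        return len(victims)

    def insert(self, ino: int, inode: Inode) -> None:
        if self.n > 0 and len(self.entries) >= self.n + SLACK:
            dropped = self._evict(force=False)           # polite mode
            if dropped == 0 and len(self.entries) > 2 * self.n:
                self._evict(force=True)                  # force mode
        self.entries[ino] = inode
```

With soft_limit=10, the table grows to 266 entries before the next insert triggers a polite pass; if nothing has been released by the kernel, the force pass runs instead and still skips pinned entries.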
--overlay makes the mount point itself a writable local layer on top of the remote source. Reads return whatever the remote has, plus anything you've put on local disk under the mount point. Writes go only to local disk — the remote is never touched. Local files survive an unmount/remount.
Useful when several machines or processes need to share a read-only remote view but each layer their own files on top — for example, a shared compilation cache where producer machines populate a bucket with compiled artifacts (torch.compile, vLLM, JAX/XLA, AWS Neuron) and every consumer mounts the same bucket with --overlay. Cache hits are served from the bucket without recompiling; cache misses compile locally and stay on the local disk, never pushed back to the bucket.
# Producer (writes compiled artifacts to the bucket — regular bucket mount)
hf-mount start bucket myorg/torch-compile-cache "$TORCHINDUCTOR_CACHE_DIR"
# Consumer (reads from the bucket, compiles locally on miss)
hf-mount start --overlay bucket myorg/torch-compile-cache "$TORCHINDUCTOR_CACHE_DIR"
What you can do:
- Read every file from the remote source.
- Read every file already present in the local layer; when a name exists in both, the local copy wins.
- Create new files and directories — they land in the local layer.
- Modify, rename, delete, or chmod any file or directory that lives in the local layer.
What you can't do:
- Modify, rename, delete, or chmod a file that exists only on the remote. These operations fail with a permission error. To diverge from a remote file, copy it under a new name through the mount; the copy is a regular local file you own.
- Shadow an existing remote name with a new local file once the mount is active. If you need a local file at a name that already exists on the remote, drop it in the mount-point directory before starting the mount — pre-existing files at the mount point stay visible and take precedence.
- Place symlinks in the local layer and expect them to show up. Symlinks are hidden from the merged view so the mount can't be tricked into reading or writing outside the mount point.
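The rules above can be summarized with a toy model in which two dictionaries stand in for the local layer and the remote source. This is only a sketch of the documented behavior: the OverlayView class is hypothetical, symlink hiding is not modeled, and raising PermissionError when a new file would shadow a remote name is an assumption (the text above only says it cannot be done).

```python
class OverlayView:
    """Toy model of the merged view exposed by an overlay mount."""

    def __init__(self, local: dict, remote: dict):
        self.local = local    # name -> bytes, files on local disk at the mount point
        self.remote = remote  # name -> bytes, read-only remote source

    def listdir(self) -> list:
        return sorted(set(self.local) | set(self.remote))

    def read(self, name: str) -> bytes:
        if name in self.local:    # the local copy wins when both exist
            return self.local[name]
        return self.remote[name]  # KeyError stands in for ENOENT

    def create(self, name: str, data: bytes) -> None:
        if name in self.remote and name not in self.local:
            # A remote name cannot be shadowed once the mount is active.
            raise PermissionError(name)
        self.local[name] = data   # writes only ever reach the local layer

    def delete(self, name: str) -> None:
        if name not in self.local:
            if name in self.remote:
                raise PermissionError(name)  # remote-only files are immutable
            raise FileNotFoundError(name)
        del self.local[name]
```

In this model a file placed in the local dictionary before construction plays the role of a file dropped at the mount point before the mount starts: it is visible and takes precedence over the remote entry of the same name.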
RUST_LOG=hf_mount=debug hf-mount-fuse repo gpt2 /mnt/gpt2
- FUSE & NFS backends -- FUSE for standard Linux/macOS, NFS for environments without /dev/fuse
- Lazy loading -- files are fetched on demand, not eagerly downloaded
- Subfolder mounting -- mount only a subdirectory (e.g. user/model/ckpt/v2)
- Simple writes (default) -- append-only, in-memory, synchronous upload on close
- Advanced writes (--advanced-writes) -- staging files on disk, random writes + seek, async debounced flush
- Remote sync -- background polling detects remote changes and updates the local view
- POSIX metadata -- chmod, chown, timestamps, symlinks (in-memory only, lost on unmount)
- Overlay mode (--overlay) -- mount point doubles as a writable local layer; remote stays read-only
- Client-side encryption (--features encrypt) -- file contents and names encrypted locally; the Hub only ever sees ciphertext
Files can be encrypted on the client before upload and decrypted transparently on read, so the Hub only ever stores ciphertext. This covers both file contents and the names they are stored under.
Generate a 32-byte master key and point the mount at it:
# 32 random bytes (or 64 hex characters)
head -c 32 /dev/urandom > /path/to/key.bin
hf-mount start --encryption-key-file /path/to/key.bin bucket myuser/encrypted-bucket /tmp/data
All encrypted objects are stored under a single .enc directory at the bucket root, so a raw (keyless) view of the bucket shows only that one entry:
bucket root
└── .enc/ literal name, never encrypted
├── <E("dir")>/ HCTR2 + base91 components
│ └── <E("dir/file")>
└── <E("top-level-file")>
Unencrypted files remain at the bucket root as before. The .enc directory itself is not a ciphertext component and is never fed to the path cipher, so encrypted trees can still be moved between buckets without re-encryption.
What is and isn't hidden
Encrypted:
- File contents -- AEGIS-128X2, authenticated, so tampering or truncation is caught on read.
- File and directory names -- every path component, each bound to the plaintext path above it, so the server can't move a name into a different directory undetected.
Visible to the server:
- The directory tree shape -- how deep it is and how many entries each directory holds.
- Timestamps.
- Approximate sizes. Content size is rounded up to whole 64 KiB chunks, so any file up to 64 KiB is size-indistinguishable from another; name length leaks only its 32-byte padding bucket.
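To make the size leakage concrete, the sketch below computes what an observer could infer from the two padding rules just stated. The function names are hypothetical, per-chunk overhead such as authentication tags is ignored, and treating an empty file or empty name as occupying one full unit is an assumption.

```python
CHUNK = 64 * 1024  # content size granularity stated above
NAME_BUCKET = 32   # name length padding bucket stated above


def visible_content_chunks(plaintext_len: int) -> int:
    """Number of 64 KiB chunks an observer can infer from the stored size."""
    # Integer ceiling division; assumption: an empty file still takes one chunk.
    return max(1, (plaintext_len + CHUNK - 1) // CHUNK)


def visible_name_bucket(name: str) -> int:
    """Upper bound on the name length (in bytes) revealed by the padded name."""
    n = len(name.encode("utf-8"))
    return max(1, (n + NAME_BUCKET - 1) // NAME_BUCKET) * NAME_BUCKET
```

Under these rules a 1-byte file and a 65,536-byte file both map to one chunk, while a 65,537-byte file maps to two; a 5-byte name and a 30-byte name both fall in the 32-byte bucket.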
Encrypted and plaintext objects can coexist in one bucket: a name that doesn't decrypt is skipped, so you only ever see the files that belong to your key.
Plaintext never touches local disk -- the on-disk staging file is a ciphertext container from the moment it is created.
A single bucket can contain unencrypted files, encrypted files, and files encrypted with different keys — all at once. The layout keeps them partitioned: plaintext files sit at the bucket root as before, while all ciphertext is gathered under .enc/. They never intermingle in one listing.
When mounted with an encryption key, you see both unencrypted files (at the root) and encrypted files that decrypt with your key (inside .enc/). Names inside .enc that don't decrypt are skipped, so multiple keys can share a bucket and each key sees only its own files.
- File names and encrypted file data are not bound to a specific user or bucket. This is intentional and allows buckets to be renamed and synchronized.
- Similarly, unlike file names, encrypted file data is not bound to paths. This allows files to be moved atomically and copied while taking advantage of XET deduplication.
- File names are limited to about 194 bytes.
hf-mount provides eventual consistency with remote changes. There is no push notification from the Hub; all freshness relies on client-side polling.
Files can be stale for up to --metadata-ttl-ms (default 10 s) after a remote update. Two mechanisms detect changes:
- Metadata revalidation (FUSE only) -- when the per-file TTL expires, the next access checks the Hub. If the file changed, cached data is invalidated.
- Background polling (default every 30 s) -- lists the full tree and detects additions, modifications, and deletions.
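The TTL-based revalidation can be pictured as a small cache that serves metadata until it expires and then re-checks the source. The sketch below is a generic illustration with hypothetical names (MetadataCache, fetch), not hf-mount's code; the fetch callable stands in for a Hub metadata request and the injectable clock exists only to make the behavior easy to test.

```python
import time
from typing import Callable, Dict, Tuple


class MetadataCache:
    """Cache per-file metadata and re-check the source once the TTL has expired."""

    def __init__(self, fetch: Callable[[str], dict], ttl_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic):
        self.fetch = fetch    # stands in for a Hub metadata request
        self.ttl_s = ttl_s    # 10 s mirrors the --metadata-ttl-ms default
        self.clock = clock
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, path: str) -> dict:
        now = self.clock()
        hit = self._entries.get(path)
        if hit is not None and now - hit[0] < self.ttl_s:
            return hit[1]          # still fresh: may be up to ttl_s stale
        meta = self.fetch(path)    # expired or missing: revalidate
        self._entries[path] = (now, meta)
        return meta
```

A remote change made just after a lookup stays invisible until the entry expires, which is where the "up to 10 s" staleness bound comes from.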
| | Streaming (default) | Advanced (--advanced-writes) |
|---|---|---|
| Write pattern | Append-only (sequential) | Random writes, seek, overwrite |
| Storage | In-memory buffer | Local staging file on disk |
| Modify existing files | Overwrite only (O_TRUNC) | Yes (downloads file first) |
| Durability | On close | Async, debounced (2 s / 30 s max) |
| Disk space needed | None | Full file size per open file |
Streaming mode buffers writes in memory and uploads on close(). A crash before close means data loss.
Note: Streaming mode does not support text editors (vim, nano, emacs). Editors that use unlink+create save patterns will be blocked (EPERM) to prevent data loss. Use --advanced-writes for interactive editing.
Advanced mode downloads the full file to local disk before allowing edits. After close(), dirty files are flushed asynchronously. A crash before flush completes means data loss.
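One plausible reading of "debounced (2 s / 30 s max)" is that each dirty event pushes the flush back by the debounce delay, but a batch never waits longer than the max window after its first event. The sketch below simulates that rule; it is an assumption about the scheduling, and flush_times is a hypothetical helper rather than hf-mount's scheduler.

```python
def flush_times(event_times, debounce_s=2.0, max_window_s=30.0):
    """Return the times at which flushes would fire for the given dirty events."""
    flushes = []
    batch_start = None
    deadline = None
    for t in sorted(event_times):
        if deadline is not None and t >= deadline:
            flushes.append(deadline)  # the pending batch flushed before this event
            batch_start = None
        if batch_start is None:
            batch_start = t           # the first dirty event opens a new batch
        # Each event pushes the flush back, capped by the batch window.
        deadline = min(t + debounce_s, batch_start + max_window_s)
    if deadline is not None:
        flushes.append(deadline)
    return flushes
```

Under this model, a burst of events at 0 s, 1 s, and 2.5 s yields a single flush at 4.5 s, while a file dirtied once per second for 40 s is flushed at 30 s and again at 42 s. The crash window mentioned above is the interval between an event and its flush.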
| | FUSE | NFS |
|---|---|---|
| Metadata revalidation | Per-file, within TTL | No (NFS uses file handles) |
| Page cache invalidation | Supported | Not supported by NFS protocol |
| Staleness window | ~10 s | Up to poll interval (30 s) |
| Write mode | Streaming by default | Advanced always |
hf-mount sits between your application and the Hugging Face Hub. It presents a standard filesystem interface (FUSE or NFS) and translates file operations into Hub API calls and storage fetches.
Reads go through an adaptive prefetch buffer that starts small and grows with sequential access. Writes are uploaded to HF storage and committed via the Hub API. A background poll loop keeps the local view in sync with remote changes.
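A common way to build a prefetch buffer that "starts small and grows with sequential access" is to double a read-ahead window on each sequential read and reset it on a seek. The sketch below shows that general technique only; the doubling policy, the window sizes, and the AdaptivePrefetch class are assumptions, not a description of hf-mount's actual buffer.

```python
class AdaptivePrefetch:
    """Grow the read-ahead window on sequential reads; reset it on a seek."""

    def __init__(self, initial: int = 128 * 1024, maximum: int = 8 * 1024 * 1024):
        self.initial = initial   # hypothetical starting window
        self.maximum = maximum   # hypothetical cap
        self.window = initial
        self.next_offset = None  # where a sequential read would continue

    def on_read(self, offset: int, length: int) -> int:
        """Record a read and return how many bytes to fetch ahead of it."""
        if self.next_offset is not None and offset == self.next_offset:
            self.window = min(self.window * 2, self.maximum)  # sequential: grow
        else:
            self.window = self.initial                        # random: start over
        self.next_offset = offset + length
        return self.window
```

The design trades a little wasted bandwidth on long sequential scans for fewer round-trips, while keeping random access cheap by never fetching far ahead of an isolated read.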
Built on xet-core for content-addressed storage and efficient file transfers, and fuser for the FUSE implementation.
Use the hf-csi-driver to mount Buckets and repos as Kubernetes volumes. The CSI driver runs hf-mount inside a DaemonSet and exposes mounts to pods via the Container Storage Interface.
helm install hf-csi oci://ghcr.io/huggingface/charts/hf-csi-driver
See the hf-csi-driver README for setup and examples.
# Unit tests (no network, no token)
cargo test --lib --features fuse,nfs
# Integration tests (require HF_TOKEN and FUSE)
HF_TOKEN=... cargo test --release --features fuse,nfs --test fuse_ops -- --test-threads=1 --nocapture
HF_TOKEN=... cargo test --release --features fuse,nfs --test nfs_ops -- --test-threads=1 --nocapture
# Encrypted round-trip over NFS against a live bucket (requires HF_TOKEN)
HF_TOKEN=... cargo test --release --features nfs,encrypt --test encryption_ops -- --test-threads=1 --nocapture
# Repo mount test (public repo, no token needed)
cargo test --release --features nfs --test repo_ops -- --test-threads=1 --nocapture
# Benchmarks
HF_TOKEN=... cargo test --release --features fuse,nfs --test bench -- --nocapture
I'm getting "Operation not permitted" on macOS while opening or listing files in a mounted bucket using VSCode
You need to grant "Full Disk Access" to VSCode (System Settings > Privacy > Full Disk Access).
Apache-2.0
