A local S3-compatible server for your files. Find duplicates, verify integrity, zero config.
Prerequisites: Docker must be installed for the recommended method. Check with docker --version.
# Docker (recommended)
docker pull ghcr.io/deepjoy/shoebox:latest
# Or via Cargo (no Docker needed)
cargo install shoebox
# Point Shoebox at a directory
shoebox ~/Photos
# Or with Docker
docker run -it --rm -p 9000:9000 -v ~/Photos:/photos ghcr.io/deepjoy/shoebox /photos
# Output:
# Serving 1 bucket on http://localhost:9000
# photos → /home/user/Photos
Files already on disk appear in S3 immediately, with no uploading required. Credentials are generated on first run and printed in the output. To enable browser access (CORS), follow the on-screen instructions, or use the AWS CLI:
# Configure credentials (printed on first run)
aws configure --profile shoebox
# List objects
aws --profile shoebox --endpoint-url http://localhost:9000 s3 ls s3://photos/
- S3-compatible API — works with AWS CLI, rclone, and any S3 SDK out of the box
- Zero-config startup — just point at directories, no cloud account or configuration needed
- Duplicate detection — find and merge duplicate files and directories via content hashing
- Integrity verification — scheduled checks to detect bit rot and data corruption
- Filesystem sync — background scanning with move detection, real-time file watching
- Authentication — AWS Signature V4, per-bucket credentials, pre-signed URLs
- Multipart uploads — full support for large file uploads
- CORS — browser-based clients work out of the box
- Webhook notifications — get notified on object events (put, delete, copy)
- Single binary, ~18MB — no runtime dependencies
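The authentication bullet above mentions AWS Signature V4. As a rough illustration of what a client does before sending a signed request, here is a minimal sketch of the standard SigV4 signing-key derivation (the four-step HMAC-SHA256 chain from the AWS specification). It is not Shoebox's own code; the function names are hypothetical, and the region `us-east-1` and service `s3` in the usage note are simply the values that appear in this README's curl example.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of `message` under `key`."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Derive a SigV4 signing key.

    `date_stamp` is the request date as YYYYMMDD. Each step keys the next
    HMAC with the previous digest, ending with the literal "aws4_request".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature of an already-built SigV4 string-to-sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()
```

For example, `derive_signing_key(secret, "20240101", "us-east-1", "s3")` yields a 32-byte key that is valid only for that day, region, and service. Building the canonical request and string-to-sign is left out here; in practice the AWS CLI, rclone, or an S3 SDK handles all of it.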
Shoebox hashes every file (SHA-256) in the background. Finding duplicates is a query:
$ shoebox duplicates ~/Photos --format table
Duplicate groups (2 groups, 5 files, 3 duplicates):
Hash (SHA-256) Size Files
─────────────────────────────────────────────
a]3f…c8d1 32 B 3 copies
originals/sunset.txt
backup/sunset.txt ← duplicate
edited/sunset-copy.txt ← duplicate
7b2e…f104 26 B 2 copies
originals/mountain.txt
backup/mountain.txt ← duplicate
A companion browser UI is available at https://deepjoy.github.io/shoebox-webapp/.
Browse buckets, view objects, and see duplicate groups visually — no CLI needed. The webapp talks directly to your local Shoebox server via the S3 API.
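To make the idea behind the duplicate report concrete, the sketch below groups files by their SHA-256 content hash and keeps only the groups with more than one member. It is an illustration of the general technique, not Shoebox's implementation (which hashes in the background and answers from stored hashes); `sha256_of` and `find_duplicate_groups` are hypothetical names.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_groups(root: Path) -> dict[str, list[Path]]:
    """Map each content hash to its files, keeping only hashes seen more than once."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups[sha256_of(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicate_groups(Path.home() / "Photos")` rescans and rehashes everything on each call, which is exactly the cost a persistent hash index avoids.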
CORS setup (required for browser access): Shoebox prints this command on startup; copy and run it:
export AWS_ACCESS_KEY_ID='<from startup output>'
export AWS_SECRET_ACCESS_KEY='<from startup output>'
export BUCKET='photos'
curl -X PUT "http://localhost:9000/${BUCKET}?cors" \
--aws-sigv4 "aws:amz:us-east-1:s3" \
--user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
-H "Content-Type: application/json" \
-d '[{"allowed_origins":["*"],"allowed_methods":["GET","PUT","POST","DELETE","HEAD"],"allowed_headers":["*"],"expose_headers":["ETag","x-amz-request-id"],"max_age_seconds":3600}]'
- Developers — test S3 integrations without cloud dependencies, work offline
- Home users — expose NAS storage to S3-compatible backup tools, find duplicates with a single query
- Archivists — verify file integrity with content hashes, detect bit rot
- Privacy-conscious users — keep files local, no account required, no telemetry
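The archivist use case in the list above comes down to recording a content hash once and comparing against it later. A minimal sketch of that idea follows, assuming a simple in-memory manifest of relative path to SHA-256 digest; the function names and manifest shape are hypothetical and say nothing about how Shoebox stores or schedules its own checks.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file (reads the whole file at once)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record the current hash of every file under `root`, keyed by relative path."""
    return {
        str(path.relative_to(root)): hash_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content has changed."""
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or hash_file(target) != expected:
            problems.append(relative)
    return problems
```

A mismatch alone cannot distinguish bit rot from a deliberate edit, so a real checker would also compare modification times or consult a change log before raising an alarm.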
| Concern | Cloud S3 | MinIO | SeaweedFS | Garage | Shoebox |
|---|---|---|---|---|---|
| Primary strength | Scalability, AWS ecosystem | High performance, enterprise | Small files, high throughput | Simplicity, geo-replication | Existing files, zero config |
| Best for | Production workloads | AI/ML, large data (TB/PB) | Data lakes, file storage | Edge/distributed, low ops | Local dev, NAS, home lab |
| Architecture | Managed service | Specialized nodes | Master/volume servers | Homogeneous nodes | Single process |
| Setup | Account + IAM | Docker + config | Docker + config | Docker + config | Single command |
| Data location | Cloud | MinIO data dir | SeaweedFS volumes | Garage data dir | Your existing files |
| File visibility | S3 only | S3 only | S3, FUSE, WebDAV | S3 only | Filesystem + S3 |
| Offline use | No | Yes | Yes | Yes | Yes |
| Binary size | N/A | ~100MB | ~40MB | ~25MB | ~18MB |
| Duplicate detection | No | No | No | No | Built-in |
| Integrity checks | Yes (default checksums) | Yes (bitrot healing) | Limited (CRC) | Yes (scrub) | Built-in (scheduled) |
| Max recommended scale | Unlimited | Petabytes | Petabytes | Petabytes | ~10TB |
See docs/why-shoebox.md for the full story.
See docs/when-not-to-use-shoebox.md for an honest assessment of limitations, including:
- Strong consistency requirements
- Distributed / multi-node storage
- >10TB of data
- Enterprise S3 features (object lock, lifecycle policies, versioning)
- High-throughput ingestion (thousands of files/second)
- Quickstart — Running in 5 minutes
- Installation — Docker, cargo install, from source
- User Guides — Configuration, credentials, S3 compatibility, and more
See CONTRIBUTING.md for development setup and guidelines.
See SECURITY.md for the security model and how to report vulnerabilities.
MIT
Shoebox operates directly on your existing files — it does not copy data into a separate storage directory. S3 operations like DeleteObject and PutObject will modify or remove real files on disk. Back up anything irreplaceable before use. This is pre-1.0 software provided "as is" with no warranty. See LICENSE for details. The authors are not liable for any data loss.
I had 2TB of photos across 3 drives — backups of backups, originals I was afraid to delete. I set out to find duplicate photos and accidentally designed a local S3 server. If an object store knows the content hash of every file, duplicates are just a query. This is a personal project built in public — expect breaking changes before 1.0. If you have thoughts on the approach, open an issue or start a discussion.
