The Cube in a Box is a simple way to run the Open Data Cube. The current repository is a fork of https://github.com/opendatacube/cube-in-a-box with the following major changes:
- Multi-user JupyterHub
- shared folders for collaboration
- Planetary Computer as a STAC datastore
- ODC Explorer
- Traefik integration as a reverse proxy
- Multi-architecture support (AMD64 & ARM64)
- Dask integration for parallel processing
- Admin and User documentation (Quarto)
All these developments have been made possible thanks to the financial support of the European Union ‘Horizon Europe Program’, which funded the LandShift (Grant Agreement no. 101182007), Nostradamus (Grant Agreement no. 101134888), and NEMESIS (Grant Agreement no. 101219087) projects.
- `Makefile`: Main entry point for all commands (start, stop, index, etc.).
- `docker-compose.yml`: Docker services definition.
- `hub/`: Configuration for JupyterHub.
  - `Dockerfile`: Custom JupyterHub image definition.
  - `jupyterhub_config.py`: Main configuration file.
  - `custom_authenticator.py`: Custom logic to restrict signups to authorized users.
  - `spawner_hooks.py`: Helper functions to configure user environments (volumes, permissions) before spawning.
  - `templates/`: Custom UI templates (e.g., signup page).
- `data/`: Configuration and data persistence.
  - `jupyterhub_data/`: JupyterHub database and state.
  - `local_data/`: Mapped to `/local_data` in containers.
  - `shared/`: Read-only shared folder for all users.
- `docs/`: Built documentation (Quarto).
- `quarto/`: Source files for documentation.
- `products/`: ODC product definitions.
This project is run via docker compose and a Makefile. Before running docker compose commands through make, ensure you have:
- Docker with Docker Compose support
- GNU Make
Below are platform-specific setup instructions.
Linux:
- Install Docker Engine
  - Install Docker Engine for your distribution (Ubuntu/Debian/Fedora, etc.) using the official Docker instructions.
  - Add your user to the `docker` group so you can run Docker without `sudo`, then log out and back in.
- Install Docker Compose
  - Recent Docker Engine installations include the Compose plugin and expose it as `docker compose ...`.
  - Verify: `docker --version` and `docker compose version`
- Install Make
  - Install GNU Make using your package manager.
  - Verify: `make --version`
macOS:
- Install Docker Desktop
  - Install Docker Desktop for Mac and ensure it is running.
  - Verify: `docker --version` and `docker compose version`
- Install Make
  - macOS typically has `make` available via Xcode Command Line Tools.
  - Install if needed: `xcode-select --install`
  - Verify: `make --version`
The simplest way to use make on Windows is to run the project inside WSL2 (Windows Subsystem for Linux) while using Docker Desktop as the Docker backend.
- Install WSL2
  - Install WSL2 and a Linux distribution (Ubuntu is a common choice).
  - Open your WSL terminal (e.g., Ubuntu).
- Install Docker Desktop
  - Install Docker Desktop for Windows and enable:
    - Use WSL 2 based engine
    - WSL Integration for your chosen Linux distribution (Settings → Resources → WSL Integration)
- Install Make inside WSL
  - In your WSL terminal, install GNU Make:
    - Debian/Ubuntu: `sudo apt update && sudo apt install -y make`
  - Verify: `make --version`
- Verify Docker access from WSL
  - In your WSL terminal, run `docker --version` and `docker compose version`.
  - If these work, WSL is correctly talking to Docker Desktop.
Notes:
- Run all `make ...` commands from the WSL terminal (not PowerShell) to ensure a consistent Linux-like environment.
- Store the repository inside the WSL filesystem (e.g., `~/projects/...`) for better performance than `/mnt/c/...`.
Once installed, you should be able to run `make --version`, `docker --version`, and `docker compose version`.
This repository uses environment variables to configure the local domain, database credentials, and the Jupyter password.
- Create a `.env` file (Docker Compose reads `.env` by default): `cp .env.default .env`
- Edit `.env` to match your setup:
  - Set a strong password for `POSTGRES_PASS`
  - Configure `JUPYTERHUB_ADMINS` with admin usernames
  - Optionally add regular users to `JUPYTERHUB_USERS`
| Variable | Required | Default (as provided) | Example | Description |
|---|---|---|---|---|
| `DOMAIN` | Yes | `localhost` | `localhost` | Hostname used to access the web endpoints (Jupyter/Explorer). |
| `IMAGE_VERSION` | Yes | `20260211` | `20260211` | Version of the images to use. |
| `POSTGRES_HOSTNAME` | Yes | `postgres` | `postgres` | Hostname used to access the PostgreSQL database. |
| `POSTGRES_PORT` | Yes | `5432` | `5432` | Port used to access the PostgreSQL database. |
| `POSTGRES_DBNAME` | Yes | `opendatacube` | `opendatacube` | PostgreSQL database name used by Open Data Cube. |
| `POSTGRES_USER` | Yes | `opendatacube` | `opendatacube` | PostgreSQL user for the Open Data Cube database. |
| `POSTGRES_PASS` | Yes | `opendatacubepassword` | `a-strong-password` | PostgreSQL password for the Open Data Cube database. |
| `JUPYTERHUB_ADMINS` | Yes | `admin` | `admin,bruno` | Comma-separated list of JupyterHub admin usernames. |
| `JUPYTERHUB_USERS` | No | `guest` | `guest,alice,bob` | Comma-separated list of authorized non-admin usernames. |
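As a quick sanity check before starting the stack, the variables in the table above can be validated with a few lines of Python. This is only a sketch: the variable names and the default password come from the table, while the parsing rules (plain `KEY=VALUE` lines, `#` comments) and the helper names `parse_env`, `missing_required`, and `uses_default_password` are assumptions made for illustration and are not part of the repository.

```python
# Required variable names are taken from the table above.
REQUIRED = [
    "DOMAIN", "IMAGE_VERSION", "POSTGRES_HOSTNAME", "POSTGRES_PORT",
    "POSTGRES_DBNAME", "POSTGRES_USER", "POSTGRES_PASS", "JUPYTERHUB_ADMINS",
]


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    Simplified assumption: no multi-line values, no variable interpolation.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_required(values):
    """Return the required variables that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


def uses_default_password(values):
    """True when POSTGRES_PASS is still the default shipped in .env.default."""
    return values.get("POSTGRES_PASS") == "opendatacubepassword"
```

To use it, read the file with `parse_env(open(".env").read())`, then refuse to continue if `missing_required(...)` is non-empty or `uses_default_password(...)` is true.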
These variables are automatically set in the environment but can be overridden if needed. They are crucial for mapping host paths to user container volumes when spawning containers from within the JupyterHub container.
| Variable | Description |
|---|---|
| `HOST_PRODUCTS_DIR` | Host path to the `./products` directory. |
| `HOST_DATA_DIR` | Host path to the `./data/local_data` directory. |
| `HOST_DISTRIBUTED_CONFIG` | Host path to the `distributed.yaml` file. |
| `HOST_SHARED_STATIC` | Host path to the `./shared` directory. |
| `HOST_USER_FOLDERS` | Host path to the `./data/shared` directory. |
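To illustrate why host paths are needed, here is a hedged sketch of how such variables could be turned into a volume mapping in the host-path to `{"bind": ..., "mode": ...}` form that DockerSpawner's `volumes` option accepts. The function name `build_volumes` is hypothetical, the read/write modes are assumptions based on the shared-folder description in this README, and the real logic lives in `hub/spawner_hooks.py`, which is not shown here.

```python
import posixpath


def build_volumes(env, username):
    """Map host paths to container mount points for one user.

    Container paths /local_data and /notebooks/shared come from this README;
    the modes and the per-user subfolder layout are illustrative assumptions.
    """
    host_users = env["HOST_USER_FOLDERS"]
    return {
        env["HOST_DATA_DIR"]: {"bind": "/local_data", "mode": "rw"},
        env["HOST_SHARED_STATIC"]: {"bind": "/notebooks/shared", "mode": "ro"},
        # Everyone can read all user folders...
        host_users: {"bind": "/notebooks/shared/all_users", "mode": "ro"},
        # ...but only the owner's folder is mounted read-write.
        posixpath.join(host_users, username): {
            "bind": f"/notebooks/shared/all_users/{username}",
            "mode": "rw",
        },
    }
```

Because the spawned containers are created by the Docker daemon on the host, the keys must be host paths rather than paths inside the JupyterHub container, which is why these variables exist.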
JupyterHub uses NativeAuthenticator with a custom signup handler that restricts signup to pre-authorized users.
Admin users can also authorize new users through the JupyterHub admin panel.
- Authorized Users: Only users listed in `JUPYTERHUB_ADMINS` or `JUPYTERHUB_USERS` in the `.env` file, or manually added by an admin user, can successfully sign up
- Unauthorized Users: Users who are not authorized see an error message directing them to contact the administrator if they try to sign up
- Self-Service Signup: Authorized users can create their own accounts via the signup page
- Admin Creation: Administrators can also create user accounts through the JupyterHub admin panel
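The signup rule described above boils down to a set-membership test. The sketch below is an illustration of that rule only, not the code in `custom_authenticator.py` (which is not reproduced in this README); the function names `parse_user_list` and `is_authorized` and the `added_by_admin` parameter are hypothetical.

```python
def parse_user_list(value):
    """Split a comma-separated variable such as JUPYTERHUB_USERS into a set of names."""
    return {name.strip() for name in (value or "").split(",") if name.strip()}


def is_authorized(username, admins, users, added_by_admin=frozenset()):
    """Return True when the username may sign up.

    admins / users are the raw JUPYTERHUB_ADMINS / JUPYTERHUB_USERS strings;
    added_by_admin stands in for users added through the admin panel.
    """
    allowed = parse_user_list(admins) | parse_user_list(users) | set(added_by_admin)
    return username in allowed
```

For example, `is_authorized("alice", "admin,bruno", "guest,alice,bob")` is true, while an unknown name is rejected until an admin adds it.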
Method 1: Via .env file (Recommended for initial setup)
- Edit the `.env` file:

      # Admin users (have full control over JupyterHub)
      JUPYTERHUB_ADMINS=admin,bob
      # Regular users (can only access their own notebooks)
      JUPYTERHUB_USERS=guest,charlie

- Restart JupyterHub to apply changes: `docker compose restart jupyterhub`. If the new values are not picked up, recreate the container with `docker compose up -d jupyterhub`, because `restart` does not re-read environment changes.
- Users can now visit `http://<DOMAIN>/jupyter/hub/signup` to create their accounts
Method 2: Via JupyterHub Admin Panel (For ad-hoc user additions)
- Log in as an admin user
- Go to File > Hub Control Panel > Admin, or navigate to `http://<DOMAIN>/jupyter/hub/admin`
- Click Add Users
- Enter the username and click Add
- The user is created immediately and can sign up
For Authorized Users:
- Visit `http://<DOMAIN>/jupyter/hub/signup`
- Fill in username (must match an authorized one), password, and optional email
- Submit the form
- See success message: "The signup was successful! You can now go to the home page and log in to the system."
- Log in at `http://<DOMAIN>/jupyter/hub/login`
For Unauthorized Users:
- Contact the administrator to be added
View all users:
- Log in as admin, then navigate to `http://<DOMAIN>/jupyter/hub/admin`
- You'll see a list of all users with their status and last activity
Edit user:
- Click "Edit User" next to any user
- You can make them admin, delete them, or manage their servers
Delete user:
- Click "Edit User" → "Delete User"
- This removes the user account but doesn't delete their notebook files (stored in the `jupyterhub-user-<username>` volume)
Password reset or recovery:
- Using the JupyterHub admin interface, delete the user and re-create it without deleting the user volume
- Inform the user that they will have to sign up again
Each user's notebooks are stored in a Docker volume named jupyterhub-user-<username>. These volumes persist even if the user account is deleted.
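The naming convention makes it easy to script per-user operations. The sketch below builds the volume name and assembles a `docker run` backup invocation as an argument list, mirroring the documented manual backup command for user notebooks; the helper names `user_volume` and `backup_command` and the `/srv/backups` path are hypothetical and used only for illustration.

```python
import shlex


def user_volume(username):
    """Docker volume name holding a user's notebooks (convention from this README)."""
    return f"jupyterhub-user-{username}"


def backup_command(username, backup_dir):
    """Build the argv for archiving a user's volume into backup_dir on the host."""
    archive = f"/backup/user-{username}-notebooks.tar.gz"
    return [
        "docker", "run", "--rm",
        "-v", f"{user_volume(username)}:/source",
        "-v", f"{backup_dir}:/backup",
        "alpine", "tar", "czf", archive, "-C", "/source", ".",
    ]


# Show the command as a shell-quoted string; it is printed, not executed.
print(shlex.join(backup_command("alice", "/srv/backups")))
```

Passing the list to `subprocess.run(..., check=True)` would execute it; building an argument list rather than a shell string avoids quoting problems with unusual usernames.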
Backup user notebooks:

    docker run --rm -v jupyterhub-user-<username>:/source -v $(pwd)/backups:/backup alpine tar czf /backup/user-<username>-notebooks.tar.gz -C /source .

Restore user notebooks:

    docker run --rm -v jupyterhub-user-<username>:/target -v $(pwd)/backups:/backup alpine tar xzf /backup/user-<username>-notebooks.tar.gz -C /target

Remove all user volumes:

    make purge-users CONFIRM=1

Remove a specific user volume:

    make purge-user HUB_USER=alice CONFIRM=1

- Use strong passwords for admin accounts
- Regularly review the user list in the admin panel
- Remove unused accounts to minimize security risks
- Backup user data regularly (see Backup and Restore section)
- Keep admin users minimal - only trusted users should have admin access
All interaction with the stack is wrapped behind make targets. To see the authoritative list on your machine:

    make help

| Command | Description |
|---|---|
| Runtime Control | |
| `make up` | Start the environment in the background (then open Jupyter in your browser) |
| `make down` | Stop the running services (keeps your data and images) |
| `make status` | Show what is running (containers and their status) |
| `make logs` | Show live logs from all services (useful for troubleshooting) |
| `make docs` | Render Quarto documentation (Admin and User guides) |
| `make shell` | Open a terminal inside the Jupyter container (requires HUB_USER) |
| `make wait-for-db` | Wait for PostgreSQL to be ready to accept connections |
| Setup & Init | |
| `make setup` | First-time setup (mode-dependent: uses pull in prod, build in dev) |
| `make init` | Initialize the Open Data Cube database (run once after setup) |
| `make build` | Build the images locally |
| `make build-nocache` | Build the images locally from scratch |
| `make pull` | Download all service images (recommended before first run in prod mode) |
| Data & Indexing | |
| `make product` | Load product definitions into the database (describes available datasets) |
| `make index` | Index example data for the selected area/time (uses BBOX and DATETIME) |
| `make index-parallel` | Index data using the automated script (recommended) |
| `make index-serie` | Index data step-by-step (older method; slower) |
| `make update-explorer` | Rebuild the Explorer index so datasets appear in the web UI |
| Maintenance | |
| `make backup` | Create a backup of the PostgreSQL database |
| `make restore` | Restore PostgreSQL database from a backup file (requires BACKUP_FILE and CONFIRM=1) |
| `make clean` | Stop everything and remove containers, volumes, and built images |
| `make purge-data` | Delete local data in ./data (pg, local_data, shared). Irreversible; requires CONFIRM=1 |
| `make purge-user` | Remove a specific user container and volume. Irreversible; requires HUB_USER and CONFIRM=1 |
| `make purge-users` | Remove all spawned JupyterHub user containers and volumes. Irreversible; requires CONFIRM=1 |
| Advanced | |
| `make release-push` | Build and push multi-architecture production images to the configured container registry |
| `make help` | Show available commands |
- First-time setup (default parameters):

      make setup

- Setup with a specific area/time (BBOX, DATETIME):

      # Switzerland 1 year
      make setup BBOX=5.95,45.81,10.50,47.81 DATETIME=2024-01-01/2024-12-31
      # Switzerland all years (till end 2025, might take a while)
      make setup BBOX=5.95,45.81,10.50,47.81 DATETIME=1984-01-01/2025-12-31

- Start/stop and troubleshoot:

      make up
      make status
      make logs
      make down

- Reset options (use with care):

      # Stop everything and remove containers/volumes/images
      make clean
      # Irreversible: delete local data in ./data (requires confirmation)
      make purge-data CONFIRM=1
      # ⚠️ TOTAL WIPE OUT (USE WITH EXTRA CARE !!!)
      make clean && make purge-data CONFIRM=1 && make purge-users CONFIRM=1
      # Then check for any leftovers
      docker ps -a && docker images -a && docker volume ls && ls -la ./data

- Dev mode (local builds):

      # Set dev mode for the entire session
      export MODE=dev
      make up   # or any other make target
      # Go back to prod mode
      unset MODE
      # One-off dev invocation (not recommended, as it may need to be repeated across several commands)
      make up MODE=dev
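Before launching a long indexing run, it can help to check that the BBOX and DATETIME values passed to `make setup` are well formed. The following is a minimal sketch: the `min_lon,min_lat,max_lon,max_lat` order is inferred from the Switzerland example above, the `start/end` date format is taken from the same example, and the function names `parse_bbox` and `parse_datetime_range` are hypothetical. The Makefile may accept other forms that this check would reject.

```python
from datetime import date


def parse_bbox(text):
    """Parse 'min_lon,min_lat,max_lon,max_lat' into four floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 4:
        raise ValueError("BBOX needs four comma-separated numbers")
    min_lon, min_lat, max_lon, max_lat = parts
    if not (min_lon < max_lon and min_lat < max_lat):
        raise ValueError("BBOX minimums must be smaller than maximums")
    return min_lon, min_lat, max_lon, max_lat


def parse_datetime_range(text):
    """Parse 'YYYY-MM-DD/YYYY-MM-DD' into a (start, end) pair of dates."""
    start_text, sep, end_text = text.partition("/")
    if not sep:
        raise ValueError("DATETIME needs a 'start/end' range")
    start, end = date.fromisoformat(start_text), date.fromisoformat(end_text)
    if start > end:
        raise ValueError("DATETIME start must not be after end")
    return start, end
```

Both functions raise `ValueError` on malformed input, so a wrapper script can stop before any data is requested.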
- JupyterHub is available at `http://<DOMAIN>/jupyter/` (log in through NativeAuthenticator; admin users are defined in `JUPYTERHUB_ADMINS`)
- Explorer is available at `http://<DOMAIN>/explorer`
Detailed documentation is available in the docs/ directory (built using Quarto):
- Admin Guide: docs/admin/index.html
- User Guide: docs/user/index.html
This stack uses Traefik v3 as a reverse proxy. It handles routing based on the hostname (DOMAIN) and path prefixes (/jupyter, /explorer). Traefik also manages the internal docker network for service communication.
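The routing rule can be pictured as a host check followed by a path-prefix lookup. The sketch below is a conceptual illustration only, not Traefik's matcher or this repository's configuration: the service name `jupyterhub` appears elsewhere in this README, `explorer` is an assumed name, and the segment-wise prefix check is a simplification.

```python
# Path prefix -> backend service. "explorer" is an assumed service name.
ROUTES = [("/jupyter", "jupyterhub"), ("/explorer", "explorer")]


def route(host, path, domain):
    """Return the service that would receive the request, or None."""
    if host != domain:
        return None  # hostname does not match DOMAIN
    for prefix, service in ROUTES:
        if path == prefix or path.startswith(prefix + "/"):
            return service
    return None
```

With `DOMAIN=localhost`, a request for `/jupyter/hub/login` resolves to the JupyterHub service and `/explorer` to the Explorer, while any other host or path is not routed.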
JupyterHub uses the DockerSpawner with a Docker-out-of-Docker (DooD) pattern. The JupyterHub container has access to the host's /var/run/docker.sock, allowing it to spawn user notebook containers directly on the host machine. This ensures that user environments are isolated and can be managed by standard Docker tools.
The environment is pre-configured for Dask parallel processing. The distributed.yaml file ensures that the Dask dashboard is accessible through the Jupyter proxy at /jupyter/proxy/{port}/status.
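As a small illustration of the path pattern above, the dashboard address can be derived from the domain and the port the Dask scheduler's dashboard listens on. This is a sketch under assumptions: 8787 is Dask's default dashboard port, and the helper name `dashboard_url` and the `http` scheme are illustrative. The actual contents of `distributed.yaml` are not shown in this README.

```python
# Path template as described in this README; {port} is filled in per cluster.
DASHBOARD_PATH = "/jupyter/proxy/{port}/status"


def dashboard_url(domain, port=8787, scheme="http"):
    """Build the proxied Dask dashboard URL for a given dashboard port."""
    return f"{scheme}://{domain}" + DASHBOARD_PATH.format(port=port)
```

If a second cluster is started in the same session, Dask typically picks another port, so the port in the URL needs to match the one reported by the client.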
This directory is shared among users of the JupyterHub instance.
The primary purpose of this folder is to facilitate file sharing and collaboration between users.
The /notebooks/shared directory contains:
- Static Content: Any file or directory in the `./shared` folder on the host is mounted here read-only, with the exception of the user's own folder.
- User Folders: The `all_users` directory contains individual user folders.
- Static Content: Files directly under `/notebooks/shared/` (from the host `./shared` folder) are read-only for everyone in the Jupyter interface. To modify them, an admin must edit them on the host machine.
- User Folders: Under `/notebooks/shared/all_users/`, you can see other users' folders (read-only) and your own folder (read-write). This allows you to copy notebooks from others but not modify their work directly.
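The permission rules above can be summarized as a single predicate: inside the shared tree, a path is writable only if it lies in the current user's own folder. The sketch below encodes that rule for illustration; the function name `can_write` is hypothetical, and the real enforcement is done by the container mounts, not by Python code.

```python
from pathlib import PurePosixPath

SHARED_ROOT = PurePosixPath("/notebooks/shared")
USERS_ROOT = SHARED_ROOT / "all_users"


def can_write(path, username):
    """True when path is the user's own shared folder or something inside it.

    Only models the shared tree; paths elsewhere (e.g. the home directory)
    are outside the scope of this check and return False.
    """
    target = PurePosixPath(path)
    own_folder = USERS_ROOT / username
    return target == own_folder or own_folder in target.parents
```

For instance, `alice` can write to `/notebooks/shared/all_users/alice/analysis.ipynb`, while `bob` can only read it and must copy it to his own workspace to make changes.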
To work on a shared notebook, copy it to your own workspace, e.g.:

    cp -r /notebooks/shared/notebooks_demo ~/my_notebooks_demo
    cp /notebooks/shared/all_users/alice/analysis.ipynb ~/from_alice.ipynb

- Visibility: Content in this folder is visible to all users.
- Data Safety: Do not place sensitive credentials or private data in this directory.
To create a backup of your PostgreSQL database:

    make backup

This will create a timestamped SQL dump file in the `./backups` directory (e.g., `./backups/opendatacube_20260121_141530.sql`).
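The timestamped naming scheme sorts chronologically, which makes it simple to locate the most recent dump. The sketch below reproduces the `<dbname>_YYYYMMDD_HHMMSS.sql` pattern seen in the example file name; the helper names `backup_filename` and `latest_backup` are hypothetical, and the Makefile's own implementation is not shown here.

```python
from datetime import datetime


def backup_filename(dbname, when):
    """Build a dump path following the pattern of the example file name."""
    return f"./backups/{dbname}_{when:%Y%m%d_%H%M%S}.sql"


def latest_backup(paths):
    """Return the newest dump; zero-padded timestamps sort lexicographically."""
    return max(paths) if paths else None
```

The result of `latest_backup(...)` can be passed as `BACKUP_FILE` to `make restore`.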
To restore a database from a backup file:
make restore BACKUP_FILE=./backups/opendatacube_20260121_141530.sql CONFIRM=1
⚠️ WARNING: Restoring will overwrite your current database. Make sure you have a recent backup before proceeding.
The following directories contain persistent data and should be backed up regularly:
- `./data/pg/` - PostgreSQL database files
- `./data/local_data/` - Local data cache
- `./data/jupyterhub_data/` - JupyterHub configuration and user data
- User notebooks are stored in Docker volumes named `jupyterhub-user-<username>`
Manual volume backup:

    # Backup user notebooks
    docker run --rm -v jupyterhub-user-<username>:/source -v $(pwd)/backups:/backup alpine tar czf /backup/user-<username>-notebooks.tar.gz -C /source .
    # Backup all data directories
    tar czf backups/data-backup-$(date +%Y%m%d).tar.gz ./data/

Restore user notebooks:

    # Restore user notebooks
    docker run --rm -v jupyterhub-user-<username>:/target -v $(pwd)/backups:/backup alpine tar xzf /backup/user-<username>-notebooks.tar.gz -C /target

This project is licensed under the MIT License.
Copyright (c) 2018 Alex Leith
Copyright © 2025 UNIGE/GRID-Geneva
You are free to use, modify, and distribute this software under the terms of the MIT License. For more details, see the full license text: MIT.