Mergen is a full-stack manuscript layout analysis app that runs three YOLO models in parallel, combines results into a unified annotation set, and provides interactive review and export tools.
- Detects manuscript layout/content elements with 3 models: emanuskript, catmus, zone.
- Supports single-image and batch ZIP workflows.
- Returns unified COCO annotations with class filtering.
- Produces annotated image previews and aggregate statistics.
- Exports COCO JSON, annotated images, ZIP bundles, and PAGE XML.
- Includes authenticated analytics endpoints for usage reporting.
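The merge step behind the unified COCO output with class filtering can be pictured roughly as follows. This is a minimal sketch, not the backend's actual code: the function name `merge_detections`, the detection keys `class_name` and `bbox`, and the extra `source_model` field are all assumptions made for illustration.

```python
def merge_detections(per_model, keep_classes=None):
    """Merge per-model detections into one COCO-style dict.

    per_model maps a model name to a list of detections. Each detection is a
    dict with the hypothetical keys "class_name" and "bbox" ([x, y, w, h]).
    keep_classes, if given, is the set of class names to retain.
    """
    categories = {}  # class name -> category id, assigned on first sight
    annotations = []
    for model_name, detections in per_model.items():
        for det in detections:
            name = det["class_name"]
            if keep_classes is not None and name not in keep_classes:
                continue  # class filtering
            cat_id = categories.setdefault(name, len(categories) + 1)
            x, y, w, h = det["bbox"]
            annotations.append({
                "id": len(annotations) + 1,   # ids are unique across models
                "image_id": 1,                # single-image case
                "category_id": cat_id,
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
                "source_model": model_name,   # non-standard, illustrative field
            })
    return {
        "images": [{"id": 1}],
        "categories": [{"id": i, "name": n} for n, i in categories.items()],
        "annotations": annotations,
    }
```

Calling it with detections from two models and `keep_classes={"Table"}` would return only the Table annotations, renumbered from 1.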
- Frontend: Next.js 16 (App Router), React 19, TypeScript.
- Backend: FastAPI + Uvicorn.
- Inference: Ultralytics YOLO models executed in a multiprocessing pool.
- Reverse proxy (container deployment): Caddy.
- Alternative host deployment: systemd + nginx via deploy.sh.
The deployment script can download the required model weights directly from OwnCloud:
https://owncloud.gwdg.de/index.php/s/PyQ2nN6aKpypKfG?path=%2FApps%2FLayout%20App%2Fmodel%20weights
Place these files in backend/models:
- best_emanuskript_segmentation.pt
- best_catmus.pt
- best_zone_detection.pt
If you are deploying with ./deploy.sh, it will download these weights automatically into backend/models and re-download them if validation detects corruption.
If deployment still fails with PytorchStreamReader failed reading zip archive, the checkpoint download or local file is still corrupted and should be fetched again.
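To check a downloaded weight file by hand before starting the backend, a quick integrity test along these lines may help. It is a sketch that assumes the checkpoints use PyTorch's zip-based format (which the PytorchStreamReader error suggests); the helper name `checkpoint_looks_intact` is hypothetical, and this is not necessarily the validation that deploy.sh performs.

```python
import zipfile
from pathlib import Path


def checkpoint_looks_intact(path):
    """Return True if the file exists and is a readable, CRC-clean zip archive."""
    p = Path(path)
    if not p.is_file() or not zipfile.is_zipfile(str(p)):
        return False  # missing, truncated, or not a zip-format checkpoint
    try:
        with zipfile.ZipFile(str(p)) as archive:
            # testzip() returns the first corrupt member name, or None if all pass
            return archive.testzip() is None
    except zipfile.BadZipFile:
        return False
```

For example, `checkpoint_looks_intact("backend/models/best_catmus.pt")` returning False would be a reason to fetch that file again. A True result only means the archive is structurally sound, not that the model itself loads.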
The backend exposes 22 final COCO classes, including:
- Layout: Border, Table, Diagram, Column
- Script: Main script black/coloured, Variant script black/coloured
- Initials: Historiated, Inhabited, Zoo - Anthropomorphic, Embellished, Plain initial variants
- Navigation/content: Page Number, Quire Mark, Running header, Catchword, Gloss, Illustrations
- Music: Music
Base prefix: /api
- GET /health
- GET /classes
- POST /predict/single
- POST /predict/batch
- GET /predict/batch/{task_id}/progress (SSE)
- GET /predict/batch/{task_id}/results
- GET /download/{task_id}/coco_json
- GET /download/{task_id}/annotated_image
- GET /download/{task_id}/annotated/{index}
- GET /download/{task_id}/results_zip
- GET /download/{task_id}/page_xml
- POST /analytics/login
- GET /analytics/data (JWT required)
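The batch progress endpoint streams server-sent events (SSE). A client can read that stream with a small parser like the one below. This is a sketch of generic SSE handling only: the helper name `iter_sse_data` is hypothetical, and the payload format the backend puts in each event is not specified here, so the parser returns the raw data strings.

```python
def iter_sse_data(lines):
    """Yield the data payload of each complete server-sent event.

    lines is any iterable of text lines, e.g. a decoded HTTP response body.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the data
            buffer.append(value)
        # Comment lines (starting with ":") and other fields are ignored here.
```

In practice the lines would come from an HTTP response opened against /api/predict/batch/{task_id}/progress, with each line decoded to text before being passed in.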
Prerequisites:
- Python 3.11+
- Node.js 20+
- npm
git clone https://github.com/emanuskript/Mergen.git
cd Mergen
Create backend/models and copy the three .pt files listed above.
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools wheel
pip install --index-url https://download.pytorch.org/whl/cpu torch torchvision
pip install -e backend
cd backend
MODEL_DIR="$(pwd)/models" CORS_ORIGINS="http://localhost:3000" uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 1
Backend URL: http://localhost:8000
Docs: http://localhost:8000/docs
In a new terminal:
cd frontend
npm install
BACKEND_URL=http://localhost:8000 npm run dev -- --hostname 0.0.0.0 --port 3000
Frontend URL: http://localhost:3000/analyze
You can deploy with either Docker Compose (recommended) or the direct host deployment script.
- Ensure the model files are present in backend/models.
- Optionally set environment values:
  - SITE_ADDRESS (for Caddy, default :80)
  - JWT_SECRET
  - ANALYTICS_USERNAME
  - ANALYTICS_PASSWORD
Run:
docker compose up -d --build
This starts:
- backend service (FastAPI)
- frontend service (Next.js standalone)
- caddy (reverse proxy on ports 80/443)
The repository includes deploy.sh, which installs dependencies, builds the frontend, provisions systemd services, and configures nginx.
Run from project root:
chmod +x deploy.sh
./deploy.sh
The script defaults the backend inference pool to a single worker on direct host deployments to keep low-memory CPU VMs stable. You can override that when needed, for example:
BACKEND_MAX_POOL_WORKERS=2 ./deploy.sh
Backend environment variables:
- MODEL_DIR (defaults to backend/models)
- CORS_ORIGINS (comma-separated, default http://localhost:3000)
- MAX_POOL_WORKERS (default 3)
- JWT_SECRET
- ANALYTICS_USERNAME
- ANALYTICS_PASSWORD
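To make the defaults above concrete, here is one way such variables could be read. This is a sketch under stated assumptions rather than the contents of backend/app/config.py: the function name `load_backend_settings` and the returned dictionary keys are hypothetical, and the secrets are left out on purpose.

```python
import os


def load_backend_settings(env=None):
    """Read backend settings from an environment mapping, applying documented defaults."""
    env = os.environ if env is None else env
    raw_origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return {
        "model_dir": env.get("MODEL_DIR", "backend/models"),
        # CORS_ORIGINS is comma-separated; empty entries are dropped
        "cors_origins": [o.strip() for o in raw_origins.split(",") if o.strip()],
        "max_pool_workers": int(env.get("MAX_POOL_WORKERS", "3")),
    }
```

Passing a plain dict instead of relying on `os.environ` makes it easy to check a configuration before exporting it in a shell or a compose file.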
Frontend build/runtime:
- INTERNAL_BACKEND_URL (used by the standalone frontend runtime when /api/* is not already proxied at the public edge)
- BACKEND_URL (legacy fallback supported by next.config.ts)
Default credentials are configured in backend/app/config.py and should be changed in production:
- username: admin
- password: layout2024
.
├── backend/
│ ├── app/
│ ├── models/
│ └── pyproject.toml
├── frontend/
├── docker-compose.yml
├── Caddyfile
└── deploy.sh
Licensed under Apache 2.0. See LICENSE.