diff --git a/AGENTS.md b/AGENTS.md index cbe60e9e1f..e90f32387a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -130,6 +130,14 @@ records for DTOs/events/value objects · `default!` for required non-nullable st - **Module** — new `Modules.{Name}` + `.Contracts`, implement `IModule` w/ assembly-level `[assembly: FshModule(typeof(XModule), order)]`, register in **all four places**, add migration folder + tests. Details: `architecture.md`. - **React page** — API module (`src/api/`) → page → register lazy route → (admin) mirror permission + RouteGuard → Playwright test. Details: `frontend/shared.md`. +## Worklog (deployment/ops journal) + +Real deployment and ops sessions are journaled in **`worklog/`** — one +`YYYY-MM-DD-.md` per session plus a self-contained `index.html` dashboard. +**When a session does deployment/infra/ops work, append a session log and refresh +the dashboard before finishing** (conventions in `worklog/README.md`; never store +secrets there). The reusable deploy runbook is `deploy/docker/DEPLOY-VPS.md`. + ## AI tooling resources - **Rules** — `.agents/rules/*.md` (indexed above). Read on demand. diff --git a/deploy/docker/DEPLOY-VPS.md b/deploy/docker/DEPLOY-VPS.md new file mode 100644 index 0000000000..da4700493d --- /dev/null +++ b/deploy/docker/DEPLOY-VPS.md @@ -0,0 +1,399 @@ +# Deploying to a fresh VPS — battle-tested runbook + +A start-to-finish guide for deploying this project on a plain Ubuntu VPS (written on a +Hostinger KVM 2 — 2 vCPU / 8 GB RAM / 100 GB disk — Ubuntu 24.04, but any provider works). +It records every command that was actually run, every error hit along the way, and the fix +for each, so a redeploy on a new server is copy-paste. + +**Placeholders used throughout — replace every occurrence:** + +| Placeholder | Meaning | Example from the first deploy | +|---|---|---| +| `<VPS_IP>` | Your server's public IPv4 | `2.25.69.231` | +| `<DOMAIN>` | Your domain | `sabinstack.cloud` | + +The three public surfaces end up at `api.<DOMAIN>`, `admin.<DOMAIN>`, `app.<DOMAIN>`.
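
As a convenience, the two placeholders can be held in shell variables on your workstation and expanded into the three hostnames. This is only a sketch: `DOMAIN`, `VPS_IP`, and the helper `fsh_hosts` are hypothetical names that do not exist in the repo, and the sample values are reserved documentation values, not the ones from the first deploy.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the three public hostnames for a given domain.
fsh_hosts() {
  for sub in api admin app; do
    printf '%s.%s\n' "$sub" "$1"
  done
}

DOMAIN="example.com"     # assumption: substitute your real domain
VPS_IP="203.0.113.10"    # assumption: substitute your server's public IPv4

# One line per A record that section 5 of the runbook asks for.
for host in $(fsh_hosts "$DOMAIN"); do
  echo "$host -> $VPS_IP"
done
```

Deriving the hostnames from one variable keeps the DNS records, the Caddyfile, and the `.env` URLs from drifting apart through typos.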
+ +--- + +## Architecture at a glance + +What the finished deployment looks like. Only Caddy is reachable from the internet; the +three app containers are loopback-bound, and the data plane has no published ports at all. + +```mermaid +flowchart TB + user(["User's browser"]) + + dns["DNS provider + A records: api / admin / app → VPS_IP"] + + subgraph vps["Ubuntu VPS — VPS_IP"] + fw["ufw firewall — only 22, 80, 443 open"] + caddy["Caddy reverse proxy + auto-HTTPS via Let's Encrypt"] + + subgraph stack["Docker Compose stack (fsh)"] + api["fsh-api — ASP.NET Core + 127.0.0.1:8080"] + admin["fsh-admin — React admin + 127.0.0.1:8081"] + dash["fsh-dashboard — React tenant + 127.0.0.1:8082"] + mig["fsh-migrator — one-shot + migrate + seed, then exits"] + subgraph data["Data plane — no published ports"] + pg[("fsh-postgres")] + rd[("fsh-redis")] + mn[("fsh-minio")] + end + end + end + + user -. "1 — resolve api / admin / app .DOMAIN" .-> dns + user -- "2 — HTTPS :443" --> fw --> caddy + caddy -- "api.DOMAIN → :8080" --> api + caddy -- "admin.DOMAIN → :8081" --> admin + caddy -- "app.DOMAIN → :8082" --> dash + api --> pg & rd & mn + mig --> pg +``` + +## Deployment flow (with the errors we hit) + +The happy path top to bottom, with the two failure branches from the first deploy and +where they re-join. + +```mermaid +flowchart TD + s1["1 — ssh in, apt update + upgrade, reboot"] --> s2["2 — install Docker via get.docker.com"] + s2 --> s3["3 — git clone + create .env + secrets via openssl rand"] + s3 --> s4["4 — docker compose up -d --build + ~4 min first time"] + s4 --> e1{"fsh-postgres healthy?"} + e1 -- "no — log says: Error: in 18+ these + Docker images ..." 
--> f1["FIX: mount pg_data at + /var/lib/postgresql + docker compose down -v && up -d"] + f1 --> s5 + e1 -- yes --> s5["5 — DNS: three A records → VPS_IP + wait until dig +short shows the IP"] + s5 --> s6["6 — install Caddy, write Caddyfile, + systemctl reload caddy"] + e2 -- yes --> s7["7 — .env: https URLs + + 127.0.0.1: port prefixes, + force-recreate api admin dashboard"] + s6 --> e2{"journalctl shows + certificate obtained?"} + e2 -- "no — placeholder yourdomain.com + left in Caddyfile" --> f2["FIX: real domain in + /etc/caddy/Caddyfile, + reload caddy"] + f2 --> s7 + s7 --> s8["8 — ufw allow 22, 80, 443 + enable"] + s8 --> s9["9 — verify: curl health endpoints, + sign in, rotate admin password"] + s9 --> done(["deployed ✔"]) +``` + +--- + +## 0. What you need + +- A VPS: 2+ vCPU, 4+ GB RAM (8 GB comfortable), ~10 GB free disk. Root SSH access. +- A domain you control (any registrar). Required — the SPAs + API need three HTTPS + subdomains; raw-IP HTTP breaks CORS/cookies and is test-only. +- ~45 minutes, most of it waiting for the first Docker build. + +## 1. SSH in and update the OS + +```bash +ssh root@<VPS_IP> +apt update && apt upgrade -y +reboot # if the upgrade installed a new kernel — do it now, before anything runs +# wait ~30s, then ssh back in +``` + +## 2. Install Docker + +```bash +curl -fsSL https://get.docker.com | sh +docker compose version # expect v2.x +``` + +## 3. Clone the repo and create `.env` + +```bash +git clone https://github.com/sabinshrestha/dotnet-starter-kit.git +cd ~/dotnet-starter-kit/deploy/docker +cp .env.example .env +``` + +Generate the secrets: + +```bash +openssl rand -base64 48 # → JWT_SIGNING_KEY +openssl rand -hex 16 # run 4× → POSTGRES_PASSWORD, REDIS_PASSWORD, + # MINIO_ROOT_PASSWORD, HANGFIRE_PASSWORD +``` + +Edit `.env` (`nano .env`) and fill in (final production values — see §7 for why the ports get the +`127.0.0.1:` prefix): + +```ini +FSH_API_URL=https://api.<DOMAIN> +FSH_ADMIN_URL=https://admin.<DOMAIN> +FSH_DASHBOARD_URL=https://app.<DOMAIN>
+ +FSH_API_PORT=127.0.0.1:8080 +FSH_ADMIN_PORT=127.0.0.1:8081 +FSH_DASHBOARD_PORT=127.0.0.1:8082 + +JWT_SIGNING_KEY= +SEED_ADMIN_PASSWORD= +HANGFIRE_USERNAME=hangfire +HANGFIRE_PASSWORD= +POSTGRES_PASSWORD= +REDIS_PASSWORD= +MINIO_ROOT_USER=minioadmin +MINIO_ROOT_PASSWORD= +``` + +> `.env` holds all secrets. Never commit it; keep a copy somewhere safe — losing +> `POSTGRES_PASSWORD` against an existing data volume is painful (see troubleshooting). + +## 4. Build and launch the stack + +```bash +cd ~/dotnet-starter-kit/deploy/docker +docker compose up -d --build +``` + +First build compiles the whole .NET solution + both React apps inside Docker: +**~4 minutes on 2 vCPUs** (measured). Later runs are cached and take seconds. + +Watch the migrator apply migrations and seed the root tenant + admin user: + +```bash +docker compose logs -f migrator # wait for "[migrator] finished successfully."; Ctrl+C to exit +``` + +Expected (harmless) noise in that log: + +- Two `ERR ... SELECT "MigrationId" ... "__EFMigrationsHistory"` lines — EF probing for + its history table before creating it on a virgin database. First-run only. +- `Cannot load library libgssapi_krb5.so.2` — the Npgsql driver probing for Kerberos, + which isn't in the slim image. Password auth is used; ignore. + +Verify: + +```bash +docker compose ps # 6 services Up; migrator + minio-init Exited (0) +curl -fsS http://localhost:8080/health/live # {"status":"Healthy",...} +``` + +### ⚠ Error we hit here: `fsh-postgres` unhealthy / crashes ~1s after start + +``` +✘ Container fsh-postgres Error dependency postgres failed to start +dependency failed to start: container fsh-postgres is unhealthy +``` + +`docker compose logs postgres` shows `Error: in 18+, these Docker images are configured +to store database data in ...`. + +**Cause:** the `postgres:18` images moved PGDATA to `/var/lib/postgresql//docker` +and their entrypoint refuses to start when a volume is mounted at the legacy +`/var/lib/postgresql/data` path. 
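
Before running the destructive fix, it may help to confirm that the checkout really still has the legacy mount. A minimal sketch, assuming the volume line matches the string quoted in this runbook; `has_legacy_pg_mount` is a hypothetical helper name, not something shipped in the repo.

```bash
# Hypothetical helper: succeeds only if the compose file still mounts
# pg_data at the legacy .../data path that the postgres:18 image rejects.
has_legacy_pg_mount() {
  grep -Eq 'pg_data:/var/lib/postgresql/data[[:space:]]*$' "$1" 2>/dev/null
}

# Run from deploy/docker; a missing file is treated the same as "no match".
if has_legacy_pg_mount docker-compose.yml; then
  echo "legacy mount found: apply the fix"
else
  echo "no legacy mount found: nothing to change"
fi
```

The end-of-line anchor in the pattern keeps the already-fixed line (`pg_data:/var/lib/postgresql`) from matching.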
+ +**Fix** (already applied to `docker-compose.yml` in this repo — only needed if you're on +an older checkout where the volume line still says `pg_data:/var/lib/postgresql/data`): + +```bash +docker compose down -v # DESTRUCTIVE — fine on first boot, never on a live DB +sed -i 's#pg_data:/var/lib/postgresql/data#pg_data:/var/lib/postgresql#' docker-compose.yml +docker compose up -d +``` + +## 5. Point DNS at the VPS + +At your registrar/DNS panel, add three **A records**, all → `<VPS_IP>`: + +| Type | Name | Points to | +|---|---|---| +| A | `api` | `<VPS_IP>` | +| A | `admin` | `<VPS_IP>` | +| A | `app` | `<VPS_IP>` | + +Wait for propagation (usually minutes), and **do not continue until**: + +```bash +dig +short api.<DOMAIN> # each must print <VPS_IP> +dig +short admin.<DOMAIN> +dig +short app.<DOMAIN> +``` + +## 6. Install Caddy (reverse proxy + automatic HTTPS) + +```bash +apt install -y debian-keyring debian-archive-keyring apt-transport-https curl +curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg +curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | tee /etc/apt/sources.list.d/caddy-stable.list +apt update && apt install -y caddy +``` + +Replace `/etc/caddy/Caddyfile` entirely with (**use your real domain — see the error box +below**): + +``` +api.<DOMAIN> { + reverse_proxy localhost:8080 +} + +admin.<DOMAIN> { + reverse_proxy localhost:8081 +} + +app.<DOMAIN> { + reverse_proxy localhost:8082 +} +``` + +```bash +systemctl reload caddy +# ~30s later, confirm three "certificate obtained successfully" lines: +journalctl -u caddy --since "2 min ago" | grep -i certificate +``` + +WebSockets (SignalR chat/notifications) pass through `reverse_proxy` automatically. + +### ⚠ Error we hit here: left the placeholder domain in the Caddyfile + +``` +"error":"HTTP 400 urn:ietf:params:acme:error:tls - ... remote error: tls: no application protocol" +"job failed","error":"admin.yourdomain.com: obtaining certificate: ..."
+``` + +**Cause:** the Caddyfile was pasted with the literal placeholder `yourdomain.com` still in +it, so Let's Encrypt tried (and failed forever) to validate a domain we don't control. + +**Fix:** edit `/etc/caddy/Caddyfile`, replace every hostname with the real domain, +`systemctl reload caddy`. Certificates were issued within seconds of the fix. + +## 7. Apply the public URLs to the app + +If you set the final values in §3 and haven't started with test URLs, just recreate the +three app containers so they re-read the environment (no rebuild needed — the URLs are +baked into `/config.json` at container **start**, and CORS is plain API env): + +```bash +cd ~/dotnet-starter-kit/deploy/docker +docker compose up -d --force-recreate api admin dashboard +``` + +Why `127.0.0.1:` in front of the ports: **Docker's published ports bypass ufw** (Docker +writes its own iptables rules), so a firewall alone does NOT block 8080–8082 from the +internet. Binding them to loopback makes them unreachable from outside regardless of +firewall state; Caddy still reaches them via `localhost`. + +## 8. Firewall + +```bash +ufw allow 22/tcp # FIRST — or you lock yourself out of SSH +ufw allow 80/tcp +ufw allow 443/tcp +ufw enable # answer y +ufw status verbose +``` + +If your provider has a cloud firewall (e.g. Hostinger hPanel → VPS → Firewall rules), +either leave it disabled or allow the same three ports there — it filters *outside* the +VM, so it wins over anything configured inside. + +## 9. Final verification + +```bash +curl -fsS https://api.<DOMAIN>/health/live # {"status":"Healthy",...} +curl -fsSI https://admin.<DOMAIN> | head -1 # HTTP/2 200 +curl -fsSI https://app.<DOMAIN> | head -1 # HTTP/2 200 +curl -m 5 http://<VPS_IP>:8081 || echo blocked # should fail (timeout or refused) and print "blocked": loopback binding works +``` + +Sign in at `https://admin.<DOMAIN>`: + +- **email** `admin@root.com` · **tenant** `root` · **password** = `SEED_ADMIN_PASSWORD` +- This is the only seeded account.
**Rotate the password immediately** + (Settings → Security), then create real users from the admin app. +- Hangfire dashboard: `https://api.<DOMAIN>/jobs` (login = `HANGFIRE_USERNAME`/`HANGFIRE_PASSWORD`). +- API reference: `https://api.<DOMAIN>/scalar`. + +## How a request flows once deployed + +Useful mental model when debugging: the React apps run in the user's browser and call the +API **through Caddy at the public URL** (`FSH_API_URL`) — never container-to-container. +That's why the `.env` URLs must exactly match what the browser sees (CORS). + +```mermaid +sequenceDiagram + participant B as Browser + participant D as DNS + participant C as Caddy (VPS :443) + participant A as fsh-admin (nginx) + participant P as fsh-api + participant DB as fsh-postgres + + B->>D: resolve admin.DOMAIN + D-->>B: VPS_IP + B->>C: GET https://admin.DOMAIN + C->>A: proxy → localhost:8081 + A-->>B: React app + /config.json (contains FSH_API_URL) + Note over B: user submits login form + B->>C: POST https://api.DOMAIN/api/token + C->>P: proxy → localhost:8080 + P->>DB: verify credentials (tenant root) + DB-->>P: ok + P-->>B: JWT access token + Note over B,P: all further API calls: Bearer token via api.DOMAIN +``` + +--- + +## Troubleshooting — every issue from the first deploy + +| Symptom | Cause | Fix | +|---|---|---| +| `fsh-postgres` unhealthy, dies ~1s after start; log says `Error: in 18+, these Docker images...` | postgres 18 image rejects a volume mounted at legacy `/var/lib/postgresql/data` | Mount at `/var/lib/postgresql` (§4). Already fixed in this repo's compose file. | +| Caddy logs `tls: no application protocol` / cert `job failed` for `yourdomain.com` | Placeholder domain left in the Caddyfile | Put the real domain in `/etc/caddy/Caddyfile`, `systemctl reload caddy` (§6). | +| Site loads from some networks but `ERR_CONNECTION_TIMED_OUT` from others (e.g. an office network), while `curl` on the VPS returns 200 | Corporate/ISP web filter blocking a **newly-registered domain** (or the TLD).
VPS-local curls succeed because they never leave the machine. | Nothing to fix server-side. Confirm via phone on mobile data or `check-host.net` TCP check on `:443`. Wait 24–72 h for domain categorization or ask IT to whitelist. Diagnose with `tcpdump -ni any 'tcp port 443'` — no packets during a browser reload ⇒ blocked upstream. | +| `ERR ... "__EFMigrationsHistory"` in migrator log | EF checks for its history table before creating it | First-run noise; ignore. | +| `Cannot load library libgssapi_krb5.so.2` | Npgsql probing Kerberos, absent in slim image | Ignore; password auth is used. | +| `xxx_PASSWORD is required` at `compose up` | Empty required var in `.env` | The error names the var; fill it. | +| `OptionsValidationException: SigningKey looks like a sample placeholder` | `JWT_SIGNING_KEY` contains `replace-with` | Generate a real key: `openssl rand -base64 48`. | +| Browser CORS error on the admin/dashboard app | An `FSH_*_URL` in `.env` doesn't exactly match the URL in the address bar (scheme/host, no trailing slash) | Fix `.env`, `docker compose up -d --force-recreate api admin dashboard`. | +| Migrator retries Postgres ~2 min then dies | Usually `POSTGRES_PASSWORD` changed against an existing `pg_data` volume | Restore the old password, or (data loss!) `docker compose down -v` and reseed. | +| Ports 8080–8082 reachable from the internet despite ufw | Docker's iptables rules bypass ufw | Loopback-bind the ports in `.env` (§7). 
| + +## Day-2 operations + +```bash +# Update to latest code (migrator re-runs idempotently before the API restarts) +cd ~/dotnet-starter-kit && git pull +cd deploy/docker && docker compose up -d --build + +# Logs +docker compose logs -f api # or admin / dashboard / postgres / redis / minio + +# Restart everything (volumes/data untouched) +docker compose restart + +# Backup the three stateful volumes (all persistent state lives here) +for v in fsh_pg_data fsh_redis_data fsh_minio_data; do + docker run --rm -v $v:/source:ro -v "$PWD":/backup alpine \ + tar czf /backup/$v-$(date +%Y%m%d).tar.gz -C /source . +done +``` + +Recommended extras: your provider's snapshot/backup add-on, and disabling SSH password +auth once your key is installed (`PasswordAuthentication no` in `/etc/ssh/sshd_config`, +then `systemctl restart ssh`). diff --git a/deploy/docker/README.md b/deploy/docker/README.md index bcb1304593..9159af4bd1 100644 --- a/deploy/docker/README.md +++ b/deploy/docker/README.md @@ -8,7 +8,7 @@ This brings up the full stack on a single host: | `admin` | `fsh/admin:local` | `FSH_ADMIN_PORT` (default 8081) | Operator console (nginx + React) | | `dashboard` | `fsh/dashboard:local` | `FSH_DASHBOARD_PORT` (default 8082) | Tenant dashboard (nginx + React) | | `migrator` | `fsh/dbmigrator:local` | — | One-shot: applies EF migrations + seeds the root tenant + creates the default admin user | -| `postgres` | `postgres:17-alpine` | (internal) | Identity, tenant catalog, module schemas | +| `postgres` | `postgres:18-alpine` | (internal) | Identity, tenant catalog, module schemas | | `redis` | `redis:7-alpine` | (internal) | HybridCache L2, Data Protection keys, idempotency store | | `minio` | `minio/minio:latest` | (internal) | S3-compatible blob store for the Files module | diff --git a/deploy/docker/docker-compose.yml b/deploy/docker/docker-compose.yml index d43c744f5b..f35d920624 100644 --- a/deploy/docker/docker-compose.yml +++ b/deploy/docker/docker-compose.yml @@ -19,7 
+19,9 @@ services: POSTGRES_USER: fsh POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?POSTGRES_PASSWORD is required} volumes: - - pg_data:/var/lib/postgresql/data + # 18+ images keep PGDATA under /var/lib/postgresql//docker and + # refuse to start with a volume mounted at the legacy .../data path. + - pg_data:/var/lib/postgresql - ./postgres-init:/docker-entrypoint-initdb.d:ro healthcheck: test: ["CMD-SHELL", "pg_isready -U fsh -d fsh"] diff --git a/worklog/2026-07-02-vps-deployment.md b/worklog/2026-07-02-vps-deployment.md new file mode 100644 index 0000000000..05e31a641e --- /dev/null +++ b/worklog/2026-07-02-vps-deployment.md @@ -0,0 +1,126 @@ +# 2026-07-02 — First production deployment to Hostinger VPS + +**Server:** `srv1784663.hstgr.cloud` / `2.25.69.231` — Hostinger KVM 2 (2 vCPU, 8 GB RAM, 100 GB NVMe), Ubuntu 24.04 +**Domain:** `sabinstack.cloud` → `api.` / `admin.` / `app.` subdomains +**Outcome:** full stack live over HTTPS; DB tooling (pgAdmin, DBeaver) wired up over SSH tunnels; repo fixes on PR [#1](https://github.com/sabinshrestha/dotnet-starter-kit/pull/1) + +--- + +## What was done (chronological) + +1. **OS prep** — `apt update && apt upgrade -y`, reboot for new kernel (6.8.0-134). +2. **Docker** — installed via `curl -fsSL https://get.docker.com | sh` (Engine 29.6.1). +3. **Clone + configure** — cloned repo, `cp .env.example .env` in `deploy/docker/`, + generated secrets (`openssl rand -base64 48` for JWT, `openssl rand -hex 16` for the rest). +4. **First launch** — `docker compose up -d --build` (build took ~3.7 min). + ❌ `fsh-postgres` crashed — see Error 1 below. Fixed, relaunched, migrator seeded successfully. +5. **DNS** — three A records (`api`, `admin`, `app`) → `2.25.69.231` in Hostinger DNS + (domain also carries Hostinger mail/parking records — untouched, they don't conflict). +6. **Caddy** — installed from the official apt repo; Caddyfile with the three subdomains + reverse-proxying to 8080/8081/8082. 
+ ❌ First attempt used the literal placeholder `yourdomain.com` — see Error 2. Fixed; + Let's Encrypt certificates issued for all three names at 00:40 UTC. +7. **Production URLs** — `.env` switched to `https://` URLs; app ports rebound to + loopback (`FSH_API_PORT=127.0.0.1:8080` etc.); `docker compose up -d --force-recreate api admin dashboard`. +8. **Firewall** — `ufw allow 22,80,443/tcp && ufw enable`. Hostinger cloud firewall left at 0 rules. + ❌ Site unreachable from one network — see Error 3 (not actually a server problem). +9. **DB tooling:** + - Removed a Hostinger-catalog Adminer (publicly exposed on `:32768`, wrong Docker + network — couldn't resolve `fsh-postgres`) and a catalog pgAdmin (same problems). + - pgAdmin installed properly as its own compose project (`~/pgadmin/docker-compose.yml`): + image `dpage/pgadmin4`, bound `127.0.0.1:8084:80`, joined to external network `fsh_default`. + - Postgres given a loopback-only host port: `127.0.0.1:5432:5432` in `deploy/docker/docker-compose.yml`. + - DBeaver on the PC connects via its built-in SSH tunnel (key auth) → `localhost:5432`. +10. **Repo changes** (branch `claude/hostinger-vps-hosting-kfl6zu`, PR #1): + postgres 18 volume fix, `deploy/docker/DEPLOY-VPS.md` runbook with mermaid diagrams, + this worklog. + +## Errors hit & fixes + +### Error 1 — `fsh-postgres` unhealthy, dies ~1 s after start + +``` +✘ Container fsh-postgres Error dependency postgres failed to start +Error: in 18+, these Docker images are configured to store database data in ... +``` + +**Cause:** `postgres:18` images moved PGDATA to `/var/lib/postgresql/18/docker` and refuse +a volume mounted at the legacy `/var/lib/postgresql/data` path. +**Fix:** mount `pg_data:/var/lib/postgresql` instead; `docker compose down -v` (safe — +first boot, empty DB) and relaunch. Committed to the repo so fresh clones don't hit it. 
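
After a relaunch like this one, the container's health can be polled instead of eyeballed. A sketch, assuming the `fsh-postgres` container name from this log and Docker's `State.Health.Status` inspect field; `wait_for_status` and `POLL_SECONDS` are hypothetical names introduced only for illustration.

```bash
# Hypothetical helper: run a probe command until it prints the wanted status.
# $1 = wanted status, $2 = max attempts, remaining args = the probe command.
wait_for_status() {
  wfs_want="$1"; wfs_tries="$2"; shift 2
  wfs_i=0
  while [ "$wfs_i" -lt "$wfs_tries" ]; do
    # Compare the probe's output with the wanted status; stop on a match.
    [ "$("$@" 2>/dev/null)" = "$wfs_want" ] && return 0
    wfs_i=$((wfs_i + 1))
    sleep "${POLL_SECONDS:-2}"
  done
  return 1
}

# Usage on the VPS (left commented out because it needs Docker):
# wait_for_status healthy 30 \
#   docker inspect --format '{{.State.Health.Status}}' fsh-postgres
```

Passing the probe as arguments keeps the loop independent of Docker, so the same helper could wrap any command that prints a one-word status.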
+ +### Error 2 — Caddy certificate failures for `yourdomain.com` + +``` +"error":"HTTP 400 urn:ietf:params:acme:error:tls - ... tls: no application protocol" +``` + +**Cause:** placeholder `yourdomain.com` pasted into `/etc/caddy/Caddyfile` unchanged. +**Fix:** real hostnames (`*.sabinstack.cloud`), `systemctl reload caddy` — certs issued in seconds. + +### Error 3 — `ERR_CONNECTION_TIMED_OUT` from the office network only + +Server-side curls returned 200; phone on mobile data worked; one network timed out. +**Cause:** corporate web filter blocking a **newly-registered domain** — nothing +server-side. Diagnosed by elimination + (optionally) `tcpdump -ni any 'tcp port 443'` +showing no inbound SYNs. **Fix:** wait for domain categorization (24–72 h) or ask IT +to whitelist. Documented in the runbook so future deploys don't chase ghosts. + +### Gotchas confirmed along the way + +- **Docker-published ports bypass ufw** → anything that must be private is bound to + `127.0.0.1`, never relying on the firewall. +- **Hostinger catalog apps** publish on random public ports on their own Docker + network → don't use the catalog for DB tools; self-install with loopback + `fsh_default`. +- **Hostinger Docker Manager** only lists *compose projects* — plain `docker run` + containers are invisible there (run tools as mini compose projects instead). +- SSH `Permission denied` while already on the VPS = tunnel command typed in the wrong + window; tunnels always run **on the PC**. Key passphrase ≠ server password. + +## How to look up credentials (no secrets stored here!) 
+ +All secrets live in **`~/dotnet-starter-kit/deploy/docker/.env` on the VPS** (never in git): + +```bash +grep POSTGRES_PASSWORD ~/dotnet-starter-kit/deploy/docker/.env # DB password (user fsh, db fsh) +grep SEED_ADMIN_PASSWORD ~/dotnet-starter-kit/deploy/docker/.env # app admin (admin@root.com, tenant root) +grep -E 'HANGFIRE|MINIO|REDIS|JWT' ~/dotnet-starter-kit/deploy/docker/.env +``` + +pgAdmin's own login is set in `~/pgadmin/docker-compose.yml` (`PGADMIN_DEFAULT_*`). +App admin password: rotated in-app after first login — the `.env` seed value is bootstrap-only. + +## Daily-use commands + +```bash +# DB browser (pgAdmin) — run on the PC, keep window open, browse http://localhost:8084 +ssh -L 8084:localhost:8084 root@2.25.69.231 + +# DBeaver: SSH tab → 2.25.69.231/root/key ; Main tab → localhost:5432, db fsh, user fsh + +# Update the app to latest main +cd ~/dotnet-starter-kit && git pull && cd deploy/docker && docker compose up -d --build + +# Logs / status +docker compose -f ~/dotnet-starter-kit/deploy/docker/docker-compose.yml ps +docker compose -f ~/dotnet-starter-kit/deploy/docker/docker-compose.yml logs -f api +``` + +## Risk register (state at end of session) + +| # | Risk | Severity | Status / mitigation | +|---|---|---|---| +| 1 | SSH password auth still enabled (`passwordauthentication yes`) — internet bots brute-force root continuously | **High** | **OPEN — next action.** Key login verified working; set `PasswordAuthentication no` + `PermitRootLogin prohibit-password`, keep hPanel browser terminal as recovery door | +| 2 | No backups configured (Hostinger snapshot count: 0; no volume backups) | **High** | **OPEN.** Enable Hostinger auto-backup ($6/mo) and/or cron the volume `tar` loop from the runbook | +| 3 | Secrets exist only in `.env` on the VPS — server loss = secrets loss | Medium | **OPEN.** Keep an offline copy in a password manager | +| 4 | `localhost:5432` now reachable by any process on the VPS (loopback publish for DBeaver) | Low | 
Accepted — strong 32-char password; keep untrusted software off the box | +| 5 | Direct prod-DB editing via pgAdmin/DBeaver can corrupt app-managed state | Low | Set DBeaver connection type to *Production*; prefer the admin app for writes | +| 6 | PR #1 unmerged — fresh clones of `main` still hit the postgres 18 crash | Medium | Merge PR #1 | +| 7 | Root admin seed password was in `.env` and typed around | Low | Rotated in-app (Settings → Security) — verify done | + +## Current state snapshot + +- **Public surface:** ports 22, 80, 443 only. TLS by Caddy (auto-renew). +- **Compose projects:** `fsh` (8 containers), `pgadmin` (1). Adminer & catalog apps removed. +- **Idle RAM:** ~437 MB of 8 GB. Disk: ~9 % of 96 GB. +- **URLs:** https://admin.sabinstack.cloud · https://app.sabinstack.cloud · https://api.sabinstack.cloud (`/scalar`, `/jobs`) diff --git a/worklog/README.md b/worklog/README.md new file mode 100644 index 0000000000..5d2ed81764 --- /dev/null +++ b/worklog/README.md @@ -0,0 +1,30 @@ +# Worklog + +Running operations journal for this project's real deployments and infra work. + +## Structure + +- `YYYY-MM-DD-.md` — one markdown log per working session: what was done, + every command that mattered, every error hit with its fix, risks identified, + and the state the system was left in. +- `index.html` — self-contained visual dashboard (open in any browser, no + server needed): architecture diagram, deployment timeline, charts, risk + register, and quick-reference commands. Updated alongside the session logs. + +## Conventions (for humans and AI sessions alike) + +1. **Every working session appends here.** New session → new + `YYYY-MM-DD-.md` + update `index.html` (session timeline, risk + register, current-state numbers) before finishing. +2. **Never store secrets.** No passwords, keys, or tokens — only *where* they + live and *how* to look them up (e.g. `grep POSTGRES_PASSWORD deploy/docker/.env` + on the VPS). +3. 
**Errors are the most valuable entries.** Record the exact error text, the + root cause, and the fix — future deployments grep this folder first. +4. **Keep `index.html` self-contained** — inline CSS/JS/SVG only, no CDN links. + +## Related docs + +- `deploy/docker/DEPLOY-VPS.md` — the reusable step-by-step deployment runbook + (distilled from these logs; update it when a session discovers something + every future deploy needs). diff --git a/worklog/index.html b/worklog/index.html new file mode 100644 index 0000000000..5a14e0ab0e --- /dev/null +++ b/worklog/index.html @@ -0,0 +1,442 @@ + + + + + +sabinstack.cloud — Deployment Worklog + + + +
+ +

sabinstack.cloud — deployment worklog

+

FullStackHero .NET Starter Kit on a Hostinger KVM 2 VPS · journal of real sessions, errors, fixes and risks

+
+ srv1784663 · 2.25.69.231 + Ubuntu 24.04 · Docker 29 + 2 vCPU · 8 GB RAM · 100 GB + Caddy + Let's Encrypt + Last update: 2026-07-02 +
+ +
+
+
HTTPS surfaces live
+
3 / 3
+
api · admin · app — certs auto-renew
+
+
+
Containers running
+
7
+
fsh ×6 + pgAdmin (2 one-shots exited 0)
+
+
+
Idle RAM — stack total
+
437 MB
+
of 8 GB (5.3%)
+
+
+
+
Public ports open
+
3
+
22 (SSH) · 80 · 443 — everything else loopback/internal
+
+
+ +

Architecture

+
+ [Architecture diagram (inline SVG). Nodes: User's browser (https://…sabinstack.cloud); Your PC (DBeaver, pgAdmin tab); DNS (api / admin / app → VPS); Ubuntu VPS 2.25.69.231 with ufw allowing only 22 / 80 / 443; Caddy :443 (TLS, routes by subdomain); sshd :22 (key auth, tunnels); fsh-api 127.0.0.1:8080; fsh-admin 127.0.0.1:8081; fsh-dashboard 127.0.0.1:8082; pgAdmin 127.0.0.1:8084; data plane on docker network fsh_default only, no public ports: fsh-postgres (+127.0.0.1:5432), fsh-redis, fsh-minio, migrator ✓. Edge labels: resolves to VPS IP; proxy by subdomain; ssh -L tunnels.]

Solid blue = public request path (HTTPS). Dashed = admin path over SSH tunnels. Dotted boxes = trust boundaries.

+
+ +

Idle memory per container — measured 2026-07-02, Hostinger container stats

+
+
+
+ +
+ [Bar chart x-axis: 0, 100, 200, 300 MB]
+ +
+

The API is the only heavy process — the whole stack idles at ~5% of the 8 GB machine, leaving ample headroom for real traffic and future modules.

+
+ Data table + + + +
| Container | Idle RAM (MB) |
+
+
+ +

Session timeline — 2026-07-02

+
+
    +
  1. 21:53 — SSH in · apt upgrade · reboot to kernel 6.8.0-134
  2. +
  3. 21:57 — Docker Engine 29.6.1 installed · repo cloned · .env secrets generated
  4. +
5. ~22:05 First launch fails: fsh-postgres unhealthy +
    Postgres 18 image rejects legacy volume path → remount at /var/lib/postgresql, down -v, relaunch
  6. +
  7. 00:10 — Migrator seeds root tenant, permissions, admin user · stack healthy on test URLs
  8. +
  9. ~00:30 — DNS A records for api/admin/app.sabinstack.cloud · Caddy installed
  10. +
11. 00:40 Cert failures: placeholder yourdomain.com left in Caddyfile +
    Real hostnames + systemctl reload caddy → 3 certificates issued seconds later
  12. +
13. 00:45 .env → https URLs, ports rebound to 127.0.0.1 · containers recreated · ufw enabled
  14. +
15. ~01:00 Timeout from office network only +
    Diagnosed as corporate filter on the newly-registered domain — server proven healthy from mobile & VPS
  16. +
  17. 15:30 — Catalog Adminer/pgAdmin removed (public ports, wrong network) · pgAdmin re-installed loopback-only on fsh_default
  18. +
  19. 15:50 — Postgres loopback port 5432 added · DBeaver connected via SSH tunnel · security review done
  20. +
  21. 16:00 — Repo: postgres fix + runbook + diagrams + this worklog → PR #1
  22. +
+
+ +

Errors & fixes (grep these first on the next deploy)

+
+
+

1 · fsh-postgres dies at start

+
Error: in 18+, these Docker images are
+configured to store database data in …
+

postgres:18 moved PGDATA; legacy mount path …/data is refused.

+

Fix: mount pg_data:/var/lib/postgresql (in repo since PR #1).

+
+
+

2 · Let's Encrypt "no application protocol"

+
acme:error:tls … yourdomain.com
+

Placeholder domain left in /etc/caddy/Caddyfile.

+

Fix: real hostnames, systemctl reload caddy.

+
+
+

3 · ERR_CONNECTION_TIMED_OUT (one network)

+
curl on VPS: 200 · office: timeout
+mobile data: works
+

Corporate filter blocks newly-registered domains. Not a server issue.

+

Fix: wait 24–72 h for categorization / ask IT. Verify with tcpdump 'tcp port 443'.

+
+
+ +

Risk register

+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
| Risk | Status | Action |
|---|---|---|
| SSH password auth still enabled; bots brute-force root 24/7 | Open · High | `PasswordAuthentication no` + `PermitRootLogin prohibit-password`; hPanel terminal stays as recovery door |
| No backups (snapshots 0, no volume dumps) | Open · High | Enable Hostinger auto-backup and/or cron the volume tar loop from the runbook |
| PR #1 unmerged; fresh clones still hit the postgres 18 crash | Open · Medium | Merge PR #1 into main |
| Secrets live only in .env on the VPS | Open · Medium | Copy to a password manager |
| Loopback 5432 reachable by any process on the VPS | Accepted · Low | Strong password; keep untrusted software off the box |
| Direct prod-DB edits can corrupt app-managed state | Accepted · Low | DBeaver connection type "Production"; prefer the admin app for writes |
| Public app ports / catalog tools exposure | Closed | Everything rebound to 127.0.0.1; catalog Adminer/pgAdmin removed |
+
+ +

Quick reference — passwords & daily commands

+
+
+

Where the secrets live (never in this repo)

+
# on the VPS — every credential:
+grep POSTGRES_PASSWORD  ~/dotnet-starter-kit/deploy/docker/.env
+grep SEED_ADMIN_PASSWORD ~/dotnet-starter-kit/deploy/docker/.env
+grep -E 'HANGFIRE|MINIO|REDIS|JWT' \
+  ~/dotnet-starter-kit/deploy/docker/.env
+
+# pgAdmin's own login:
+grep PGADMIN ~/pgadmin/docker-compose.yml
+

DB login: host fsh-postgres (or tunneled localhost:5432) · user fsh · db fsh. App admin: admin@root.com · tenant root.

+
+
+

Daily driver commands

+
# pgAdmin (run on the PC, keep open):
+ssh -L 8084:localhost:8084 root@2.25.69.231
+#   → browse http://localhost:8084
+
+# update the app:
+cd ~/dotnet-starter-kit && git pull
+cd deploy/docker && docker compose up -d --build
+
+# status / logs:
+docker compose ps
+docker compose logs -f api
+

DBeaver: SSH tab → 2.25.69.231/root/key · Main tab → localhost:5432.

+
+
+ +

Sessions

+
+ + + + + + + + + + +
| Date | Log | Summary |
|---|---|---|
| 2026-07-02 | 2026-07-02-vps-deployment.md | First production deploy: full stack + HTTPS on sabinstack.cloud; 3 errors diagnosed & fixed; DB tooling over SSH tunnels; runbook + fixes pushed as PR #1. |
+
+ +
Self-contained dashboard — no external requests. Update alongside each session log (see worklog/README.md conventions).
+
+ +
+ + +