This document describes the threat model, current mitigations, and operator responsibilities for running Ghost Alpha Bot safely.
- Bot token (`TELEGRAM_BOT_TOKEN`): full control of bot identity.
- User data (DB): Telegram chat IDs, alerts, journal/portfolio entries, giveaway state, wallet entries (if saved).
- Infrastructure credentials: Postgres/Redis URLs and secrets.
- System availability: preventing spam/flooding and expensive workloads from taking the bot down.
- Telegram → Webhook/Long-polling → Bot handlers: untrusted input (messages, callback data).
- HTTP API (FastAPI):
  - Public endpoints: `/health`, `/ready`, Telegram webhook path.
  - Restricted endpoints: `/metrics`, `/admin/stats`, `/tasks/*`, `/test/mock-price` (test mode only).
- Outbound HTTP to exchanges, RSS/news sources, and LLM providers: untrusted upstreams.
- Redis: shared cache + coordination; compromise impacts rate limits, locks, and dedupe.
- Postgres: source of truth for users, alerts, giveaways, journals, portfolios.
- Mitigations
  - Per-IP webhook rate limiting (`/telegram/webhook`).
  - Request-level rate limiting for chat interactions.
  - Per-user backpressure locks for heavy actions (analysis/scans/news/etc.).
  - Graceful degradation: serve cached results when upstream fetch fails.
  - Auto-block repeated rate-limit violators (Redis-backed abuse strikes + temporary blocklist).
- Operator actions
  - Run behind a reverse proxy/WAF if exposed publicly.
  - Prefer webhook secret token (`TELEGRAM_WEBHOOK_SECRET`) in production.
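
The rate-limiting and auto-block mitigations above can be sketched as a small fixed-window limiter with abuse strikes. This is a minimal in-memory illustration, not the bot's actual implementation: the class name `AbuseGuard`, the thresholds, and the use of plain dictionaries in place of Redis are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AbuseGuard:
    """In-memory stand-in for a Redis-backed limiter (hypothetical helper)."""

    limit: int = 5            # allowed requests per window (assumed value)
    window_s: float = 10.0    # window length in seconds (assumed value)
    max_strikes: int = 3      # violations before a temporary block (assumed)
    block_s: float = 60.0     # block duration in seconds (assumed value)
    _windows: dict = field(default_factory=dict)   # key -> (window_start, count)
    _strikes: dict = field(default_factory=dict)   # key -> violation count
    _blocked: dict = field(default_factory=dict)   # key -> blocked-until time

    def allow(self, key: str, now: float) -> bool:
        """Return True if the request identified by `key` may proceed."""
        # Reject immediately while a temporary block is active.
        if self._blocked.get(key, 0.0) > now:
            return False

        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_s:
            start, count = now, 0  # the old window expired; start a new one

        if count < self.limit:
            self._windows[key] = (start, count + 1)
            return True

        # Over the limit: record a strike and block repeat offenders.
        self._windows[key] = (start, count)
        strikes = self._strikes.get(key, 0) + 1
        self._strikes[key] = strikes
        if strikes >= self.max_strikes:
            self._blocked[key] = now + self.block_s
            self._strikes[key] = 0
        return False
```

The key would typically be a client IP for the webhook path or a chat ID for chat interactions. Passing `now` explicitly keeps the sketch deterministic; a shared deployment would need the counters in Redis so that every instance sees the same state.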
- Mitigations
  - `/tasks/*`, `/metrics`, `/admin/stats` require cron authorization (`CRON_SECRET`) or native Vercel cron header.
  - `/test/*` endpoints gated by `TEST_MODE`.
- Operator actions
  - Set a strong `CRON_SECRET` in staging/prod.
  - Disable `TEST_MODE` in staging/prod.
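
As an illustration of the cron authorization check, the sketch below generates a strong secret and compares a presented bearer token against it in constant time. It assumes the secret arrives as an `Authorization: Bearer <secret>` header and that header names are already lowercased; the function names are hypothetical, and the native Vercel cron header path is not covered.

```python
import hmac
import secrets
from typing import Mapping, Optional


def generate_secret(nbytes: int = 32) -> str:
    """Return a URL-safe random string suitable for use as CRON_SECRET."""
    return secrets.token_urlsafe(nbytes)


def is_cron_authorized(headers: Mapping[str, str], cron_secret: Optional[str]) -> bool:
    """Check an `Authorization: Bearer <secret>` header in constant time."""
    if not cron_secret:
        return False  # fail closed when no secret is configured
    auth = headers.get("authorization", "")  # assumes lowercased header names
    scheme, _, token = auth.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), cron_secret.encode())
```

Failing closed when the secret is unset means a misconfigured deployment rejects requests to restricted endpoints instead of silently exposing them.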
- Mitigations
  - Redis distributed locks around scheduled tasks.
  - Idempotency keys for alerts to prevent duplicates.
- Operator actions
  - Ensure all instances share the same Redis.
  - Monitor Redis availability; degraded Redis reduces safety controls.
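
Both the task locks and the alert idempotency keys rest on the same primitive: claim a key only if nobody holds it, with an expiry. The sketch below imitates Redis `SET key value NX EX ttl` with an in-memory class so the behaviour can be followed without a server. `ExpiringKeyStore`, `alert_idempotency_key`, and the key layout are assumptions for illustration, not the bot's real names.

```python
import hashlib


class ExpiringKeyStore:
    """In-memory stand-in for Redis `SET key value NX EX ttl` (hypothetical)."""

    def __init__(self) -> None:
        self._expiry: dict[str, float] = {}

    def set_if_absent(self, key: str, ttl_s: float, now: float) -> bool:
        """Claim `key` for `ttl_s` seconds; return False if it is already held."""
        if self._expiry.get(key, 0.0) > now:
            return False
        self._expiry[key] = now + ttl_s
        return True


def alert_idempotency_key(alert_id: int, trigger_bucket: str) -> str:
    """Derive a stable key so the same alert firing is delivered once."""
    raw = f"alert:{alert_id}:{trigger_bucket}".encode()
    return "idem:" + hashlib.sha256(raw).hexdigest()


store = ExpiringKeyStore()
# Only the first instance to claim the lock runs the scheduled task.
first = store.set_if_absent("lock:task:check_alerts", ttl_s=30, now=0.0)
second = store.set_if_absent("lock:task:check_alerts", ttl_s=30, now=5.0)
```

Here `first` is `True` and `second` is `False`. This also shows why all instances must share one Redis: with separate stores, each instance would win its own lock and the task would run more than once.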
- Mitigations
  - Strict startup validation for required settings in staging/prod.
  - Structured logging avoids dumping large objects; do not log secrets.
- Operator actions
  - Never commit `.env`.
  - Use managed secrets (Vercel env vars / GitHub secrets / secret manager).
  - Rotate `TELEGRAM_BOT_TOKEN` and `CRON_SECRET` on suspected exposure.
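
A rough sketch of what strict startup validation could look like follows. Which settings count as required, and `staging` as an accepted `ENV` value, are assumptions inferred from this document; the function name is hypothetical. The messages name the setting but never include its value, in keeping with the rule not to log secrets.

```python
from typing import Mapping

# Settings treated as required here are an assumption drawn from this document.
REQUIRED_IN_DEPLOYED_ENVS = ("TELEGRAM_BOT_TOKEN", "CRON_SECRET")


def validate_settings(env: Mapping[str, str]) -> list[str]:
    """Return a list of problems; an empty list means startup may continue."""
    problems: list[str] = []
    if env.get("ENV", "dev") not in ("staging", "prod"):
        return problems  # relaxed checks outside staging/prod

    for name in REQUIRED_IN_DEPLOYED_ENVS:
        if not env.get(name, "").strip():
            problems.append(f"{name} is required")
    if env.get("TEST_MODE", "false").strip().lower() == "true":
        problems.append("TEST_MODE must be disabled")
    uses_webhook = env.get("TELEGRAM_USE_WEBHOOK", "").strip().lower() == "true"
    if uses_webhook and not env.get("TELEGRAM_WEBHOOK_URL", "").strip():
        problems.append("TELEGRAM_WEBHOOK_URL is required with TELEGRAM_USE_WEBHOOK=true")
    return problems
```

A caller would typically pass `os.environ` at process start and abort if the returned list is non-empty.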
- Mitigations
  - DB access uses SQLAlchemy; avoid interpolating raw SQL.
  - Intent/entity parsing is mostly deterministic; LLM router is optional and guarded.
- Operator actions
  - Keep dependencies updated; run CI checks.
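
To show why raw SQL interpolation is avoided, the sketch below binds user input as parameters. It uses the standard-library `sqlite3` driver instead of SQLAlchemy so that it runs without extra dependencies; the same bound-parameter principle applies there. The `alerts` table and its columns are hypothetical.

```python
import sqlite3


def find_alerts(conn: sqlite3.Connection, chat_id: int, symbol: str) -> list[tuple]:
    """Look up alerts using bound parameters rather than string formatting."""
    # The `?` placeholders are filled by the driver, so `symbol` is always
    # treated as data and never parsed as SQL.
    cur = conn.execute(
        "SELECT id, symbol FROM alerts WHERE chat_id = ? AND symbol = ?",
        (chat_id, symbol),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alerts (id INTEGER PRIMARY KEY, chat_id INTEGER, symbol TEXT)")
conn.execute("INSERT INTO alerts (chat_id, symbol) VALUES (1, 'BTC'), (2, 'ETH')")

# A classic injection payload matches nothing because it is compared literally.
injected = find_alerts(conn, 1, "BTC' OR '1'='1")
```

Had the query been built with string formatting, the payload would have turned the `WHERE` clause into a condition that matches every row.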
- Mitigations
  - Resilient outbound HTTP policy: retries + backoff + per-host circuit breaker.
  - Timeouts on outbound calls.
  - Cached fallbacks for some features.
- Operator actions
  - Prefer reliable upstream endpoints; consider running your own RPC where applicable.
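
The outbound policy of retries, backoff, and a circuit breaker can be sketched as follows. This is a simplified illustration under stated assumptions: the class and function names, the thresholds, and the consecutive-failure rule are invented for the example, and one `CircuitBreaker` instance would be kept per upstream host.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when calls to a host are short-circuited."""


class CircuitBreaker:
    """Open after `threshold` consecutive failures; retry after `cooldown_s`."""

    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                raise CircuitOpenError("circuit open; skipping upstream call")
            self.opened_at = None  # cooldown elapsed: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit again
        return result


def call_with_retries(fn: Callable[[], T], breaker: CircuitBreaker,
                      attempts: int = 3, base_delay_s: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `fn` through `breaker`, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return breaker.call(fn)
        except CircuitOpenError:
            raise  # do not hammer a host whose circuit is open
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")
```

A caller that catches `CircuitOpenError` or the final exception can then fall back to a cached result, which is the graceful-degradation path. Timeouts are not shown; they belong on the HTTP call wrapped by `fn`.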
- `ENV=prod`
- `SERVERLESS_MODE=true` on Vercel (disables polling + APScheduler).
- `TELEGRAM_USE_WEBHOOK=true` with:
  - `TELEGRAM_WEBHOOK_URL`
  - `TELEGRAM_WEBHOOK_SECRET` (recommended)
- `CRON_SECRET` (required for `/tasks/*`, `/metrics`, `/admin/stats`)
- `TEST_MODE=false`
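
For context on `TELEGRAM_WEBHOOK_SECRET`: when a secret token is registered with the webhook, Telegram sends it back on each update in the `X-Telegram-Bot-Api-Secret-Token` header. The sketch below shows one way a handler might verify it. The function name is hypothetical, and accepting requests when no secret is configured is an assumed policy that mirrors the setting being recommended, not required.

```python
import hmac
from typing import Mapping, Optional

TELEGRAM_SECRET_HEADER = "x-telegram-bot-api-secret-token"


def is_telegram_request(headers: Mapping[str, str], webhook_secret: Optional[str]) -> bool:
    """Verify the secret token header that Telegram attaches to webhook calls."""
    if not webhook_secret:
        return True  # assumed policy: no secret configured, nothing to verify
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    supplied = lowered.get(TELEGRAM_SECRET_HEADER, "")
    return hmac.compare_digest(supplied.encode(), webhook_secret.encode())
```

Without the secret, anyone who learns the webhook URL can post forged updates, which is why setting it is preferred in production.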
- Prometheus metrics (restricted endpoint): request rates, latency, feature outcomes.
- Abuse metrics: `ghost_abuse_actions_total{action=...}`.
- Optional distributed tracing (OpenTelemetry) via `OTEL_ENABLED=true`.
- Bot token exposed: rotate `TELEGRAM_BOT_TOKEN`, revoke old token, redeploy.
- Cron secret exposed: rotate `CRON_SECRET`, invalidate old secret, redeploy.
- Spam/flood: tighten rate limits conservatively, enable/raise auto-block, add WAF rules.
- DB compromise suspected: rotate DB credentials, snapshot DB, review access logs, notify users if required.